
Oracle Big Data Appliance Hardware/Software Resiliency Related Frequently Asked Questions (FAQ) (Doc ID 1588788.1)

Last updated on MAY 29, 2021

Applies to:

Big Data Appliance Integrated Software
Big Data Appliance X3-2 Starter Rack - Version All Versions and later
Big Data Appliance X3-2 Full Rack - Version All Versions and later
Big Data Appliance X3-2 In-Rack Expansion - Version All Versions and later
Big Data Appliance X3-2 Hardware - Version All Versions and later
Linux x86-64

Purpose

This document answers frequently asked questions about hardware and software resiliency on Oracle Big Data Appliance (BDA).

Questions and Answers



In this Document
Purpose
Questions and Answers
 What should be expected of the Hadoop cluster when two critical nodes, for example the Active NameNode and the Active JobTracker, are brought down at the same time?
 Is data loss expected when two critical nodes go down, for example both the Active NameNode and the Active JobTracker as in the case above?
 Can BDA withstand a single point of failure?
 When Node 3 goes down, the Cloudera Manager (CM) service goes down too. Is there a way to move the CM service to another node?
 When Node 3 goes down, the Cloudera Manager (CM) service goes down too. Are there command line options to check the state of the cluster?
 If there are concerns about Node 3, is it possible to move Cloudera Manager directly to another node?
 What about the case of a server/disk being permanently disabled?
 Is there any way for data loss to happen?
 How long does it take for NameNode to replicate the data once a DataNode goes down?
 If under-replicated blocks start being replicated 10 minutes and 30 seconds after a node is shut down, will that replication impact cluster load or bandwidth usage?
 How long would a DataNode need to be down before an increase in network traffic is seen?
 If a DataNode is going to be down for close to or over a day, are there any recommendations to minimize the performance impact?
 Is it advisable to change the defaults for dfs.namenode.heartbeat.recheck-interval and dfs.heartbeat.interval, which are used to determine whether a DataNode is dead?
 What other options are available to protect data in the case of the permanent loss of multiple servers and other disaster scenarios?
 If a non-critical server, for example one of servers 5-18 on a full rack, is taken down for service, will that negatively impact the cluster?
 For HA testing, is it possible to relocate Hive services to a different node after a Hive node failure?
 On BDA V4.2, is the Cloudera Manager HA model suggested in the Cloudera CDH 5.4 document "Prerequisites for Setting up Cloudera Manager High Availability" supported?
 On BDA V4.2, is Hive Metastore HA supported?
 On BDA V4.3.0 or higher, is HiveServer2 HA supported?
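
The 10 minutes and 30 seconds mentioned in the questions above is consistent with how stock Apache HDFS decides that a DataNode is dead: twice the recheck interval plus ten times the heartbeat interval. The following is a minimal sketch of that arithmetic, assuming the Apache HDFS defaults of 300000 ms for dfs.namenode.heartbeat.recheck-interval and 3 s for dfs.heartbeat.interval. The function name is hypothetical and used only for illustration. The values actually configured on a given BDA cluster may differ and should be checked in its own configuration.

```python
def dead_node_timeout_seconds(recheck_interval_ms=300_000, heartbeat_interval_s=3):
    """Seconds without heartbeats before the NameNode marks a DataNode dead.

    Assumed defaults (stock Apache HDFS, not confirmed for a specific BDA release):
      dfs.namenode.heartbeat.recheck-interval = 300000 (milliseconds)
      dfs.heartbeat.interval                  = 3      (seconds)
    """
    # timeout = 2 * recheck interval + 10 * heartbeat interval
    return 2 * recheck_interval_ms / 1000 + 10 * heartbeat_interval_s


timeout = dead_node_timeout_seconds()
minutes, seconds = divmod(int(timeout), 60)
print(f"DataNode considered dead after {minutes} min {seconds} s")
```

With the assumed defaults this gives 630 seconds, that is 10 minutes and 30 seconds. Raising either property lengthens the window before re-replication of under-replicated blocks begins.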
