My Oracle Support Banner

BDA Node 3 Migration Fails at Step 10 StartHadoopServices - Both NameNodes in Standby and Command "'HdfsStartWithFailovers' failed for service 'hdfs'" (Doc ID 1982525.1)

Last updated on NOVEMBER 27, 2019

Applies to:

Big Data Appliance Integrated Software - Version 4.1.0 and later
Linux x86-64

Symptoms

1. BDA Node 3 migration fails at step 10 with:

Error [7367]: (//<HOSTNAME5>.<DOMAIN>//Stage[main]/Hadoop::Startsvc2/Exec[setup_scm]/returns)
change from notrun to 0 failed: /opt/oracle/BDAMammoth/bdaconfig/tmp/setupscm.sh &>
 /opt/oracle/BDAMammoth/bdaconfig/tmp/setupscm_<ID>.out returned 1 instead of one of [0]


2. /opt/oracle/BDAMammoth/bdaconfig/tmp/setupscm_<ID>.out shows:

Command 1018 finished after 90 seconds
Operation failed
Result Message is:   "Command 'Start' failed for cluster '<cluster_name>'",

3. In Cloudera Manager (CM)

a) Both NameNodes are in Standby.


b)  Cloudera Manager (CM) > Commands shows:

Command 'HdfsStartWithFailovers' failed for service 'hdfs'

Command 'Start' failed for service 'hdfs'


c) From CM details:

...
2015-02-20 14:09:55,831 FATAL org.apache.hadoop.ha.ZKFailoverController: Unable to start failover controller. Parent znode does not exist.
Run with -formatZK flag to initialize ZooKeeper.
2015-02-20 14:09:55,832 INFO org.apache.zookeeper.ZooKeeper: Session: <SESSION_ID> closed
2015-02-20 14:09:55,832 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
...

 d) From CM stderr:

+ '[' mkdir = zkfc ']'
+ '[' nfs3 = zkfc ']'
+ '[' namenode = zkfc -o secondarynamenode = zkfc -o datanode = zkfc ']'
+ exec /opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hadoop-hdfs/bin/hdfs
--config /var/run/cloudera-scm-agent/process/242-hdfs-FAILOVERCONTROLLER zkfc


e) From CM stdout:

/usr/bin/kinit
using hdfs/<HOSTNAME1>.<DOMAIN>@<REALM> as Kerberos principal
using /var/run/cloudera-scm-agent/process/<ID>-hdfs-FAILOVERCONTROLLER/krb5cc_201 as Kerberos ticket cache
Program: hdfs/hdfs.sh ["zkfc"]


f) From CM Role Details:

2:09:55.762 PM     FATAL     org.apache.hadoop.ha.ZKFailoverController     

Unable to start failover controller. Parent znode does not exist.
Run with -formatZK flag to initialize ZooKeeper.

 

Cause

To view full details, sign in with your My Oracle Support account.

Don't have a My Oracle Support account? Click to get started!


In this Document
Symptoms
Cause
Solution

My Oracle Support provides customers with access to over a million knowledge articles and a vibrant support community of peers and Oracle experts.