Oracle Big Data Appliance "mammoth -p" Upgrade to V2.4 Release of a Secure Cluster Fails "After Enable CM email Alerts" at site.pp:18 (Doc ID 1625049.1)

Last updated on FEBRUARY 17, 2014

Applies to:

Big Data Appliance Integrated Software - Version 2.3.1 to 2.4.0 [Release 2.3 to 2.4]
Linux x86-64

Symptoms

Upgrading a secure (i.e. Kerberos-enabled) BDA cluster from 2.3.1 to 2.4.0 with "mammoth -p" fails on the last step, after the message:

INFO: Enable CM email Alerts. This will take some time ..

with an error like:

ERROR: Puppet agent run on node bdanode01 had errors. List of errors follows

************************************
Error [11857]: (//bdanode01.us.oracle.com//Stage[main]//Node[default]/Exec[enable_cm_alerts]/returns)
change from notrun to 0 failed: Command exceeded timeout at /opt/oracle/BDAMammoth/puppet/manifests/site.pp:18
************************************

INFO: Also check the log file in /opt/oracle/BDAMammoth/bdaconfig/tmp/pagent-bdanode01-20140213170307.log

ERROR: Puppet script had errors.
ERROR: unable to continue with install until errors are fixed #Step 8#
INFO: Time spent in step 8 CleanupInstall is 906 seconds.
Exiting...

 

Checking the log file, /opt/oracle/BDAMammoth/bdaconfig/tmp/pagent-bdanode01-20140213171600.log, shows an error like:

//bdanode01.example.com//Stage[main]//Node[default]/Exec[enable_cm_alerts]/returns)
change from notrun to 0 failed:
/opt/oracle/BDAMammoth/bdaconfig/tmp/enablecmalerts.sh &>
/opt/oracle/BDAMammoth/bdaconfig/tmp/enablecmalerts.out returned 1 instead of
one of [0] at /opt/oracle/BDAMammoth/puppet/manifests/site.pp:18
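The two Puppet messages above report the same Exec resource failing in different ways: first a timeout, then a non-zero exit status. As a rough aid for triaging a pagent log, the sketch below pulls the resource name, failure type and manifest location out of text shaped like those excerpts. The regular expression and the function name find_exec_failures are illustrative assumptions based only on the samples shown here; they are not part of Mammoth or Puppet.

```python
import re

# Pattern inferred from the log excerpts in this note (an assumption, not a
# documented Puppet log format). DOTALL lets the reason span wrapped lines.
EXEC_FAILURE = re.compile(
    r"Exec\[(?P<name>[^\]]+)\]/returns\)\s*"
    r"change from notrun to 0 failed:\s*(?P<reason>.*?)"
    r"\s+at\s+(?P<manifest>\S+):(?P<line>\d+)",
    re.DOTALL,
)


def find_exec_failures(log_text):
    """Return (resource, kind, manifest, line) for each failed Exec found."""
    failures = []
    for m in EXEC_FAILURE.finditer(log_text):
        # Collapse line wraps so the reason can be matched as one string.
        reason = " ".join(m.group("reason").split())
        if "exceeded timeout" in reason:
            kind = "timeout"
        elif re.search(r"returned \d+ instead of", reason):
            kind = "nonzero-exit"
        else:
            kind = "other"
        failures.append(
            (m.group("name"), kind, m.group("manifest"), int(m.group("line")))
        )
    return failures
```

Calling find_exec_failures(open(path).read()) on a pagent log under /opt/oracle/BDAMammoth/bdaconfig/tmp would list each failed Exec. A "nonzero-exit" result for enable_cm_alerts suggests looking next at enablecmalerts.out, the file the script output is redirected to.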


The FailoverController logs for HDFS and MapReduce show a message like the one below, indicating that ZooKeeper must be initialized with the -formatZK flag:

2014-02-13 17:26:23,672 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session connected.
2014-02-13 17:26:23,675 FATAL org.apache.hadoop.ha.ZKFailoverController: Unable to start failover controller. Parent znode does not exist.
Run with -formatZK flag to initialize ZooKeeper.
2014-02-13 17:26:23,677 INFO org.apache.zookeeper.ZooKeeper: Session: 0x3442de5cc3b003b closed
2014-02-13 17:26:23,677 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down


The FailoverController logs are located under /var/log on the relevant nodes of the cluster:
HDFS:
/var/log/hadoop-hdfs/hadoop-cmf-hdfs-FAILOVERCONTROLLER-bdanode0x.example.com.log.out

MapReduce:
/var/log/hadoop-0.20-mapreduce/hadoop-cmf-mapreduce-FAILOVERCONTROLLER-bdanode0x.example.com.log.out
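
To confirm this symptom on a given node, the FailoverController logs can be searched for the FATAL message quoted earlier. The following is a minimal sketch, assuming the log locations listed above with a wildcard in place of the host name. The function names and the wildcard patterns are illustrative assumptions, not part of any Oracle or Cloudera tooling.

```python
import glob

# Message text taken from the FATAL log line quoted in this note.
MARKER = "Parent znode does not exist"

# Patterns derived from the log locations listed above; using a wildcard
# for the host name portion is an assumption.
LOG_PATTERNS = [
    "/var/log/hadoop-hdfs/hadoop-cmf-hdfs-FAILOVERCONTROLLER-*.log.out",
    "/var/log/hadoop-0.20-mapreduce/hadoop-cmf-mapreduce-FAILOVERCONTROLLER-*.log.out",
]


def lines_needing_format_zk(lines):
    """Return the FATAL lines that report a missing parent znode."""
    return [ln.rstrip("\n") for ln in lines if "FATAL" in ln and MARKER in ln]


def scan_failover_logs(patterns=LOG_PATTERNS):
    """Map each matching log file to the list of matching lines it contains."""
    hits = {}
    for pattern in patterns:
        for path in glob.glob(pattern):
            # errors="replace" avoids failing on stray non-UTF-8 bytes.
            with open(path, errors="replace") as fh:
                matched = lines_needing_format_zk(fh)
            if matched:
                hits[path] = matched
    return hits


if __name__ == "__main__":
    for path, matched in scan_failover_logs().items():
        print(path)
        for ln in matched:
            print("  " + ln)
```

Run on each node that hosts a FailoverController role, the script prints every log file containing the message. No output means the message was not found in the files matched on that node.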

 

Cause
