
Oracle Big Data Appliance "mammoth -p" Upgrade to V2.4 Release of a Secure Cluster Fails "After Enable CM email Alerts" at site.pp:18 (Doc ID 1625049.1)

Last updated on JANUARY 30, 2020

Applies to:

Big Data Appliance Integrated Software - Version 2.3.1 to 2.4.0 [Release 2.3 to 2.4]
Linux x86-64

Symptoms

NOTE: In the examples that follow, user details, table name, company name, email, hostnames, images, etc. represent a fictitious sample (and are used to provide an illustrative example only). Any similarity to actual persons, or entities, living or dead, is purely coincidental and not intended in any manner.

 

Upgrading a secure (Kerberos-enabled) BDA cluster from 2.3.1 to 2.4.0 fails on the last step, after the message:

INFO: Enable CM email Alerts. This will take some time ..

with an error like:

ERROR: Puppet agent run on node bdanode01 had errors. List of errors follows

************************************
Error [11857]: (//bdanode01.example.com//Stage[main]//Node[default]/Exec[enable_cm_alerts]/returns)
change from notrun to 0 failed: Command exceeded timeout at /opt/oracle/BDAMammoth/puppet/manifests/site.pp:18
************************************

INFO: Also check the log file in /opt/oracle/BDAMammoth/bdaconfig/tmp/pagent-bdanode01-20140213170307.log

ERROR: Puppet script had errors.
ERROR: unable to continue with install until errors are fixed #Step 8#
INFO: Time spent in step 8 CleanupInstall is 906 seconds.
Exiting...

 

Checking the log file, /opt/oracle/BDAMammoth/bdaconfig/tmp/pagent-bdanode01-20140213171600.log, shows an error like:

//bdanode01.example.com//Stage[main]//Node[default]/Exec[enable_cm_alerts]/returns)
change from notrun to 0 failed:
/opt/oracle/BDAMammoth/bdaconfig/tmp/enablecmalerts.sh &>
/opt/oracle/BDAMammoth/bdaconfig/tmp/enablecmalerts.out returned 1 instead of
one of [0] at /opt/oracle/BDAMammoth/puppet/manifests/site.pp:18


The FailoverController logs for HDFS and MapReduce show a message like the following, indicating that ZooKeeper must be initialized with the -formatZK flag:

2014-02-13 17:26:23,672 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session connected.
2014-02-13 17:26:23,675 FATAL org.apache.hadoop.ha.ZKFailoverController: Unable to start failover controller. Parent znode does not exist.
Run with -formatZK flag to initialize ZooKeeper.
2014-02-13 17:26:23,677 INFO org.apache.zookeeper.ZooKeeper: Session: 0x3442de5cc3b003b closed
2014-02-13 17:26:23,677 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down


The FailoverController logs can be found under /var/log on the appropriate nodes of the cluster:

HDFS:
/var/log/hadoop-hdfs/hadoop-cmf-hdfs-FAILOVERCONTROLLER-bdanode0x.example.com.log.out

MapReduce:
/var/log/hadoop-0.20-mapreduce/hadoop-cmf-mapreduce-FAILOVERCONTROLLER-bdanode0x.example.com.log.out
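
To confirm the symptom on a node, the FailoverController logs listed above can be searched for the FATAL message. The following is a minimal sketch, not part of the original note: the function names find_znode_errors and scan_logs are hypothetical, and the glob patterns assume the node name (shown as bdanode0x above) can be replaced by a wildcard.

```python
import glob

# Glob patterns based on the log locations listed above; the node name
# ("bdanode0x.example.com") is replaced by a wildcard. This is an assumption.
LOG_PATTERNS = [
    "/var/log/hadoop-hdfs/hadoop-cmf-hdfs-FAILOVERCONTROLLER-*.log.out",
    "/var/log/hadoop-0.20-mapreduce/hadoop-cmf-mapreduce-FAILOVERCONTROLLER-*.log.out",
]

# Text of the FATAL message shown in the log excerpt above.
MARKER = "Parent znode does not exist"


def find_znode_errors(lines):
    """Return the lines that contain the 'Parent znode does not exist' text."""
    return [line.rstrip("\n") for line in lines if MARKER in line]


def scan_logs(patterns=LOG_PATTERNS):
    """Map each matching log file to the marker lines found in it."""
    hits = {}
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            # errors="replace" keeps the scan going if a log has odd bytes.
            with open(path, errors="replace") as handle:
                matches = find_znode_errors(handle)
            if matches:
                hits[path] = matches
    return hits


if __name__ == "__main__":
    for path, matches in scan_logs().items():
        print(path)
        for line in matches:
            print("  " + line)
```

Run on each node that hosts a FailoverController role. Any file printed contains the message that indicates the parent znode is missing.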

 

Cause



In this Document
Symptoms
Cause
Solution
 For the hdfs service
 For the mapreduce Service
