Upgrade Of BDA Cluster to 4.* Fails at Step 7 Running "mammoth -p" Command with "Hadoop::Upgradeactions/Exec[set_cluster_conf]"

(Doc ID 2044936.1)

Last updated on AUGUST 24, 2015

Applies to:

Big Data Appliance Integrated Software - Version 4.1.0 to 4.2.0 [Release 4.1 to 4.2]
Linux x86-64

Symptoms

During an upgrade to 4.*, step 7 of the "mammoth -p" command fails with:

ERROR: Puppet agent run on node bda1node01 had errors. List of errors follows
************************************
Error [12527]: (//bda1node01.example.com//Stage[main]/Hadoop::Upgradeactions/Exec[set_cluster_conf]/returns) change from notrun to 0 failed: /opt/oracle/BDAMammoth/bdaconfig/tmp/setclusterconf.sh &> /opt/oracle/BDAMammoth/bdaconfig/tmp/setclusterconf.out returned 1 instead of one of [0]
************************************
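
When several nodes report errors, it can help to pull the node, Puppet class, and Exec resource out of each error line. The following is a minimal sketch, not part of Mammoth: the function name parse_puppet_error and the regular expression are assumptions based only on the error format shown above.

```python
import re

# Resource path inside a Puppet error line, for example:
# (//host//Stage[main]/Hadoop::Upgradeactions/Exec[set_cluster_conf]/returns)
RESOURCE_RE = re.compile(
    r"\(//(?P<node>[^/]+)//Stage\[main\]/"
    r"(?P<puppet_class>[^/]+)/Exec\[(?P<exec_name>[^\]]+)\]/returns\)"
)


def parse_puppet_error(line):
    """Return node, Puppet class and Exec name from an error line, or None.

    Hypothetical helper: it only understands the format shown in this note.
    """
    match = RESOURCE_RE.search(line)
    return match.groupdict() if match else None


sample = (
    "Error [12527]: (//bda1node01.example.com//Stage[main]/"
    "Hadoop::Upgradeactions/Exec[set_cluster_conf]/returns) "
    "change from notrun to 0 failed"
)
print(parse_puppet_error(sample))
```

For the error in this note, the Exec name is set_cluster_conf, which points to setclusterconf.sh and its output file under /opt/oracle/BDAMammoth/bdaconfig/tmp/.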


The list of recent commands in Cloudera Manager (CM) showed that the Host Monitor failed to start, with this message:

Service did not start successfully; not all of the required roles started: Service has only 0 Host Monitor roles running instead of minimum required 1.

 

In CM > mgmt, the Host Monitor is shown as down.

From /opt/oracle/BDAMammoth/bdaconfig/tmp/cm_service_commands_restart.out:

Command ID is 38555
...........
Command 38555 finished after 60 seconds
Operation failed
Result Message is:   "Failed to restart service.",

The same file, /opt/oracle/BDAMammoth/bdaconfig/tmp/cm_service_commands_restart.out, also shows:
"id" : 38555,
 "name" : "Restart",
 "startTime" : "2015-08-14T20:17:39.501Z",
 "active" : true,
 "serviceRef" : {
   "serviceName" : "mgmt"
 }


The Host Monitor logs showed errors about missing LDBTimeSeries partitions in the Host Monitor storage directory:

2015-08-14 11:34:12,879 ERROR com.cloudera.enterprise.dbpartition.PartitionManager: Error expiring partitions: /var/lib/cloudera-host-monitor/ts/type/partitions/type_2015-06-30T23:05:26.840Z does not exist
java.lang.IllegalArgumentException: /var/lib/cloudera-host-monitor/ts/type/partitions/type_2015-06-30T23:05:26.840Z does not exist
at com.cloudera.cmf.FileUtils2.checkDirectory(FileUtils2.java:75)
at com.cloudera.cmf.FileUtils2.sizeOfDirectory(FileUtils2.java:41)
at com.cloudera.cmon.tstore.leveldb.LDBPartitionManager.getPartitionSizeInBytes(LDBPartitionManager.java:670)
at com.cloudera.cmon.tstore.leveldb.LDBSizeBasedPartitionPolicy.getPartitionsToExpire(LDBSizeBasedPartitionPolicy.java:157)
at com.cloudera.cmon.tstore.leveldb.LDBPartitionManager.expirePartitions(LDBPartitionManager.java:974)
at com.cloudera.enterprise.dbpartition.PartitionManager.runPartitionManagement(PartitionManager.java:142)
at com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesTable.forcePartitionManagement(LDBTimeSeriesTable.java:131)
at com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesStore.forcePartitionManagement(LDBTimeSeriesStore.java:610)
at com.cloudera.cmon.firehose.Firehose.<init>(Firehose.java:105)
at com.cloudera.cmon.firehose.Main.main(Main.java:527)
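
To see how many partitions the Host Monitor reports as missing, the log can be scanned for the PartitionManager message shown above. This is a read-only sketch under stated assumptions: the function name missing_partitions is hypothetical, the pattern is taken only from the log line in this note, and the script does not repair anything.

```python
import os
import re

# Matches the PartitionManager error shown in the Host Monitor log.
MISSING_RE = re.compile(
    r"Error expiring partitions: (?P<path>\S+) does not exist"
)


def missing_partitions(log_text):
    """Return the unique partition paths reported as missing, in log order."""
    seen = []
    for match in MISSING_RE.finditer(log_text):
        path = match.group("path")
        if path not in seen:
            seen.append(path)
    return seen


# Sample input copied from the log excerpt in this note.
sample_log = (
    "2015-08-14 11:34:12,879 ERROR com.cloudera.enterprise.dbpartition."
    "PartitionManager: Error expiring partitions: "
    "/var/lib/cloudera-host-monitor/ts/type/partitions/"
    "type_2015-06-30T23:05:26.840Z does not exist\n"
)

for path in missing_partitions(sample_log):
    # os.path.isdir only reports whether the directory exists on this host.
    state = "present" if os.path.isdir(path) else "missing"
    print(path, state)
```

On the node that runs the Host Monitor, the text of the Host Monitor log can be passed in place of sample_log to confirm whether each reported directory is actually absent from /var/lib/cloudera-host-monitor.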

 

 

Changes

Upgrade to 4.2.0.

Cause
