Mammoth Migration StartHadoopServices Puppet Actions Loop Continuously Causing Cloudera Manager Services to Restart Over and Over
(Doc ID 2861322.1)
Last updated on JULY 20, 2024
Applies to:
Big Data Appliance Integrated Software - Version 4.13.0 and later
Linux x86-64
Symptoms
A Mammoth migration of the host that holds the Cloudera Manager (CM) role (Node 3 by default), performed by following either:
Node 3 Migration and Recommission on Oracle Big Data Appliance V4.11 and Higher (Doc ID 2524859.1)
or
Node 3 Migration and Reprovision on Oracle Big Data Appliance V4.1 OL6 Hadoop Cluster to Manage a Hardware Failure (Doc ID 1984854.1)
fails in Step 4, StartHadoopServices, with errors like those below. The errors indicate that CM services are being restarted and that the puppet agent on the target host timed out.
...
************************************
Error [71173]: (//<TARGET_HOSTNAME3>.<DOMAIN>//Stage[main]/Hadoop::Startsvc2/Exec[setup_scm]/returns) change from notrun to 0 failed:
/opt/oracle/BDAMammoth/bdaconfig/tmp/setupscm.sh &>
/opt/oracle/BDAMammoth/bdaconfig/tmp/setupscm_<TIMESTAMP>.out returned 1 instead of one of [0]
************************************
...
ERROR: Puppet script had errors.
ERROR: Puppet agent on node <TARGET_HOSTNAME3> is not reachable
Further investigation on the target host shows that setupscm.sh is running continuously.
Note: In this case the script that runs continuously is setupscm.sh, but another script could exhibit the same issue. Mammoth actions other than migration can also encounter a similar scenario.
1. setupscm.sh is the script that starts Hadoop services in CM. Because the script is executed continuously, CM services are restarted over and over.
2. Following Troubleshooting Mammoth Errors: "ERROR: Puppet agent on node bdanode0x is not reachable" (Doc ID 2164227.1) and killing the running process does not resolve the problem. The script setupscm.sh continues to run.
3. On the Mammoth node, /opt/oracle/BDAMammoth/puppet/manifests/site.pp contains cached puppet commands that invoke the script continuously.
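The two checks described in items 1 and 3 can be sketched as follows. This is a minimal illustration, not an Oracle-supplied tool: the helper names (matching_processes, matching_manifest_lines, report) are hypothetical, and it assumes that the literal string "setupscm.sh" appears both in the process command line and in the cached commands in site.pp. If the manifest refers to the script differently, pass a different search string. Run the process check on the target host and the manifest check on the Mammoth node.

```python
import subprocess
from pathlib import Path
from typing import List, Tuple

# Script name and manifest path are taken from the symptoms above.
SCRIPT_NAME = "setupscm.sh"
SITE_PP = Path("/opt/oracle/BDAMammoth/puppet/manifests/site.pp")


def matching_processes(ps_output: str, needle: str = SCRIPT_NAME) -> List[str]:
    """Return the lines of `ps` output whose command line mentions needle."""
    return [line.strip() for line in ps_output.splitlines() if needle in line]


def matching_manifest_lines(
    manifest_text: str, needle: str = SCRIPT_NAME
) -> List[Tuple[int, str]]:
    """Return (line_number, text) pairs from a manifest that mention needle."""
    return [
        (number, line.strip())
        for number, line in enumerate(manifest_text.splitlines(), start=1)
        if needle in line
    ]


def report() -> None:
    """Print running copies of the script and manifest lines that reference it."""
    # List every process with its PID, elapsed time, and full command line.
    ps = subprocess.run(
        ["ps", "-eo", "pid,etime,args"],
        capture_output=True,
        text=True,
        check=True,
    )
    for line in matching_processes(ps.stdout):
        print("running:", line)

    # The manifest exists only on the Mammoth node, so skip it elsewhere.
    if SITE_PP.exists():
        for number, line in matching_manifest_lines(SITE_PP.read_text()):
            print(f"{SITE_PP}:{number}: {line}")
```

Calling report() several times a few minutes apart gives a rough picture: if the PID of setupscm.sh keeps changing while the elapsed time stays short, the script is being relaunched rather than running once to completion.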
Cause