During Oracle Big Data Appliance Upgrade from V2.5 to V3.0.*, Mammoth Fails on Step 7 while Upgrading Oozie Database
(Doc ID 1904248.1)
Last updated on DECEMBER 05, 2021
Applies to:
Big Data Appliance Integrated Software - Version 3.0.1 and later
Linux x86-64
Symptoms
On Oracle Big Data Appliance, during an upgrade from 2.5 to a 3.0.* release, Step 7 fails with the following error:
************************************
Error [8413]: (//bdanode01.example.com//Stage[main]/Hadoop::Addyarn/Exec[add_yarn]/returns) change from notrun to 0 failed: /opt/oracle/BDAMammoth/bdaconfig/tmp/addyarn.sh &> /opt/oracle/BDAMammoth/bdaconfig/tmp/addyarn.out returned 1 instead of one of [0] at /opt/oracle/BDAMammoth/puppet/modules/hadoop/manifests/addyarn.pp:46
1) Checking the latest commands_<id>.out file in the /opt/oracle/BDAMammoth/bdaconfig/tmp directory shows that the UpgradeCluster command failed because its child command 'UpgradeOozieCommand' failed:
{
"id" : <ID>,
"name" : "UpgradeCluster",
.......
"active" : false,
"success" : false,
"resultMessage" : "Failed to upgraded cluster",
"clusterRef" : {
"clusterName" : "cluster"
},
..... {
"id" : <ID>,
"name" : "UpgradeOozieCommand",
...........
"active" : false,
"success" : false,
"resultMessage" : "Failed to upgrade Oozie service.",
"serviceRef" : {
"clusterName" : "cluster",
"serviceName" : "oozie"
}
2) Next, log in to Cloudera Manager (CM) and, on the Home page, click 'All Recent Commands'. Click the failed 'UpgradeCluster' command to see which child command failed and which commands still need to be executed after it.
Note: Take a snapshot of this screen, because the command that failed and the commands listed after it need to be executed manually in CM. Once the 'UpgradeCluster' command has been executed, CM will not re-run it.
In this example, the Oozie upgrade failed, and the commands that still need to be executed are the Sqoop2 and Hive upgrades.
Clicking the failed 'Upgrading Oozie Database ..' command and reviewing its log shows the following error:
java.lang.Exception: There are [<NUM>] workflows in RUNNING/SUSPENDED state, they must complete or be killed
at org.apache.oozie.tools.OozieDBCLI.verifyDBState(OozieDBCLI.java:1028)
at org.apache.oozie.tools.OozieDBCLI.upgradeDB(OozieDBCLI.java:199)
at org.apache.oozie.tools.OozieDBCLI.run(OozieDBCLI.java:130)
at org.apache.oozie.tools.OozieDBCLI.main(OozieDBCLI.java:78)
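The two checks described in the symptoms above (finding the failed commands in the latest commands_<id>.out file, and confirming the Oozie error in the command log) could be scripted roughly as follows. This is only a sketch under assumptions: it assumes the commands_<id>.out file contains valid JSON, which the abbreviated excerpt above does not confirm, and the function names are hypothetical helpers, not part of Mammoth or Cloudera Manager. The walker does not rely on any particular nesting of child commands; it reports every object that has "success" set to false.

```python
import glob
import json
import os
import re

# Directory named in this note; the commands_*.out pattern follows its wording.
TMP_DIR = "/opt/oracle/BDAMammoth/bdaconfig/tmp"

# Matches the Oozie DB upgrade error quoted above and captures the workflow count.
BLOCKED_RE = re.compile(
    r"There are \[(\d+)\] workflows in RUNNING/SUSPENDED state"
)


def latest_commands_file(tmp_dir=TMP_DIR):
    """Return the most recently modified commands_*.out file, or None."""
    paths = glob.glob(os.path.join(tmp_dir, "commands_*.out"))
    return max(paths, key=os.path.getmtime) if paths else None


def failed_commands(node):
    """Collect (name, resultMessage) for every object whose success is false.

    Walks any nesting of dicts and lists, so it makes no assumption about
    how child commands are embedded in the parent command.
    """
    found = []
    if isinstance(node, dict):
        if node.get("success") is False and "name" in node:
            found.append((node["name"], node.get("resultMessage", "")))
        for value in node.values():
            found.extend(failed_commands(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(failed_commands(item))
    return found


def blocked_workflow_count(log_text):
    """Return the workflow count from the Oozie error, or None if absent."""
    match = BLOCKED_RE.search(log_text)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    path = latest_commands_file()
    if path is None:
        print("No commands_*.out file found in", TMP_DIR)
    else:
        # Assumption: the file parses as JSON as a whole.
        with open(path) as fh:
            for name, message in failed_commands(json.load(fh)):
                print(f"{name}: {message}")
```

For the output shown in step 1), failed_commands would be expected to report both UpgradeCluster and UpgradeOozieCommand. blocked_workflow_count can be applied to the text of the 'Upgrading Oozie Database ..' log to confirm that running or suspended workflows are what blocked the upgrade.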
Cause