
During Oracle Big Data Appliance Upgrade from V2.5 to V3.0.*, Mammoth Fails on Step7 while Upgrading Oozie Database (Doc ID 1904248.1)

Last updated on DECEMBER 05, 2021

Applies to:

Big Data Appliance Integrated Software - Version 3.0.1 and later
Linux x86-64

Symptoms

NOTE: In the examples that follow, user details, cluster names, hostnames, directory paths, filenames, etc. represent a fictitious sample (and are used to provide an illustrative example only). Any similarity to actual persons, or entities, living or dead, is purely coincidental and not intended in any manner. 

On Oracle Big Data Appliance, during an upgrade from release 2.5 to 3.0.*, Mammoth Step 7 fails with the following error:

ERROR: Puppet agent run on node bdanode01 had errors. List of errors follows

************************************
Error [8413]: (//bdanode01.example.com//Stage[main]/Hadoop::Addyarn/Exec[add_yarn]/returns) change from notrun to 0 failed: /opt/oracle/BDAMammoth/bdaconfig/tmp/addyarn.sh &> /opt/oracle/BDAMammoth/bdaconfig/tmp/addyarn.out returned 1 instead of one of [0] at /opt/oracle/BDAMammoth/puppet/modules/hadoop/manifests/addyarn.pp:46

1) The latest commands_<id>.out file in the /opt/oracle/BDAMammoth/bdaconfig/tmp directory shows that the UpgradeCluster command failed because its child command 'UpgradeOozieCommand' failed:

{
 "id" : <ID>,
 "name" : "UpgradeCluster",
.......
 "active" : false,
 "success" : false,
 "resultMessage" : "Failed to upgraded cluster",
 "clusterRef" : {
   "clusterName" : "cluster"
 },

 .....   {
     "id" : <ID>,
     "name" : "UpgradeOozieCommand",
   ...........
     "active" : false,
     "success" : false,
     "resultMessage" : "Failed to upgrade Oozie service.",
     "serviceRef" : {
       "clusterName" : "cluster",
       "serviceName" : "oozie"
     }
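
The check in step 1 can be scripted. The following is a minimal sketch, not part of the original note. It assumes that "latest" means the most recently modified file matching commands_*.out, and that each command's "name" field appears before its "success" field, as in the excerpt above. The helper names latest_commands_file and failed_commands are hypothetical. Because the excerpt is elided, the sketch scans lines instead of parsing the file as JSON.

```python
import glob
import os
import re

# Directory named in the note above.
TMP_DIR = "/opt/oracle/BDAMammoth/bdaconfig/tmp"

# Matches only the exact "name" key, not "clusterName" or "serviceName".
NAME_RE = re.compile(r'"name"\s*:\s*"([^"]+)"')
SUCCESS_RE = re.compile(r'"success"\s*:\s*(true|false)')


def latest_commands_file(tmp_dir=TMP_DIR):
    """Return the most recently modified commands_*.out file, or None.

    Assumption: "latest" is judged by modification time.
    """
    candidates = glob.glob(os.path.join(tmp_dir, "commands_*.out"))
    return max(candidates, key=os.path.getmtime) if candidates else None


def failed_commands(text):
    """Return the names of commands whose "success" field is false.

    Assumption: "name" precedes "success" for each command.
    """
    failed = []
    current = None
    for line in text.splitlines():
        name_match = NAME_RE.search(line)
        if name_match:
            current = name_match.group(1)
            continue
        success_match = SUCCESS_RE.search(line)
        if success_match:
            if success_match.group(1) == "false" and current is not None:
                failed.append(current)
            current = None  # each name is paired with one success field
    return failed


if __name__ == "__main__":
    path = latest_commands_file()
    if path is None:
        print("No commands_*.out file found in", TMP_DIR)
    else:
        with open(path) as handle:
            print(path, failed_commands(handle.read()))
```

For the excerpt shown above, the scan would report both UpgradeCluster and UpgradeOozieCommand as failed.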

2) Next, log in to Cloudera Manager (CM) and, from Home, click 'All Recent Commands'. Click the failed 'UpgradeCluster' command to see which child command failed and which commands still need to be executed after it.

Note: Take a snapshot of this screen, because the command that failed and the commands listed after it must be executed manually in CM. Once the 'UpgradeCluster' command has been executed, CM will not re-run the upgrade command.

In the snapshot for this example, the Oozie upgrade failed, and the commands that still need to be executed are the Sqoop2 upgrade and the Hive upgrade.
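
The rule in the note above (re-run the failed command and everything listed after it) can be written down as a short sketch. This is an illustration only. The function name commands_to_run_manually is hypothetical, and the step names and order in the usage example are placeholders based on this example, to be replaced with the entries recorded from the CM screen.

```python
def commands_to_run_manually(steps):
    """Return the first step that did not succeed and every step after it.

    steps is an ordered list of (name, succeeded) pairs, where succeeded is
    True for a completed step, False for the failed step, and None for a
    step that never ran.
    """
    for index, (_, succeeded) in enumerate(steps):
        if succeeded is not True:
            return [name for name, _ in steps[index:]]
    return []


if __name__ == "__main__":
    # Placeholder step names and order, for illustration only.
    recorded = [
        ("Upgrading Oozie Database", False),
        ("Upgrading Sqoop2", None),
        ("Upgrading Hive", None),
    ]
    for name in commands_to_run_manually(recorded):
        print(name)
```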

Clicking the failed 'Upgrading Oozie Database ..' command and reviewing its log shows the following error:

java.lang.Exception: There are [<NUM>] workflows in RUNNING/SUSPENDED state, they must complete or be killed
    at org.apache.oozie.tools.OozieDBCLI.verifyDBState(OozieDBCLI.java:1028)
    at org.apache.oozie.tools.OozieDBCLI.upgradeDB(OozieDBCLI.java:199)
    at org.apache.oozie.tools.OozieDBCLI.run(OozieDBCLI.java:130)
    at org.apache.oozie.tools.OozieDBCLI.main(OozieDBCLI.java:78)
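
To confirm that a saved copy of the command log shows this specific symptom, the exception message can be matched with a regular expression. The sketch below is a hypothetical helper, not part of the original note. It assumes the log text has already been read into a string and that the bracketed value is a plain integer.

```python
import re

# Message text taken from the OozieDBCLI exception shown above.
BLOCKED_RE = re.compile(
    r"There are \[(\d+)\] workflows in RUNNING/SUSPENDED state"
)


def blocking_workflow_count(log_text):
    """Return the workflow count reported by the error, or None if absent."""
    match = BLOCKED_RE.search(log_text)
    return int(match.group(1)) if match else None
```

A return value of None means the log does not contain this error, so the failure has a different cause than the one described here.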

 

 

Cause

To view full details, sign in with your My Oracle Support account.
