BDA Upgrade to BDA V3*/V4*-CDH 5.* fails on Step 7 Restarting Hadoop Services with: This NameNode has an unfinalized metadata upgrade (Doc ID 1928590.1)

Last updated on JUNE 22, 2017

Applies to:

Big Data Appliance Integrated Software - Version 3.0 and later
Linux x86-64

Symptoms

A BDA upgrade from 2.5 to 3.0.1 fails at Step 7, Restarting Hadoop Services. In this example the failure occurred on the first node of the second rack, but it can occur on any node. The failure looks like:

INFO: Executing Upgrade Steps 7 through 8
INFO: Step 7: Restarting Hadoop Services
...
ERROR: Puppet agent run on node <bdanode01-rack2> had errors. List of errors follows

************************************
Error [8436]: (//<bdanode01-rack2>.example.com//Stage[main]/Hadoop::Addyarn/Exec[add_yarn]/returns)
Change from notrun to 0 failed: /opt/oracle/BDAMammoth/bdaconfig/tmp/addyarn.sh &>
/opt/oracle/BDAMammoth/bdaconfig/tmp/addyarn.out returned 1 instead of one of [0]
at /opt/oracle/BDAMammoth/puppet/modules/hadoop/manifests/addyarn.pp:46
...


Other Symptoms:

1. In Cloudera Manager, one NameNode is up and healthy; one NameNode is down.

2. The associated log file for the node in question shows output like: 

<bdanode01-rack2>: Sep 19 13:29:48 <bdanode01-rack2> puppet-master[8436]: Compiled catalog for <bdanode01-rack2>.example.com in environment production in 0.37 seconds

Sep 19 13:38:49 <bdanode01-rack2> puppet-master[8436]: (//<bdanode01-rack2>.example.com//Stage[main]/Hadoop::Addyarn/File[add_yarn]/content) content changed '{md5}7ffda6a37f7b01469696ee8f820b35a4' to '{md5}c56696fe56fc5e82c1beeca0bb430e28'

Sep 19 13:38:49 <bdanode01-rack2> puppet-master[8436]: (//<bdanode01-rack2>.example.com//Stage[main]/Hadoop::Addyarn/Exec[add_yarn]/returns) change from notrun to 0 failed: /opt/oracle/BDAMammoth/bdaconfig/tmp/addyarn.sh &> /opt/oracle/BDAMammoth/bdaconfig/tmp/addyarn.out returned 1 instead of one of [0] at /opt/oracle/BDAMammoth/puppet/modules/hadoop/manifests/addyarn.pp:46

Sep 19 13:38:49 <bdanode01-rack2> puppet-master[8436]: (//<bdanode01-rack2>.example.com//Stage[post]/Miscsetup::Endstep/Notify[end_step]) Dependency Exec[add_yarn] has failures: true
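
When the Puppet log is long, the failing resource can be pulled out programmatically. The following is a minimal sketch, not part of the Mammoth tooling: the function name find_failed_execs and the regular expression are illustrative assumptions based only on the log format shown above, and the pattern tolerates line wraps inside an entry.

```python
import re

# Assumed pattern, derived from the log excerpt above; it is not an official
# Puppet or Mammoth log grammar. DOTALL lets one entry span wrapped lines.
FAILURE_RE = re.compile(
    r"\(//(?P<host>[^/]+)//(?P<resource>[^)]*?)/returns\)\s+"
    r"change from notrun to 0 failed:\s+"
    r"(?P<command>\S+).*?"
    r"returned\s+(?P<code>\d+)\s+instead of one of\s+\[(?P<expected>[^\]]*)\]",
    re.IGNORECASE | re.DOTALL,
)


def find_failed_execs(log_text):
    """Return one dict per failed Exec resource found in the log text."""
    return [
        {
            "host": m["host"],
            "resource": m["resource"],
            "command": m["command"],
            "exit_code": int(m["code"]),
        }
        for m in FAILURE_RE.finditer(log_text)
    ]


# Sample entry copied from the excerpt above (wrapped as in the article).
sample = """Sep 19 13:38:49 <bdanode01-rack2> puppet-master[8436]:
(//<bdanode01-rack2>.example.com//Stage[main]/Hadoop::Addyarn/Exec[add_yarn]/returns) change from notrun to 0 failed:
/opt/oracle/BDAMammoth/bdaconfig/tmp/addyarn.sh &> /opt/oracle/BDAMammoth/bdaconfig/tmp/addyarn.out returned 1
instead of one of [0] at /opt/oracle/BDAMammoth/puppet/modules/hadoop/manifests/addyarn.pp:46
"""

failures = find_failed_execs(sample)
for failure in failures:
    print(failure["resource"], "exit code", failure["exit_code"])
```

Here the reported resource is the add_yarn Exec, which points to addyarn.out as the next file to inspect.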


3. The associated output file, /opt/oracle/BDAMammoth/bdaconfig/tmp/addyarn.out, shows output like:

<bdanode01-rack2>: Succeeded. Output in : /opt/oracle/BDAMammoth/bdaconfig/tmp/cm_commands.out
<bdanode01-rack2>: API Version used is v6
<bdanode01-rack2>: Succeeded. Output in : /opt/oracle/BDAMammoth/bdaconfig/tmp/clusters_<cluster_name>_commands_upgradeServices.out
<bdanode01-rack2>: Command ID is 13956
<bdanode01-rack2>: ..........................................................................................
..........Operation timed out.
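
The two facts that matter in this output are the Cloudera Manager command ID and whether the wait ended in a timeout. As a rough sketch, assuming only the message strings shown above ("Command ID is" and "Operation timed out"), a hypothetical helper named summarize_addyarn_output could extract both:

```python
import re


def summarize_addyarn_output(text):
    """Return the CM command ID (if present) and whether the wait timed out.

    The matched strings are taken from the excerpt above; other Mammoth
    versions may word these messages differently.
    """
    match = re.search(r"Command ID is\s+(\d+)", text)
    return {
        "command_id": int(match.group(1)) if match else None,
        "timed_out": "Operation timed out" in text,
    }


# Abbreviated sample based on the excerpt above (the dot run is shortened).
sample = """<bdanode01-rack2>:
API Version used is v6
<bdanode01-rack2>:
Command ID is 13956
<bdanode01-rack2>: ..........
..........Operation timed out.
"""

summary = summarize_addyarn_output(sample)
print(summary)
```

The command ID is the same one that appears in the upgradeServices output file and in the Cloudera Manager command list.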


4. The associated output file, /opt/oracle/BDAMammoth/bdaconfig/tmp/clusters_<cluster_name>_commands_upgradeServices.out, shows output like:

<bdanode01-rack2>: {
<bdanode01-rack2>: "id" : 13956,
<bdanode01-rack2>: "name" : "UpgradeCluster",
<bdanode01-rack2>: "startTime" : "2014-09-19T18:30:20.262Z",
<bdanode01-rack2>: "active" : true,
<bdanode01-rack2>: "clusterRef" : {
<bdanode01-rack2>: "clusterName" : "cluster"
<bdanode01-rack2>: }
<bdanode01-rack2>: }
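
Once the node prefix is stripped from each line, this output is a JSON command record, and its "active" field shows whether the command was still running when the record was captured. The sketch below illustrates one way to read it; the function name parse_command_record and its node argument are hypothetical, and the prefix format is assumed from the excerpt above.

```python
import json


def parse_command_record(text, node):
    """Strip the '<node>:' prefix from each line and parse the rest as JSON.

    Lines that carry only the prefix (wrapped output) become empty, and
    lines without a prefix are kept unchanged.
    """
    prefix = node + ":"
    cleaned = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            line = line[len(prefix):].strip()
        cleaned.append(line)
    return json.loads("\n".join(cleaned))


# Sample copied from the excerpt above, including its wrapped lines.
sample = """<bdanode01-rack2>: {
<bdanode01-rack2>: "id" : 13956,
<bdanode01-rack2>: "name" : "UpgradeCluster",
<bdanode01-rack2>:
"startTime" : "2014-09-19T18:30:20.262Z",
<bdanode01-rack2>: "active" : true,
<bdanode01-rack2>: "clusterRef" :
{
<bdanode01-rack2>: "clusterName" : "cluster"
<bdanode01-rack2>: }
<bdanode01-rack2>: }
"""

record = parse_command_record(sample, "<bdanode01-rack2>")
print(record["name"], "id", record["id"], "active:", record["active"])
```

In this capture the UpgradeCluster command (ID 13956) is reported as still active, which is consistent with the timeout recorded in addyarn.out.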


5. Rechecking the associated errors in Cloudera Manager (CM):

a) On the All Recent Commands tab, the Upgrade Cluster step has completed successfully:

Upgrade Cluster: successfully upgraded cluster
Start: Service started successfully for mgmt services

The following steps are completed:

Upgrade HDFS Metadata
Upgrade Hive Metastore Database
Upgrade oozie database
Install oozie shared lib


b) The All Health Issues tab shows:

Name node: This NameNode has an unfinalized metadata upgrade

 

Cause
