BDA Upgrade Failed on Step 7 Upgrading from 3.0.1 to 4.1.0 with "Hadoop::Upgradecdh/Exec[upgrade_cdh]/returns) change from notrun to 0 failed" (Doc ID 1995601.1)

Last updated on AUGUST 18, 2015

Applies to:

Big Data Appliance Integrated Software - Version 4.1.0 and later
Linux x86-64

Symptoms

The BDA upgrade to 4.1.0 fails with an error at Mammoth step 7:

ERROR: Puppet agent run on node bda01 had errors. List of errors follows

************************************
Error [11155]: (//<cluster name>-01.domain.com//Stage[main]/Hadoop::Upgradecdh/Exec[upgrade_cdh]/returns) change from notrun to 0 failed: /opt/oracle/BDAMammoth/bdaconfig/tmp/upgradecdhparcel.sh &> /opt/oracle/BDAMammoth/bdaconfig/tmp/upgradecdhparcel.out returned 1 instead of one of [0]
************************************
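
When triaging this failure, it can help to pull the failing script, its output file, and the exit status out of the error line, so the .out file can be inspected next. The following is a minimal sketch only: the regular expression is inferred from the single error line quoted above, and the function name parse_mammoth_error is hypothetical, not part of Mammoth or Puppet.

```python
import re

# Assumption: the error line always has the shape quoted in this note.
# The pattern is inferred from that one example and may need adjusting.
ERROR_RE = re.compile(
    r"Error \[(?P<code>\d+)\]: \(//(?P<node>[^/]+)//(?P<resource>[^)]+)\) "
    r"change from notrun to 0 failed: (?P<script>\S+) &> (?P<output>\S+) "
    r"returned (?P<status>\d+) instead of one of \[0\]"
)


def parse_mammoth_error(line):
    """Return a dict of fields from the error line, or None if it does not match."""
    match = ERROR_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])  # exit status of the failed script
    return fields


if __name__ == "__main__":
    sample = (
        "Error [11155]: (//<cluster name>-01.domain.com//Stage[main]/"
        "Hadoop::Upgradecdh/Exec[upgrade_cdh]/returns) change from notrun to 0 "
        "failed: /opt/oracle/BDAMammoth/bdaconfig/tmp/upgradecdhparcel.sh &> "
        "/opt/oracle/BDAMammoth/bdaconfig/tmp/upgradecdhparcel.out "
        "returned 1 instead of one of [0]"
    )
    info = parse_mammoth_error(sample)
    # The output file named here is the next place to look for details.
    print(info["node"], info["output"], info["status"])
```

The "output" field gives the path of the file that captured the script's stdout and stderr on the affected node.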

 

Symptoms may include:

1. Neither namenode is running.

2. Attempting to start a namenode manually (namenode 1 > Start namenode) fails.
If the namenode does not start, check the Role log for details.

3. One or both namenode logs show a message like:

Failed to initialize Shared Edits Directory of NameNode namenode (<cluster name>). Initialization can fail if the Shared Edits Directory is not empty. Check the stderr log for details.: File system image contains an old layout version -55.
An upgrade to version -59 is required.
Please restart NameNode with the "-rollingUpgrade started" option if a rolling upgrade is already started; or restart NameNode with the "-upgrade" option to start a new upgrade.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:232)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1005)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:735)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initializeSharedEdits(NameNode.java:1011)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1398)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1493)
15/03/10 10:10:24 ERROR namenode.NameNode: Could not initialize shared edits dir
java.io.IOException:
File system image contains an old layout version -55.
An upgrade to version -59 is required.
Please restart NameNode with the "-rollingUpgrade started" option if a rolling upgrade is already started; or restart NameNode with the "-upgrade" option to start a new upgrade.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:232)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1005)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:735)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initializeSharedEdits(NameNode.java:1011)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1398)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1493)
15/03/10 10:10:24 INFO util.ExitUtil: Exiting with status 1
15/03/10 10:10:24 INFO namenode.NameNode: SHUTDOWN_MSG:

 

4. Check the hdfs instances to see whether the namenodes can be started:
hdfs > Instances
Select the check boxes next to both namenodes > Actions for Selected > Start

One Failover Controller shows a "stopping" state.
Clicking on that Failover Controller shows the message:
This role has no process associated with it.

5. In the hdfs instances list, one Failover Controller shows a "stopping" state.

a. Go to hdfs instances:

CM > Home > hdfs > instances > display all

b. Check to see if there are any Failover Controllers that are not coming to a "STOPPED" state and are stuck in a "stopping" state.
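
The layout version message in symptom 3 can be checked for programmatically when several namenode logs need to be reviewed. The sketch below is illustrative only: it searches log text for the two sentences quoted in symptom 3 and returns the old and required layout versions. The function name find_layout_mismatch is hypothetical, and the pattern assumes the message wording matches the excerpt in this note.

```python
import re

# Assumption: the wording matches the log excerpt quoted in symptom 3.
# \s+ allows the two sentences to sit on one line or on consecutive lines.
LAYOUT_RE = re.compile(
    r"File system image contains an old layout version (-?\d+)\.\s+"
    r"An upgrade to version (-?\d+) is required\."
)


def find_layout_mismatch(log_text):
    """Return (old_version, required_version) or None if the message is absent."""
    match = LAYOUT_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    excerpt = (
        "java.io.IOException:\n"
        "File system image contains an old layout version -55.\n"
        "An upgrade to version -59 is required.\n"
    )
    result = find_layout_mismatch(excerpt)
    if result is not None:
        old, required = result
        print(f"old layout version {old}, required {required}")
```

To scan a real log, read the file contents into a string and pass it to the function; the log file location depends on the installation and is not given in this note.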


Cause
