BDA Node 1 Migration Fails at Step 11 - Both NameNodes in Standby and "Failed to create NodeManager remote application log directory" Errors (Doc ID 1957275.1)

Last updated on OCTOBER 27, 2016

Applies to:

Big Data Appliance Integrated Software - Version 4.0 and later
Linux x86-64

Symptoms

1. BDA Node 1 migration fails at step 11 with:

ERROR: Puppet agent run on node bdanode03 had errors. List of errors follows
************************************
Error [379]: (//bdanode03.example.com//Stage[main]/Hadoop::Startsvc2/Exec[setup_scm]/returns)
change from notrun to 0 failed: /opt/oracle/BDAMammoth/bdaconfig/tmp/setupscm.sh &>
/opt/oracle/BDAMammoth/bdaconfig/tmp/setupscm_1419003491.out returned 1 instead of one of [0]
************************************


2. The analysis below shows that there is no Active NameNode. The NameNode on Node 2 is in Standby, and the NameNode that migrated from Node 1 to Node 6 is in Standby too:

a) /opt/oracle/BDAMammoth/bdaconfig/tmp/setupscm_1419003491.out shows:

Command 3423 finished after 95 seconds
Operation failed
Result Message is:   "Failed to create NodeManager remote application log directory.",


b) Drilling down into /opt/oracle/BDAMammoth/bdaconfig/tmp/commands_3423.out shows:

{
  "id" : 3423,
  "name" : "CreateLogDir",
  "startTime" : "2014-12-19T16:00:00.291Z",
  "endTime" : "2014-12-19T16:01:34.990Z",
  "active" : false,
  "success" : false,
  "resultMessage" : "Failed to create NodeManager remote application log directory.",
  "serviceRef" : {
    "clusterName" : "<cluster_name>",
    "serviceName" : "yarn"
  },
  "children" : {
    "items" : [ {
      "id" : 3424,
      "name" : "CreateDir",
      "startTime" : "2014-12-19T16:00:00.342Z",
      "endTime" : "2014-12-19T16:01:34.988Z",
      "active" : false,
      "success" : false,
      "resultMessage" : "Command aborted because of exception: Command timed-out after 90 seconds",
      "serviceRef" : {
        "clusterName" : "<cluster_name>",
        "serviceName" : "hdfs"
      },
      "roleRef" : {
        "clusterName" : "<cluster_name>",
        "serviceName" : "hdfs",
        "roleName" : "hdfs-NAMENODE-80db8a0b02f500ffe42d890e712b232a"
      }
    } ]
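
When the command output is long, the failing child command can be located programmatically instead of by eye. The following is a minimal Python sketch, not part of the BDA or Cloudera Manager tooling. The function name failed_leaf_commands and the SAMPLE string are hypothetical. SAMPLE is a trimmed copy of the excerpt above with the closing brackets added only so that it parses.

```python
import json


def failed_leaf_commands(command):
    """Return the innermost finished-and-failed commands in a command tree."""
    children = command.get("children", {}).get("items", [])
    # A command counts as failed only once it is no longer active.
    failed = [c for c in children
              if not c.get("active", False) and not c.get("success", False)]
    if not failed:
        own_failure = (not command.get("active", False)
                       and not command.get("success", False))
        return [command] if own_failure else []
    leaves = []
    for child in failed:
        leaves.extend(failed_leaf_commands(child))
    return leaves


# Trimmed illustration of commands_3423.out (closing brackets added here).
SAMPLE = """
{
  "id": 3423,
  "name": "CreateLogDir",
  "active": false,
  "success": false,
  "resultMessage": "Failed to create NodeManager remote application log directory.",
  "children": {
    "items": [ {
      "id": 3424,
      "name": "CreateDir",
      "active": false,
      "success": false,
      "resultMessage": "Command aborted because of exception: Command timed-out after 90 seconds",
      "roleRef": {
        "serviceName": "hdfs",
        "roleName": "hdfs-NAMENODE-80db8a0b02f500ffe42d890e712b232a"
      }
    } ]
  }
}
"""

leaves = failed_leaf_commands(json.loads(SAMPLE))
for leaf in leaves:
    role = leaf.get("roleRef", {}).get("roleName", "(no role)")
    print(leaf["id"], leaf["name"], role, leaf["resultMessage"])
```

For this excerpt the sketch reports child command 3424 (CreateDir), which timed out against an HDFS NameNode role. That points the investigation at the NameNodes rather than at YARN.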


c) Cloudera Manager (CM) > Commands shows:

Create NodeManager Remote Application Log Directory
Failed to create NodeManager remote application log directory
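
The "both NameNodes in Standby" condition can be confirmed from the command line with hdfs haadmin -getServiceState. The following Python sketch wraps that command. It assumes Python 3.7 or later and an hdfs client on the PATH, run as a user permitted to query HA state. The service IDs "namenode1" and "namenode2" are placeholders; the real IDs are the values of dfs.ha.namenodes.<nameservice> in hdfs-site.xml. The helper names are hypothetical.

```python
import subprocess


def get_service_state(service_id):
    """Run `hdfs haadmin -getServiceState <service_id>` and return its output."""
    result = subprocess.run(
        ["hdfs", "haadmin", "-getServiceState", service_id],
        capture_output=True, text=True, check=False,
    )
    return result.stdout.strip().lower()


def summarize_ha(service_ids, get_state=get_service_state):
    """Return (states, active_ids) for the given NameNode service IDs."""
    states = {sid: get_state(sid) for sid in service_ids}
    active = [sid for sid, state in states.items() if state == "active"]
    return states, active


def describe(states, active):
    """Build a one-line summary of the HA state."""
    if active:
        return "Active NameNode: " + ", ".join(active)
    return "No Active NameNode; states: " + str(states)


if __name__ == "__main__":
    # Placeholder service IDs; replace with the cluster's own values.
    print(describe(*summarize_ha(["namenode1", "namenode2"])))
```

If no service ID reports "active", HDFS operations such as the CreateDir child command cannot complete, which is consistent with the 90-second timeout shown above.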

Cause
