BDA Node 2 Reprovision Fails at Step 11 - Yarn Down, NodeManager Fails to Come Up with "First error: -8192 is less than the minimum allowed value 0." (Doc ID 1957237.1)

Last updated on DECEMBER 10, 2019

Applies to:

Big Data Appliance Integrated Software - Version 4.0 and later
Linux x86-64

Symptoms

NOTE: In the images, examples and document that follow, user details, cluster names, hostnames, directory paths, filenames, etc. represent a fictitious sample (and are used to provide an illustrative example only). Any similarity to actual persons, or entities, living or dead, is purely coincidental and not intended in any manner. 

 
1. BDA Node 2 Migration fails at step 11 with:

Error [1171]: (//bdanode04.example.com//Stage[main]/Hadoop::Startsvc2/Exec[setup_scm]/returns) change from notrun to 0 failed: /opt/oracle/BDAMammoth/bdaconfig/tmp/setupscm.sh &> /opt/oracle/BDAMammoth/bdaconfig/tmp/setupscm_1418944098.out returned 1 instead of one of [0]


2. The /opt/oracle/BDAMammoth/bdaconfig/tmp/setupscm_1418944098.out file shows that the yarn service cannot be started ("Failed to start service"):

Operation failed
Result Message is:   "Failed to restart service.",
API Version used is v7
Succeeded. Output in : /opt/oracle/BDAMammoth/bdaconfig/tmp/clusters_gdch01-cluster_services_yarn_commands_start.out
Command ID is 11173
....
Command 11173 finished after 25 seconds
Operation failed
Result Message is:   "Failed to start service.",


3. On Node 4, commands_11173.out again reports "Failed to start service" for yarn:

# cat commands_11173.out
{
  "id" : 11173,
  "name" : "YarnOrderedStart",
  "startTime" : "2014-12-18T23:22:12.089Z",
  "endTime" : "2014-12-18T23:22:35.256Z",
  "active" : false,
  "success" : false,
  "resultMessage" : "Failed to start service.",
  "serviceRef" : {
    "clusterName" : "<cluster_name>-cluster",
    "serviceName" : "yarn"
  },
  "children" : {
    "items" : [ {
      "id" : 11174,
      "name" : "Start",
      "startTime" : "2014-12-18T23:22:12.118Z",
      "endTime" : "2014-12-18T23:22:35.255Z",
      "active" : false,
      "success" : false,
      "resultMessage" : "Command completed with 16/17 successful subcommands",
      "serviceRef" : {
        "clusterName" : "<cluster_name>-cluster",
        "serviceName" : "yarn"
      }
    } ]
  }
}
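To make the failure chain in a command output file like this easier to read, the following is a minimal Python sketch that walks the command and its children and lists every entry with "success" : false. The function name failed_commands and the inline sample are illustrative assumptions; the sample reuses only the fields shown in the output above, and the field layout is assumed to match that output.

```python
import json


def failed_commands(command):
    """Yield (id, name, resultMessage) for a command and each descendant that failed."""
    if not command.get("success", True):
        yield command["id"], command["name"], command.get("resultMessage", "")
    # Child commands, when present, are nested under children.items.
    for child in command.get("children", {}).get("items", []):
        yield from failed_commands(child)


# Sample trimmed from the commands_11173.out content shown above.
sample = """
{
  "id": 11173,
  "name": "YarnOrderedStart",
  "success": false,
  "resultMessage": "Failed to start service.",
  "children": {
    "items": [
      {
        "id": 11174,
        "name": "Start",
        "success": false,
        "resultMessage": "Command completed with 16/17 successful subcommands"
      }
    ]
  }
}
"""

report = list(failed_commands(json.loads(sample)))
for command_id, name, message in report:
    print(f"{command_id} {name}: {message}")
```

For the sample above, this lists the parent command 11173 (YarnOrderedStart) followed by its child 11174 (Start), which is the one reporting 16/17 successful subcommands.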


4. Cloudera Manager (CM) shows the yarn service down and CM > All Recent Commands shows "Failed to start service" for yarn restart.

5. CM shows both Resource Managers (Nodes 4/5) are down.  The CM logs show:

For Node 4 Resource Manager

5:08:25.145 PM  ERROR  org.apache.hadoop.yarn.server.resourcemanager.ResourceManager  Returning, interrupted : java.lang.InterruptedException
5:08:25.146 PM  INFO  org.apache.hadoop.yarn.util.AbstractLivelinessMonitor  AMLivelinessMonitor thread interrupted
5:08:25.146 PM  INFO  org.apache.hadoop.yarn.util.AbstractLivelinessMonitor  AMLivelinessMonitor thread interrupted
5:08:25.146 PM  ERROR  org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager  ExpiredTokenRemover received java.lang.InterruptedException: sleep interrupted
5:08:25.146 PM  INFO  org.apache.hadoop.yarn.util.AbstractLivelinessMonitor  org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.ContainerAllocationExpirer thread interrupted
5:08:25.150 PM  INFO  org.apache.hadoop.yarn.server.resourcemanager.ResourceManager  Transitioned to standby state
5:08:25.150 PM  INFO  org.apache.hadoop.yarn.server.resourcemanager.ResourceManager  SHUTDOWN_MSG:


For Node 5 Resource Manager

5:07:32.521 PM  INFO  org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger  USER=yarn    OPERATION=transitionToStandby TARGET=RMHAProtocolService    RESULT=SUCCESS
5:08:24.990 PM  ERROR  org.apache.hadoop.yarn.server.resourcemanager.ResourceManager  RECEIVED SIGNAL 15: SIGTERM


6. Both Resource Managers can be started manually in CM; however, the NodeManager on Node 2 fails to come up, and CM > Commands reports:

Command failed to run because this role has invalid configuration. Review and correct its configuration.  First error: -8192 is less than the minimum allowed value 0.





7. The NodeManager on Node 2 has the property "yarn.nodemanager.resource.memory-mb" set to -8192.
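
As a rough illustration of the check that this value fails, the Python sketch below reads "yarn.nodemanager.resource.memory-mb" from text in the standard Hadoop configuration XML layout and compares it with a minimum of 0. It is not Cloudera Manager's validation code: the helper names and the inline sample XML are hypothetical, the message text is only modeled on the error quoted in symptom 6, and where the effective NodeManager configuration is stored on a given cluster is not covered here.

```python
import xml.etree.ElementTree as ET

PROPERTY = "yarn.nodemanager.resource.memory-mb"


def read_property(xml_text, name):
    """Return the value of a named property from Hadoop-style configuration XML, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


def check_memory_mb(xml_text, minimum=0):
    """Return a problem description, or None if the value is at or above the minimum."""
    raw = read_property(xml_text, PROPERTY)
    if raw is None:
        return f"{PROPERTY} is not set"
    value = int(raw)
    if value < minimum:
        return f"{value} is less than the minimum allowed value {minimum}."
    return None


# Hypothetical configuration fragment reproducing the value seen on Node 2.
sample_xml = """<configuration>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>-8192</value>
  </property>
</configuration>"""

problem = check_memory_mb(sample_xml)
print(problem)
```

With the sample fragment, the check flags the negative value; a fragment with a non-negative value returns None.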





Cause


