On Oracle Big Data Appliance BDA 4.1/CDH5.3.0, Yarn NodeManager(s) Crash Intermittently (Doc ID 2024032.1)

Last updated on AUGUST 03, 2021

Applies to:

Big Data Appliance Integrated Software - Version 4.1.0 and later
Linux x86-64

Symptoms

On Oracle Big Data Appliance, YARN NodeManager(s) go down intermittently, for example about once a week.

The following errors appear in hadoop-cmf-yarn-NODEMANAGER-<BDANode>.log.out.gz:

2015-06-02 01:11:21,166 ERROR org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService: Failed to setup application log directory for application_<ID>
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 210327 for <NAME>) can't be found in cache
  at org.apache.hadoop.ipc.Client.call(Client.java:1411)
  at org.apache.hadoop.ipc.Client.call(Client.java:1364)
...
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token <TOKEN_NUM> for <NAME>) can't be found in cache
  at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.createAppDir(LogAggregationService.java:301)
...
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token <TOKEN_NUM> for <NAME>) can't be found in cache
  at org.apache.hadoop.ipc.Client.call(Client.java:1411)
....
2015-06-02 01:11:21,238 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Log Aggregation service failed to initialize, there will be no logs for this application
2015-06-02 01:11:21,240 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Removing uninitialized application application_<ID>
2015-06-02 01:11:21,241 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event APPLICATION_STOP for appId application_<ID>
2015-06-02 01:11:21,245 ERROR org.apache.hadoop.yarn.server.nodemanager.security.NMTokenSecretManagerInNM: No application Attempt for application : application_<ID> started on this NM.
...

2015-06-02 01:11:21,303 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://<CLUSTER_NAME>-ns/user/<PATH>l/application/, <ID>, FILE, null }
2015-06-02 01:11:21,309 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
java.lang.IllegalArgumentException: Can not create a Path from an empty string
  at org.apache.hadoop.fs.Path.checkPathArg(Path.java:127)
  at org.apache.hadoop.fs.Path.<init>(Path.java:135)
  at org.apache.hadoop.fs.Path.<init>(Path.java:94)
  at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl.getPathForLocalization(LocalResourcesTrackerImpl.java:420)
  at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.addResource(ResourceLocalizationService.java:773)
  at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:687)
  at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:629)
  at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
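
To check whether a given NodeManager log shows the same pattern, the signatures in the excerpt above can be searched for directly. The following is a minimal sketch, not an Oracle-provided tool: the function names (scan_lines, scan_gzip_log) and the signature labels are hypothetical, and the regular expressions are assumptions derived only from the message text shown in this note.

```python
import gzip
import re

# Signatures taken from the log excerpt in this note. The labels on the
# left are illustrative names, not Hadoop or Oracle identifiers.
SIGNATURES = {
    "invalid_token": re.compile(
        r"HDFS_DELEGATION_TOKEN token \S+ for .+ can't be found in cache"
    ),
    "dispatcher_fatal": re.compile(
        r"FATAL org\.apache\.hadoop\.yarn\.event\.AsyncDispatcher: "
        r"Error in dispatcher thread"
    ),
    "empty_path": re.compile(r"Can not create a Path from an empty string"),
}


def scan_lines(lines):
    """Return (line_number, signature_label) pairs for every matching line."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits.append((number, label))
    return hits


def scan_gzip_log(path):
    """Scan a gzip-compressed NodeManager log such as a rotated .log.out.gz."""
    # errors="replace" keeps the scan going if the log contains stray bytes.
    with gzip.open(path, "rt", errors="replace") as handle:
        return scan_lines(handle)
```

Calling scan_gzip_log on a rotated log file (the path is whatever the local file is called) returns the line numbers where each signature occurs. An "invalid_token" hit followed shortly by "dispatcher_fatal" and "empty_path" hits would correspond to the sequence shown in the excerpt.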
