On Oracle Big Data Appliance, how does Resource Manager Choose a Node to Launch Application Master in Case of Failure?
(Doc ID 2114362.1)
Last updated on JULY 05, 2021
Applies to:
Big Data Appliance Integrated Software - Version 4.1.0 to 4.3.0 [Release 4.1 to 4.3]
Linux x86-64
Goal
On Big Data Appliance, YARN jobs are failing because the local directory where the job is launched is full. That failure is expected. The issue is that the jobs are reattempted on the same node even after the first attempt fails:
Application application_1448106546957_15089 failed 2 times due to AM Container for appattempt_1448106546957_15089_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://<RM>:8088/proxy/application_1448106546957_15089/Then, click on links to logs of each attempt.
Diagnostics: Not able to initialize app-log directories in any of the configured local directories for app application_1448106546957_15089
Failing this attempt. Failing the application.
Is there a way to configure YARN to choose a different node after one failed attempt, or does the Resource Manager (RM) assign the Application Master (AM) container at random?
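Before changing any configuration, it can help to confirm that the attempts really did land on the same node. The following is a minimal Python sketch, assuming the ResourceManager REST API exposes an application-attempts endpoint at /ws/v1/cluster/apps/<application id>/appattempts whose entries carry "id" and "nodeId" fields; verify the path and field names against the Hadoop version on your appliance. The host names, the port 8041, and the sample payload are hypothetical and are not taken from the failing cluster.

```python
import json
from collections import Counter
from urllib.request import urlopen


def attempts_url(rm_host, app_id, port=8088):
    """Build the app-attempts URL (endpoint path is an assumption to verify)."""
    return f"http://{rm_host}:{port}/ws/v1/cluster/apps/{app_id}/appattempts"


def nodes_per_attempt(payload):
    """Map attempt id -> nodeId from a parsed appattempts response."""
    attempts = (payload.get("appAttempts") or {}).get("appAttempt") or []
    return {a["id"]: a.get("nodeId") for a in attempts}


def reused_nodes(payload):
    """Return the nodes that hosted more than one AM attempt."""
    counts = Counter(nodes_per_attempt(payload).values())
    return sorted(n for n, c in counts.items() if n and c > 1)


def fetch_attempts(rm_host, app_id):
    """Fetch and parse the attempts list from a live ResourceManager."""
    with urlopen(attempts_url(rm_host, app_id), timeout=10) as resp:
        return json.load(resp)


# Hypothetical response shape, used here so the sketch runs without a cluster.
SAMPLE = {
    "appAttempts": {
        "appAttempt": [
            {"id": 1, "nodeId": "node03.example.com:8041"},
            {"id": 2, "nodeId": "node03.example.com:8041"},
        ]
    }
}

if __name__ == "__main__":
    print(nodes_per_attempt(SAMPLE))
    print(reused_nodes(SAMPLE))
```

Against a live cluster, reused_nodes(fetch_attempts("<RM>", "application_1448106546957_15089")) would list any node that hosted more than one attempt. An empty list would suggest the attempts were spread across different nodes.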
Solution