On Oracle Big Data Appliance, How Does the Resource Manager Choose a Node to Launch the Application Master in Case of Failure?
Last updated on OCTOBER 11, 2016
Applies to: Big Data Appliance Integrated Software - Version 4.1.0 to 4.3.0 [Release 4.1 to 4.3]
On Big Data Appliance, YARN jobs are failing because the local directory on the node where the job is launched is full. That failure is expected. The issue is that the jobs are reattempted on the same node even after the first attempt has failed. The diagnostics for the failed application read:
For more detailed output, check application tracking page:http://<RM>:8088/proxy/application_1448106546957_15089/Then, click on links to logs of each attempt.
Diagnostics: Not able to initialize app-log directories in any of the configured local directories for app application_1448106546957_15089
Failing this attempt. Failing the application.
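The "full local directory" condition behind this diagnostic can be confirmed on a suspect node with a short script. The following is a minimal sketch, not part of the original note: the function name `full_directories`, the 90 percent threshold, and the use of the system temporary directory as a stand-in path are all assumptions for illustration. On a real node, substitute the directories configured for the NodeManager.

```python
import shutil
import tempfile


def full_directories(paths, max_used_percent=90.0):
    """Return the paths whose filesystem usage exceeds max_used_percent.

    Hypothetical helper for illustration; the threshold is an assumed value.
    """
    full = []
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
        if usage.total == 0:
            # Skip pseudo-filesystems that report no capacity.
            continue
        used_percent = 100.0 * usage.used / usage.total
        if used_percent > max_used_percent:
            full.append(path)
    return full


if __name__ == "__main__":
    # Placeholder path: replace with the node's configured local directories.
    candidates = [tempfile.gettempdir()]
    for path in full_directories(candidates):
        print(f"{path} is above the usage threshold")
```

The percentage computed here is used bytes over total bytes, which can differ slightly from what `df` reports on filesystems that reserve blocks for root.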
Is there a way to configure YARN to choose a different node after one failed attempt, or does the Resource Manager (RM) assign the Application Master (AM) container to a node at random?