YARN Service in "Bad Health" / JobHistory Server in "Bad Health" after BDA V3.0.1 New Install - Cannot Create /user/history (Doc ID 1903811.1)

Last updated on July 10, 2014

Applies to:

Big Data Appliance Integrated Software - Version 3.0.1 and later
x86_64

Symptoms

During the Oracle Big Data Appliance V3.0.1 Mammoth install, Step 11 fails as described in "Oracle Big Data Appliance V3.0.1 Mammoth Install Step 11 Fails - NameNodes report: NameNode is not formatted" (Doc ID 1903785.1).

Once that issue is resolved and the Mammoth 3.0.1 install completes, (a) the Mammoth cluster validations report errors and (b) Cloudera Manager (CM) shows the 'yarn' service in "Bad Health".

1. Step 17 of the Mammoth install (CleanupInstall) reports errors:

 ERROR: Errors in BDA cluster check
 ERROR: Errors in TeraSort run
 ERROR: Errors in Oozie workflow run

bdacheckcluster reports:

ERROR: yarn is in bad health

 

2. Looking at the 'yarn' service shows:
a) NodeManager is healthy
b) ResourceManagers are healthy
c) JobHistory Server is not healthy


The JobHistory Server logs show that /user/history/done cannot be created.

This can be seen in the JobHistory Server logs in CM and in the log file on the JobHistory node (node 3):

/var/log/hadoop-mapreduce/hadoop-cmf-yarn-JOBHISTORY-bdanode03.example.com.log.out

The log reports:

2014-06-30 18:34:19,292 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager
     failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
     Error creating done directory: [hdfs://edw-cdh-ns:8020/user/history/done]
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Error creating done directory:
[hdfs://<cluster>-ns:8020/user/history/done]
       at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.tryCreatingHistoryDirs(HistoryFileManager.java:587)
       at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.createHistoryDirs(HistoryFileManager.java:533)
       at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.serviceInit(HistoryFileManager.java:501)

and

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
   Permission denied: user=mapred, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
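
The "Permission denied" line above can be read field by field: the requesting user, the access requested, and the inode's path, owner, group, and mode. The following is a minimal Python sketch, not part of the original note, that parses such a line and reports which permission triplet applied. The function name explain_denial and the user_groups parameter are hypothetical, and the group membership of 'mapred' is an assumption that the caller must supply, since the log does not state it.

```python
import re

# Pattern for the HDFS "Permission denied" message format shown in the log excerpt.
DENIED = re.compile(
    r'Permission denied: user=(?P<user>[^,]+), access=(?P<access>\w+), '
    r'inode="(?P<path>[^"]+)":(?P<owner>[^:]+):(?P<group>[^:]+):(?P<mode>[-a-z]{10})'
)

# Letters that each requested access type needs in the mode triplet.
NEEDED = {"READ": "r", "WRITE": "w", "EXECUTE": "x"}


def explain_denial(line, user_groups=()):
    """Parse a denial line and return its fields plus the triplet that applied.

    user_groups is an assumption supplied by the caller: the groups the
    requesting user belongs to. The log line itself does not contain them.
    """
    match = DENIED.search(line)
    if match is None:
        return None
    info = match.groupdict()
    mode = info["mode"]  # e.g. "drwxr-xr-x": type flag, then owner/group/other
    if info["user"] == info["owner"]:
        info["class"], info["bits"] = "owner", mode[1:4]
    elif info["group"] in user_groups:
        info["class"], info["bits"] = "group", mode[4:7]
    else:
        info["class"], info["bits"] = "other", mode[7:10]
    required = [NEEDED[p] for p in info["access"].split("_") if p in NEEDED]
    info["allowed"] = bool(required) and all(c in info["bits"] for c in required)
    return info


line = ('Permission denied: user=mapred, access=WRITE, '
        'inode="/user":hdfs:supergroup:drwxr-xr-x')
result = explain_denial(line)
print(result["class"], result["bits"], result["allowed"])  # other r-x False
```

For the line in this note, both the group and the other triplets are r-x, so the WRITE request on /user is refused whether or not 'mapred' is a member of 'supergroup'.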

 

Note: This can be a secure or non-secure system.


3. Listing the directories in HDFS shows that /user/history has not been created.
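
One way to confirm which directories are missing is to probe each path with the standard `hdfs dfs -test -d` command, which exits with status 0 when the path is an existing directory. The sketch below is illustrative only and is not taken from the original note. The helper names hdfs_dir_exists and missing_dirs are hypothetical, and it assumes that the `hdfs` client is on the PATH of a cluster node and, on a secure cluster, that a valid Kerberos ticket is already held.

```python
import subprocess


def hdfs_dir_exists(path, run=subprocess.run):
    """Return True if `hdfs dfs -test -d <path>` exits with status 0.

    The `run` parameter defaults to subprocess.run and exists only so the
    check can be exercised without a live cluster (hypothetical helper).
    """
    result = run(["hdfs", "dfs", "-test", "-d", path])
    return result.returncode == 0


def missing_dirs(paths, run=subprocess.run):
    """Return the subset of `paths` that are not existing HDFS directories."""
    return [p for p in paths if not hdfs_dir_exists(p, run)]
```

On an affected cluster, calling missing_dirs(["/user", "/user/history", "/user/history/done"]) would be expected to return the two history paths, consistent with the symptom described above.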

  

Cause
