
Yarn Service in "Bad Health" / JobHistory Server in "Bad Health" after BDA V3.0.1 New Install - cannot create /user/history (Doc ID 1903811.1)

Last updated on NOVEMBER 04, 2019

Applies to:

Big Data Appliance Integrated Software - Version 3.0.1 and later
x86_64

Symptoms

NOTE: In the examples that follow, user details, cluster names, hostnames, directory paths, filenames, etc. represent a fictitious sample (and are used to provide an illustrative example only). Any similarity to actual persons, or entities, living or dead, is purely coincidental and not intended in any manner.

  

During the Oracle Big Data Appliance V3.0.1 Mammoth Install, Step 11 fails as described in: Oracle Big Data Appliance V3.0.1 Mammoth Install Step 11 Fails - NameNodes report: NameNode is not formatted (Doc ID 1903785.1)

Once that issue is resolved and the Mammoth 3.0.1 install completes, (a) the Mammoth cluster validations report errors and (b) Cloudera Manager (CM) shows the 'yarn' service in "Bad Health".

1. Step 17 of the Mammoth install (CleanupInstall) reports errors:

 ERROR: Errors in BDA cluster check
 ERROR: Errors in TeraSort run
 ERROR: Errors in Oozie workflow run

bdacheckcluster reports:

ERROR: yarn is in bad health

 

2. Looking at the 'yarn' service shows:
a) NodeManager is healthy
b) ResourceManagers are healthy
c) JobHistory Server is not healthy


The JobHistory logs show that /user/history/done cannot be created.

This is observed in the JobHistory Server logs in CM and in the log on the JobHistory node (node 3) under:

/var/log/hadoop-mapreduce/hadoop-cmf-yarn-JOBHISTORY-bdanode03.example.com.log.out
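
To locate the relevant entries quickly, a minimal Python sketch such as the following can scan the log for the two messages of interest. The function name find_history_dir_errors and the choice of marker strings are assumptions made for illustration; they are not part of any Oracle or Cloudera tooling.

```python
from typing import Iterable, List, Tuple

# Substrings to look for. These are illustrative choices based on the
# JobHistory Server failure described in this note.
MARKERS = (
    "Error creating done directory",
    "AccessControlException",
)


def find_history_dir_errors(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line that contains a marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(marker in line for marker in MARKERS):
            hits.append((number, line.rstrip("\n")))
    return hits
```

The function accepts any iterable of lines, so an open file handle for the log above can be passed in directly, for example `find_history_dir_errors(open(path, errors="replace"))`.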

The log reports:

2014-06-30 18:34:19,292 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager
     failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException:
     Error creating done directory: [hdfs://<CLUSTER_NAME>-ns:8020/user/history/done]
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Error creating done directory:
[hdfs://<CLUSTER_NAME>-ns:8020/user/history/done]
       at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.tryCreatingHistoryDirs(HistoryFileManager.java:587)
       at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.createHistoryDirs(HistoryFileManager.java:533)
       at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.serviceInit(HistoryFileManager.java:501)

and

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
   Permission denied: user=mapred, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
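
The exception can be read as follows: the 'mapred' user needs WRITE access on /user in order to create /user/history, but /user is owned by hdfs:supergroup with mode drwxr-xr-x, so only the owner may write. The Python sketch below parses the message and reports which permission triplet applies to the requesting user. It is a simplified model that assumes the basic owner/group/other check only; it ignores HDFS ACLs and the superuser bypass. The names parse_denied and applicable_bits are hypothetical helpers.

```python
import re
from typing import Dict, Iterable, Optional

# Matches the "Permission denied" text of an HDFS AccessControlException.
DENIED = re.compile(
    r'Permission denied: user=(?P<user>[^,]+), access=(?P<access>\w+), '
    r'inode="(?P<path>[^"]*)":(?P<owner>[^:]+):(?P<group>[^:]+):'
    r'(?P<mode>[d-][rwxtTsS-]{9})'
)


def parse_denied(message: str) -> Optional[Dict[str, str]]:
    """Extract user, access, path, owner, group and mode from the message."""
    match = DENIED.search(message)
    return match.groupdict() if match else None


def applicable_bits(info: Dict[str, str], user_groups: Iterable[str] = ()) -> str:
    """Return the rwx triplet that applies to the requesting user.

    Simplified model: owner bits if the user owns the inode, group bits if
    the user belongs to the inode's group, otherwise the 'other' bits.
    """
    mode = info["mode"][1:]  # drop the leading file-type character
    if info["user"] == info["owner"]:
        return mode[0:3]
    if info["group"] in set(user_groups):
        return mode[3:6]
    return mode[6:9]
```

For the message above, the applicable triplet for 'mapred' contains no 'w' whether or not the user belongs to 'supergroup', which is consistent with the denied WRITE.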

 

Note: The issue can occur on both secure and non-secure systems.


3. Listing the directories in HDFS shows that /user/history has not been created.
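
One way to confirm this from a cluster node is to test for the directory with the HDFS shell. The following Python sketch wraps `hdfs dfs -test -d`, which exits with status 0 when the path exists and is a directory. The helper name hdfs_dir_exists and the injectable `run` parameter are assumptions added for illustration; the sketch also assumes the `hdfs` client is on the PATH and, on a secure cluster, that the calling user already holds a valid Kerberos ticket.

```python
import subprocess
from typing import Callable, Optional, Sequence


def hdfs_dir_exists(
    path: str,
    run: Optional[Callable[[Sequence[str]], int]] = None,
) -> bool:
    """Return True if `hdfs dfs -test -d <path>` exits with status 0."""
    if run is None:
        # Default runner: execute the command and return its exit status.
        def run(argv: Sequence[str]) -> int:
            return subprocess.run(list(argv)).returncode

    return run(["hdfs", "dfs", "-test", "-d", path]) == 0
```

In the situation described here, `hdfs_dir_exists("/user/history")` would be expected to return False while `hdfs_dir_exists("/user")` returns True.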

  

Cause


