Cloudera Manager reports "Bad" cluster health after Regenerating Kerberos Credentials - Canary test fails to create parent directory on BDA (Doc ID 1671101.1)

Last updated on MAY 30, 2014

Applies to:

Big Data Appliance Integrated Software - Version 2.5.0 and later
Linux x86-64

Symptoms

Cloudera Manager (CM) shows "Bad" health for the cluster after Kerberos credentials are regenerated in Cloudera Manager using the steps in "Steps to Regenerate Kerberos Credentials thru Cloudera Manager" (Doc ID 1614191.1). The Alert Summary in the CM logs shows:

ALERT_SUMMARY The health of service hdfs has become bad.

The full message is:

The health test result for HDFS_CANARY_HEALTH has become bad: Canary test failed to create parent directory for /tmp/.cloudera_health_monitoring_canary_files.

Service: hdfs Category: HEALTH_CHECK

ALERT_SUMMARY The health of service hdfs has become bad.

BAD_TEST_RESULTS 1

CATEGORY HEALTH_CHECK

CLUSTER <cluster_name>

CLUSTER_ID 1

CURRENT_COMPLETE_HEALTH_TEST_RESULTS {"content":"The health test result for HDFS_CANARY_HEALTH has become bad: Canary test failed to create parent directory for /tmp/.cloudera_health_monitoring_canary_files.","testName":"HDFS_CANARY_HEALTH", "messageCodes":["HEALTH_TEST_BAD","HEALTH_TEST_HDFS_CANARY_FAIL_MKDIR_RESULT"], "eventCode":"EV_SERVICE_HEALTH_CHECK_BAD","severity":"CRITICAL"},{"content": "The health test result for HDFS_HA_NAMENODE_HEALTH has become good: The active NameNode is running with good health.", "testName":"HDFS_HA_NAMENODE_HEALTH","messageCodes":
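When triaging an alert like the one above, it can help to list only the health tests that are reported as bad. The following is a minimal sketch, assuming the CURRENT_COMPLETE_HEALTH_TEST_RESULTS value is a comma-separated run of JSON objects that may be cut off at the end, as it is in the excerpt above. The function names parse_health_results and bad_tests are hypothetical helpers and are not part of Cloudera Manager.

```python
import json


def parse_health_results(text):
    """Decode every complete JSON object in a comma-separated string.

    Decoding stops quietly at the first object that cannot be parsed,
    which covers a value that was truncated in the alert output.
    """
    decoder = json.JSONDecoder()
    results = []
    pos = 0
    while pos < len(text):
        # Skip commas and whitespace between objects.
        while pos < len(text) and text[pos] in ", \t\r\n":
            pos += 1
        if pos >= len(text):
            break
        try:
            obj, pos = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            break  # incomplete trailing object
        if isinstance(obj, dict):
            results.append(obj)
    return results


def bad_tests(results):
    """Return the testName of each result flagged HEALTH_TEST_BAD."""
    return [
        r.get("testName")
        for r in results
        if "HEALTH_TEST_BAD" in r.get("messageCodes", [])
    ]


if __name__ == "__main__":
    # Abbreviated from the alert above; the second object is cut off on purpose.
    sample = (
        '{"content":"Canary test failed to create parent directory.",'
        '"testName":"HDFS_CANARY_HEALTH",'
        '"messageCodes":["HEALTH_TEST_BAD",'
        '"HEALTH_TEST_HDFS_CANARY_FAIL_MKDIR_RESULT"],'
        '"severity":"CRITICAL"},'
        '{"content":"The active NameNode is running with good health.",'
        '"testName":"HDFS_HA_NAMENODE_HEALTH","messageCodes":'
    )
    print(bad_tests(parse_health_results(sample)))
```

The sketch only filters what the alert already reports; it does not query Cloudera Manager.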

  

Additionally:

1. The hdfs service reports:

The health test result for HDFS_CANARY_HEALTH has become bad: Canary test failed to create parent directory for /tmp/.cloudera_health_monitoring_canary_files.

2. Logs under /var/log/hadoop-hdfs show additional error messages such as:

ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:oracle@EXAMPLE.COM (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: Permission denied by sticky bit setting: user=oracle, inode="/tmp/.cloudera_health_monitoring_canary_files":hdfs:supergroup:drwxrwxrwx


ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:oracle@EXAMPLE.COM (auth:KERBEROS) cause:org.apache.hadoop.ipc.StandbyException: Operation category READ is not supported in state standby

  

3. If you manually delete /tmp/.cloudera_health_monitoring_canary_files, you cannot recreate it.

4. Changing the permissions, the sticky bit, or other attributes of the directory does not resolve the problem.
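To compare the requesting user with the owner and mode of the affected directory, the AccessControlException message quoted in item 2 can be split into its fields. The following is a minimal sketch, assuming the log lines follow the layout of that sample (as:principal, auth method, user=..., inode="path":owner:group:mode). The pattern and the function name parse_permission_denied are hypothetical and are not part of Hadoop or Cloudera Manager.

```python
import re

# Field layout inferred from the sample AccessControlException line only.
DENIED = re.compile(
    r'as:(?P<principal>\S+) \(auth:(?P<auth>\w+)\).*?'
    r'user=(?P<user>[^,\s]+), inode="(?P<path>[^"]+)":'
    r'(?P<owner>[^:]+):(?P<group>[^:]+):(?P<mode>[A-Za-z-]{10})'
)


def parse_permission_denied(line):
    """Return the fields of a permission-denied log line, or None if no match."""
    match = DENIED.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    line = (
        'ERROR org.apache.hadoop.security.UserGroupInformation: '
        'PriviledgedActionException as:oracle@EXAMPLE.COM (auth:KERBEROS) '
        'cause:org.apache.hadoop.security.AccessControlException: '
        'Permission denied by sticky bit setting: user=oracle, '
        'inode="/tmp/.cloudera_health_monitoring_canary_files"'
        ':hdfs:supergroup:drwxrwxrwx'
    )
    fields = parse_permission_denied(line)
    if fields:
        for name, value in fields.items():
            print(f"{name}: {value}")
```

Lines that do not carry an inode field, such as the StandbyException message, return None and can be skipped.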

Changes

The problem occurs after regenerating Kerberos credentials in Cloudera Manager.

Cause
