
Cloudera Manager reports "Bad" cluster health after Regenerating Kerberos Credentials - Canary test fails to create parent directory on BDA (Doc ID 1671101.1)

Last updated on OCTOBER 18, 2019

Applies to:

Big Data Appliance Integrated Software - Version 2.5.0 and later
Linux x86-64

Symptoms

NOTE: In the examples that follow, user details, cluster names, hostnames, directory paths, filenames, etc. represent a fictitious sample (and are used to provide an illustrative example only).  Any similarity to actual persons, or entities, living or dead, is purely coincidental and not intended in any manner.

  

Cloudera Manager (CM) shows "Bad" health for the cluster after Kerberos credentials are regenerated in Cloudera Manager using the steps in: Steps to Regenerate Kerberos Credentials thru Cloudera Manager (Doc ID 1614191.1).  The Alert Summary in the CM log shows:

ALERT_SUMMARY The health of service hdfs has become bad.

 Full message is:

The health test result for HDFS_CANARY_HEALTH has become bad: Canary test failed to create parent directory for /tmp/.cloudera_health_monitoring_canary_files.

Service: hdfs Category: HEALTH_CHECK

ALERT_SUMMARY The health of service hdfs has become bad.

BAD_TEST_RESULTS 1

CATEGORY HEALTH_CHECK

CLUSTER <CLUSTER_NAME>

CLUSTER_ID 1

CURRENT_COMPLETE_HEALTH_TEST_RESULTS {"content":"The health test result for HDFS_CANARY_HEALTH has become bad: Canary test failed to create parent directory for /tmp/.cloudera_health_monitoring_canary_files.","<TEST_NAME>":"HDFS_CANARY_HEALTH", "messageCodes":["HEALTH_TEST_BAD","HEALTH_TEST_HDFS_CANARY_FAIL_MKDIR_RESULT"], "eventCode":"EV_SERVICE_HEALTH_CHECK_BAD","severity":"CRITICAL"},{"content": "The health test result for HDFS_HA_NAMENODE_HEALTH has become good: The active NameNode is running with good health.", "<TEST_NAME>":"HDFS_HA_NAMENODE_HEALTH","messageCodes":
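When an alert carries many health test results, it can help to reduce the CURRENT_COMPLETE_HEALTH_TEST_RESULTS text to a list of test names and states. The following is a minimal Python sketch, not part of the original note. It assumes only the "The health test result for ... has become ..." wording shown above, and the function name summarize_health_results is hypothetical. It uses a regular expression rather than a JSON parser so that it also works on truncated alert text like the excerpt above.

```python
import re

# Matches the sentence that opens each health test result in the alert text.
# Assumption: every result uses the wording shown in the alert excerpt.
RESULT_RE = re.compile(r"The health test result for (\w+) has become (\w+):")


def summarize_health_results(raw):
    """Return a dict mapping each health test name to its reported state."""
    return {name: state for name, state in RESULT_RE.findall(raw)}


def bad_tests(raw):
    """Return the names of the tests whose state is 'bad', in sorted order."""
    summary = summarize_health_results(raw)
    return sorted(name for name, state in summary.items() if state == "bad")
```

Passing the alert text shown above to bad_tests would report HDFS_CANARY_HEALTH as the only failing test, while HDFS_HA_NAMENODE_HEALTH is reported as good.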

  

Additionally:

1. The hdfs service reports:

The health test result for HDFS_CANARY_HEALTH has become bad: Canary test failed to create parent directory for /tmp/.cloudera_health_monitoring_canary_files.

2. Logs under /var/log/hadoop-hdfs show additional error messages such as:

ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:<user>@EXAMPLE.COM (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: Permission denied by sticky bit setting: user=<user>, inode="/tmp/.cloudera_health_monitoring_canary_files":hdfs:supergroup:drwxrwxrwx


ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:<user>@EXAMPLE.COM (auth:KERBEROS) cause:org.apache.hadoop.ipc.StandbyException: Operation category READ is not supported in state standby

  

3. If you manually delete /tmp/.cloudera_health_monitoring_canary_files, you cannot recreate it.

4. Changing the permissions, the sticky bit, or other attributes of the directory does not resolve the problem.
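To confirm that a cluster shows the same log signatures as in item 2, the two error messages can be matched programmatically. The following is a minimal Python sketch, not part of the original note. The function name classify_log_line and the returned field names are hypothetical, and the patterns assume the exact message wording shown above.

```python
import re

# Pattern for the sticky-bit AccessControlException shown in item 2.
STICKY_RE = re.compile(
    r"Permission denied by sticky bit setting: user=(?P<user>[^,]+), "
    r'inode="(?P<path>[^"]+)":(?P<owner>[^:]+):(?P<group>[^:]+):(?P<mode>\S+)'
)

# Pattern for the StandbyException shown in item 2.
STANDBY_RE = re.compile(
    r"StandbyException: Operation category (?P<category>\w+) "
    r"is not supported in state standby"
)


def classify_log_line(line):
    """Return a dict describing a matching error line, or None if no match."""
    match = STICKY_RE.search(line)
    if match:
        return {"kind": "sticky_bit_denied", **match.groupdict()}
    match = STANDBY_RE.search(line)
    if match:
        return {"kind": "standby_namenode", **match.groupdict()}
    return None
```

Applied to each line of a log file under /var/log/hadoop-hdfs, the function returns the user, path, owner, group, and mode for sticky-bit denials, which makes it easy to check whether the denied path is the canary directory.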

Changes

The problem occurs after regenerating Kerberos credentials in Cloudera Manager.

Cause

To view full details, sign in with your My Oracle Support account.
