Cloudera Manager reports "Bad" cluster health after Regenerating Kerberos Credentials - Canary test fails to create parent directory on BDA
(Doc ID 1671101.1)
Last updated on OCTOBER 18, 2019
Applies to:
Big Data Appliance Integrated Software - Version 2.5.0 and later
Linux x86-64
Symptoms
Cloudera Manager shows "Bad" health for the cluster after Kerberos credentials are regenerated in Cloudera Manager using the steps in: Steps to Regenerate Kerberos Credentials thru Cloudera Manager (Doc ID 1614191.1). The Alert Summary in the Cloudera Manager (CM) logging shows the following full message:
The health test result for HDFS_CANARY_HEALTH has become bad: Canary test failed to create parent directory for /tmp/.cloudera_health_monitoring_canary_files.
Service: hdfs Category: HEALTH_CHECK
ALERT_SUMMARY The health of service hdfs has become bad.
BAD_TEST_RESULTS 1
CATEGORY HEALTH_CHECK
CLUSTER <CLUSTER_NAME>
CLUSTER_ID 1
CURRENT_COMPLETE_HEALTH_TEST_RESULTS {"content":"The health test result for HDFS_CANARY_HEALTH has become bad: Canary test failed to create parent directory for /tmp/.cloudera_health_monitoring_canary_files.","<TEST_NAME>":"HDFS_CANARY_HEALTH", "messageCodes":["HEALTH_TEST_BAD","HEALTH_TEST_HDFS_CANARY_FAIL_MKDIR_RESULT"], "eventCode":"EV_SERVICE_HEALTH_CHECK_BAD","severity":"CRITICAL"},{"content": "The health test result for HDFS_HA_NAMENODE_HEALTH has become good: The active NameNode is running with good health.", "<TEST_NAME>":"HDFS_HA_NAMENODE_HEALTH","messageCodes":
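To confirm that a captured alert is this particular canary failure, the message codes can be pulled out of the alert text. The following is a minimal sketch in Python, not part of Cloudera Manager: the function name is hypothetical, and a regular expression is used instead of a JSON parser on the assumption that the copied alert text may be truncated, as it is above.

```python
import re

# Message code copied from the alert text above.
CANARY_MKDIR_CODE = "HEALTH_TEST_HDFS_CANARY_FAIL_MKDIR_RESULT"


def has_canary_mkdir_failure(alert_text: str) -> bool:
    """Return True if the alert text carries the canary mkdir failure code.

    Hypothetical helper: it scans for quoted HEALTH_TEST_* tokens rather than
    parsing JSON, so it also works on a truncated copy of the alert.
    """
    codes = re.findall(r'"(HEALTH_TEST_[A-Z_]+)"', alert_text)
    return CANARY_MKDIR_CODE in codes
```

Passing the CURRENT_COMPLETE_HEALTH_TEST_RESULTS value as a string is enough; no connection to Cloudera Manager is needed.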
Additionally:
1. The hdfs service reports:
/tmp/.cloudera_health_monitoring_canary_files.
2. Logs under /var/log/hadoop-hdfs show additional error messages such as:
ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:<user>@EXAMPLE.COM (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: Permission denied by sticky bit setting: user=<user>, inode="/tmp/.cloudera_health_monitoring_canary_files":hdfs:supergroup:drwxrwxrwx
ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:<user>@EXAMPLE.COM (auth:KERBEROS) cause:org.apache.hadoop.ipc.StandbyException: Operation category READ is not supported in state standby
3. If you manually delete /tmp/.cloudera_health_monitoring_canary_files, you cannot recreate it.
4. Changing the permissions, the sticky bit, etc. does not resolve the problem.
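To check whether a log shows the same error signatures as in symptom 2, the lines can be scanned for the two exception messages. This is a minimal sketch in Python under stated assumptions: the function name and the labels are illustrative only, and the patterns are taken directly from the log excerpts above.

```python
import re
from collections import Counter

# Patterns copied from the log excerpts above; the dictionary keys are
# illustrative labels, not Hadoop or Cloudera Manager names.
SIGNATURES = {
    "sticky_bit_denied": re.compile(
        r"AccessControlException: Permission denied by sticky bit setting"
    ),
    "standby_read": re.compile(
        r"StandbyException: Operation category READ is not supported in state standby"
    ),
}


def count_signatures(lines):
    """Count how many lines match each known error signature."""
    counts = Counter()
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] += 1
    return counts
```

The function accepts any iterable of strings, so an open file object for a log under /var/log/hadoop-hdfs can be passed directly.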
Changes
The problem occurs after regenerating Kerberos credentials in Cloudera Manager.
Cause