Cloudera Manager Randomly Generates Alerts Even to the Point of Bringing Hue and Other Services into "Bad" Health (Doc ID 2241305.1)

Last updated on MARCH 08, 2017

Applies to:

Big Data Appliance Integrated Software - Version 4.5.0 and later
Linux x86-64

Symptoms

Cloudera Manager (CM) randomly generates multiple alerts. It can get to the point that Hue and other services get into "Bad" health.

1. Alerts can look like:

Role: Hue Server (bdanode04)
HUE_SERVER_WEB_METRIC_COLLECTION Role health test bad Critical The health test result for HUE_SERVER_WEB_METRIC_COLLECTION has become bad: The Cloudera Manager Agent is not able to communicate with this role's web server.

Role: JournalNode (bdanode03)
JOURNAL_NODE_WEB_METRIC_COLLECTION Role health test bad Critical The health test result for JOURNAL_NODE_WEB_METRIC_COLLECTION has become bad: The Cloudera Manager Agent got an unexpected response from this role's web server.

Cluster: <cluster_name>
HUE_HUE_SERVERS_HEALTHY Service health test bad Critical The health test result for HUE_HUE_SERVERS_HEALTHY has become bad: Healthy Hue Server: 0. Concerning Hue Server: 0. Total Hue Server: 1. Percent healthy: 0.00%. Percent healthy or concerning: 0.00%. Critical threshold: 51.00%.

Cluster: <cluster_name>
HDFS_DATA_NODES_HEALTHY Service health test bad Critical The health test result for HDFS_DATA_NODES_HEALTHY has become bad: Healthy DataNode: 0. Concerning DataNode: 0. Total DataNode: 18. Percent healthy: 0.00%. Percent healthy or concerning: 0.00%. Critical threshold: 90.00%.

Cluster: <cluster_name>
ZOOKEEPER_CANARY_HEALTH Service health test bad Critical The health test result for ZOOKEEPER_CANARY_HEALTH has become bad: Canary test failed to establish a connection or a client session to the ZooKeeper service.

Role: JournalNode <cluster_name>
ZOOKEEPER_SERVER_QUORUM_MEMBERSHIP Role health test bad Critical The health test result for ZOOKEEPER_SERVER_QUORUM_MEMBERSHIP has become bad: Quorum membership status could not be detected for the last 3 minute(s). At the last connection attempt the ZooKeeper server was in leader election.

 

2. Opening the Hue Web UI can bring up a blank page and occasionally the Hue Server (Home > Hue  > Instances > Hue Server) shows bad status, with the following error:

Web Server Status ( <cluster_name> hue Hue Server bdanode04.domain.com ) February 23, 2017, 3:31 PM EST
Test of whether this role's web server is responding to requests for metrics.
Bad : The Cloudera Manager Agent is not able to communicate with this role's web server.

3. Checking the MySQL master status intermittently fails with: 

 

Cause

Sign In with your My Oracle Support account

Don't have a My Oracle Support account? Click to get started

My Oracle Support provides customers with access to over a
Million Knowledge Articles and hundreds of Community platforms