Node(s) Lose Connection with CM with Agents Reporting: "This host is in Contact with Cloudera Manager Server. This host is not in contact with Host Monitor" (Doc ID 2363310.1)

Last updated on OCTOBER 09, 2018

Applies to:

Big Data Appliance Integrated Software - Version 4.0 and later
Linux x86-64

Symptoms

The reported symptoms are:

1. One or more nodes lose connection with Cloudera Manager (CM).

2. The hosts that lose connection with CM are repeatedly reported as alternating between 'good' and 'bad' health.

3. For the host(s) in this state, CM reports the Agent Status as:

This host is in Contact with Cloudera Manager Server. This host is not in contact with Host Monitor

4. The associated cloudera-scm-agent log, /var/log/cloudera-scm-agent/cloudera-scm-agent.log, shows:

a) That the CM Agent is not able to get the details of the process directory:

[<timestamp>] xxxx Monitor-HostMonitor throttling_logger ERROR (9 skipped) Could not find local file system for /var/run/cloudera-scm-agent/process

b) That the CM Agent is not able to fetch metrics:

[<timestamp>] xxxx Monitor-GenericMonitor throttling_logger ERROR (10 skipped) Error fetching metrics at 'http://bdanode0x.example.com:50070/jmx'
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.9.0-py2.6.egg/cmf/monitor/generic/metric_collectors.py", line 200, in _collect_and_parse_and_return
self._adapter.safety_valve))

5. Restarting the cloudera-scm-agent on the host does not resolve the error.
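
The two log signatures in symptom 4 can be checked for on a suspect host with a short script. The following is a minimal sketch, not an Oracle or Cloudera tool. It assumes Python 3 is available on the node. The label names and the functions count_signatures and scan_agent_log are hypothetical, introduced only for illustration. The match strings come from the log excerpts above.

```python
import re
from collections import Counter

# Patterns copied from the log excerpts in symptom 4.
# The dictionary keys are illustrative labels, not Cloudera terminology.
SIGNATURES = {
    "process_dir_fs_not_found": re.compile(
        r"Could not find local file system for \S+"
    ),
    "metrics_fetch_failed": re.compile(
        r"Error fetching metrics at '[^']+'"
    ),
}


def count_signatures(lines):
    """Count how many lines match each known error signature."""
    counts = Counter()
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


def scan_agent_log(path="/var/log/cloudera-scm-agent/cloudera-scm-agent.log"):
    """Scan the agent log at the path named in symptom 4."""
    # errors="replace" keeps the scan going if the log has undecodable bytes.
    with open(path, errors="replace") as handle:
        return count_signatures(handle)
```

Calling scan_agent_log() on the affected host returns a Counter keyed by the labels above. Because the agent throttles these messages (note the "skipped" counts in the excerpts), the totals are a lower bound on how often each error occurred.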

Cause

