BDA Host in "Bad" Health with: "This host has been out of contact with Cloudera Manager for too long" in Cloudera Manager (Doc ID 2097907.1)

Last updated on JANUARY 19, 2016

Applies to:

Big Data Appliance Integrated Software - Version 4.3.0 and later
Linux x86-64

Symptoms

Cloudera Manager (CM) shows a host in "Bad" health.

Additional symptoms:

1. Cloudera Manager "Health Tests" shows the BDA host in "Bad" health with:

"This host has been out of contact with Cloudera Manager for too long"

2. Cloudera Manager "Health History" for the host reports:

"Agent Status Bad"

3. The agent log file on the affected host, /var/log/cloudera-scm-agent.log, reports errors like:

[18/Jan/2016 00:47:01 +0000] 14503 MonitorDaemon-Reporter proc_metrics_utils ERROR Failed to read file descriptor max for process <pid>: [Errno 2] No such file or directory: '/proc/<pid>/limits'
[18/Jan/2016 00:47:01 +0000] 14503 MonitorDaemon-Reporter proc_metrics_utils ERROR Failed to get file descriptor count for process <pid>: [Errno 2] No such file or directory: '/proc/<pid>/fd/'
[18/Jan/2016 00:47:01 +0000] 14503 MonitorDaemon-Reporter proc_metrics_utils ERROR Failed to get process metrics <pid> no process found with pid <pid>
...
[18/Jan/2016 12:37:01 +0000] 14503 Monitor-MasterMonitor throttling_logger ERROR (59 skipped) Error fetching metrics at 'http://bdanode0x.example.com:60010/jmx'
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/src/cmf/monitor/abstract_monitor.py", line 385, in collect_metrics_from_url
    openedUrl = self.urlopen(url, username=username, password=password)
  File "/usr/lib64/cmf/agent/src/cmf/monitor/abstract_monitor.py", line 349, in urlopen
    password=password)
  File "/usr/lib64/cmf/agent/src/cmf/url_util.py", line 62, in urlopen_with_timeout
    return opener.open(url, data, timeout)
  File "/usr/lib64/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib64/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.6/urllib2.py", line 1190, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib64/python2.6/urllib2.py", line 1165, in do_open
    raise URLError(err)
URLError: <urlopen error [Errno 111] Connection refused>
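
To confirm that a host shows the same pattern, the agent log can be scanned for the two kinds of ERROR entries in the excerpt above. The following is a minimal sketch, not an Oracle or Cloudera tool. It assumes the log line layout is exactly as in the excerpt (bracketed timestamp, pid, thread, module, level, message). The function names summarize_errors and refused_urls are hypothetical and were chosen only for this illustration.

```python
import re
from collections import Counter

# Layout assumed from the excerpt above:
# [timestamp] pid thread module LEVEL message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<pid>\d+)\s+(?P<thread>\S+)\s+"
    r"(?P<module>\S+)\s+(?P<level>[A-Z]+)\s+(?P<msg>.*)$"
)

# Message text taken from the throttling_logger entry in the excerpt.
FETCH_RE = re.compile(r"Error fetching metrics at '([^']+)'")


def summarize_errors(lines):
    """Count ERROR entries per (thread, module) pair.

    Lines that do not match the assumed layout (for example
    traceback lines) are ignored.
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("level") == "ERROR":
            counts[(match.group("thread"), match.group("module"))] += 1
    return counts


def refused_urls(lines):
    """Return the distinct metric URLs named in fetch errors, in first-seen order."""
    seen = []
    for line in lines:
        match = FETCH_RE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

Both functions accept any iterable of lines, so an open file object for /var/log/cloudera-scm-agent.log can be passed directly. A high count for the proc_metrics_utils module together with one or more URLs from refused_urls would match the symptoms listed here.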

 

Cause
