BDA Host in "Bad" Health with: "This host has been out of contact with Cloudera Manager for too long" in Cloudera Manager
(Doc ID 2097907.1)
Last updated on JULY 05, 2021
Applies to:
Big Data Appliance Integrated Software - Version 4.3.0 and later
Linux x86-64
Symptoms
Cloudera Manager (CM) shows a host in "Bad" health.
Additional symptoms:
1. Cloudera Manager "Health Tests" shows the BDA host in "Bad" health with:
"This host has been out of contact with Cloudera Manager for too long"
2. Cloudera Manager "Health History" for the host reports:
"Agent Status Bad"
3. The Cloudera Manager agent log on the affected host, /var/log/cloudera-scm-agent.log, reports errors like:
[18/Jan/2016 00:47:01 +0000] 14503 MonitorDaemon-Reporter proc_metrics_utils ERROR Failed to read file descriptor max for process <PID>: [Errno 2] No such file or directory: '/proc/<PID>/limits'
[18/Jan/2016 00:47:01 +0000] 14503 MonitorDaemon-Reporter proc_metrics_utils ERROR Failed to get file descriptor count for process <PID>: [Errno 2] No such file or directory: '/proc/<PID>/fd/'
[18/Jan/2016 00:47:01 +0000] 14503 MonitorDaemon-Reporter proc_metrics_utils ERROR Failed to get process metrics <PID> no process found with pid <PID>
...
[18/Jan/2016 12:37:01 +0000] 14503 Monitor-MasterMonitor throttling_logger ERROR (59 skipped) Error fetching metrics at 'http://<HOSTNAMEx>.<DOMAIN>:60010/jmx'
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/src/cmf/monitor/abstract_monitor.py", line 385, in collect_metrics_from_url
    openedUrl = self.urlopen(url, username=username, password=password)
  File "/usr/lib64/cmf/agent/src/cmf/monitor/abstract_monitor.py", line 349, in urlopen
    password=password)
  File "/usr/lib64/cmf/agent/src/cmf/url_util.py", line 62, in urlopen_with_timeout
    return opener.open(url, data, timeout)
  File "/usr/lib64/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib64/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.6/urllib2.py", line 1190, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib64/python2.6/urllib2.py", line 1165, in do_open
    raise URLError(err)
URLError: <urlopen error [Errno 111] Connection refused>
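To check whether an agent log contains the error signatures shown above, a short script can count matching lines. The following is a minimal sketch in Python 3, not an Oracle or Cloudera tool. The pattern strings are copied from the log excerpt, while the names SYMPTOM_PATTERNS, count_symptoms, and scan_log are hypothetical and chosen only for illustration.

```python
import re

# Substrings taken from the log excerpt in this document.
# The dictionary keys are illustrative labels, not Cloudera terminology.
SYMPTOM_PATTERNS = {
    "missing_proc_entry": re.compile(r"\[Errno 2\] No such file or directory: '/proc/"),
    "process_not_found": re.compile(r"no process found with pid"),
    "metrics_fetch_error": re.compile(r"Error fetching metrics at '"),
    "connection_refused": re.compile(r"\[Errno 111\] Connection refused"),
}


def count_symptoms(lines):
    """Return a dict mapping each label to the number of matching lines."""
    counts = {name: 0 for name in SYMPTOM_PATTERNS}
    for line in lines:
        for name, pattern in SYMPTOM_PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def scan_log(path):
    """Count symptom lines in the log file at the given path."""
    # errors="replace" keeps the scan going if the log holds undecodable bytes.
    with open(path, errors="replace") as handle:
        return count_symptoms(handle)
```

For example, scan_log("/var/log/cloudera-scm-agent.log") uses the log location given in symptom 3. Nonzero counts only indicate that the log resembles the excerpt above; they do not by themselves identify the cause.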
Cause