A BDA DataNode Fails to Start - Can Not Bind to DataNode Port 1004 as it is Already in Use (Doc ID 2042065.1)

Last updated on SEPTEMBER 17, 2017

Applies to:

Big Data Appliance Integrated Software - Version 4.2.0 to 4.4.0 [Release 4.2 to 4.4]
Linux x86-64

Symptoms

A DataNode (DN) cannot start. This leaves HDFS in "bad" health.

/var/log/hadoop-hdfs/jsvc.err shows that the process cannot bind to DataNode port 1004 (0.0.0.0:1004) because the port is already in use. Note, however, that there is no other obvious logging on the system to indicate why the DN is failing to come up.

This can also happen with port 1006, which is also a DataNode port.

CM reports a DataNode failing to start, bringing HDFS into "bad" health.

1. Initially, no obvious logging is found that explains the problem.

a) No updates at all are being made to the DataNode log file at: /var/log/hadoop-hdfs/hadoop-cmf-hdfs-DATANODE-<FQDN>-log.out.

The latest entry in that log is from the time the DataNode initially went down; there is nothing from subsequent restarts. The initial error shows a lack of connectivity between the DataNode and the NameNode, for example:

java.io.EOFException: End of File Exception between local host is: "<FQDN-DN>/<IB IP>"; destination host is: "<FQDN-NN>":8022; : java.io.EOFException; For more details see: http://wiki.apache.org/hadoop/EOFException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy16.sendHeartbeat(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClientSideTranslatorPB.java:140)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:598)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:696)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:861)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1071)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:966)
2015-07-28 12:05:17,708 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: RECEIVED SIGNAL 15: SIGTERM
2015-07-28 12:05:17,710 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at <FQDN-DN>/<IB IP>
************************************************************/

b) The associated process logs under /var/run/cloudera-scm-agent/process/<latest>-hdfs-DATANODE are updated, but nothing in them sheds light on the inability to start up.

c) The Cloudera Manager (CM) agent logs at /var/log/cloudera-scm-agent do not indicate a problem either.

d) The CM Host Inspector does not show a problem.
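The EOFException in the DataNode log above indicates a dropped RPC connection to the NameNode on service port 8022. A generic way to confirm basic TCP reachability from the DataNode host is sketched below in plain Python; this is an illustration, not part of the BDA or CM tooling, and `can_reach` is a hypothetical helper name:

```python
import socket

def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Substitute the NameNode FQDN from the exception message; 8022 is the
# service RPC port shown in the stack trace above.
# can_reach("<FQDN-NN>", 8022)
```

A successful TCP connect only rules out basic network-level problems; an RPC-level EOFException can still occur if the NameNode closes the connection after accepting it.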


2. The output in /var/log/hadoop-hdfs/jsvc.err shows the DN cannot start because it cannot bind to port 1004, which is already in use:

Initializing secure datanode resources
Opened streaming server at /0.0.0.0:1004
java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:437)
    at sun.nio.ch.Net.bind(Net.java:429)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
    at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
    at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.getSecureResources(SecureDataNodeStarter.java:131)
    at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.init(SecureDataNodeStarter.java:73)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:207)
Cannot load daemon
Service exit with a return value of 3
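The BindException above is the JVM surfacing the kernel's EADDRINUSE error: some socket, visible or not, already holds 0.0.0.0:1004 at the moment the secure DataNode starter binds it. The mechanism can be reproduced with a minimal, generic Python sketch (not BDA-specific; `bind_listener` is a hypothetical helper):

```python
import errno
import socket

def bind_listener(port):
    """Bind a listening TCP socket on 0.0.0.0:port, as jsvc does for 1004."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("0.0.0.0", port))
    s.listen(1)
    return s

# Hold an arbitrary free port (0 lets the OS choose one), then bind it
# again: the second attempt fails the same way the DataNode start does.
holder = bind_listener(0)
port = holder.getsockname()[1]
try:
    bind_listener(port)
except OSError as e:
    print(e.errno == errno.EADDRINUSE)  # True
holder.close()
```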


3. After the failed start, port 1004 appears to be free:

a) "netstat -pan | grep 1004" returns nothing for port 1004.

 

The same can be the case with port 1006.
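Since netstat shows nothing, an independent cross-check is to test whether the port can actually be bound at that moment. The sketch below is a hypothetical helper in plain Python, not part of the BDA or CM tooling; note that binding 1004 itself requires root, since it is a privileged port:

```python
import socket

def port_bindable(port, host="0.0.0.0"):
    """Return True if a TCP listening socket can be bound on host:port now."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((host, port))
        s.listen(1)
        return True
    except OSError:
        return False
    finally:
        s.close()

# Run as root on the affected node:
# port_bindable(1004)
```

If the port is bindable here yet the DataNode still fails with BindException, the holder is only present around startup time, for example a racing process grabbing the port between the check and the DataNode start.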

Cause
