After Reboot NameNodes Fail to Come out of Safe Mode and the Corresponding DataNode Logs Report "NameNode: Registration IDs mismatched:" (Doc ID 2258646.1)

Last updated on JANUARY 17, 2020

Applies to:

Big Data Appliance Integrated Software - Version 4.7.0 and later
Linux x86-64

Symptoms

NOTE: In the examples that follow, user details, cluster names, hostnames, directory paths, filenames, etc. represent a fictitious sample (and are used to provide an illustrative example only). Any similarity to actual persons, or entities, living or dead, is purely coincidental and not intended in any manner.

After a reboot, the NameNodes fail to come out of Safe Mode. This can happen after any reboot, including the reboot step that takes place during a BDA upgrade.

1. The NameNode logs (hadoop-cmf-hdfs-NAMENODE-<HOSTNAME>.<DOMAIN>-log.out) show errors like:

a) Cannot create the canary file:

ModeException: Cannot create file/tmp/.cloudera_health_monitoring_canary_files/.canary_file_2017_04_18-20_11_18. Name node is in safe mode.
The reported blocks 0 needs additional 19319 blocks to reach the threshold 0.9990 of total blocks 19338.
The number of live datanodes 0 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1450)

b) Other Safe Mode messages look like:

2017-04-18 12:17:38,168 INFO org.apache.hadoop.hdfs.StateChange: STATE* Safe mode ON.
The reported blocks 0 needs additional 19319 blocks to reach the threshold 0.9990 of total blocks 19338.
The number of live datanodes 0 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.

2017-04-18 12:17:42,264 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started for standby state
2017-04-18 12:17:42,265 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer interrupted
java.lang.InterruptedException: sleep interrupted
at java.lang.Thread.sleep(Native Method)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:347)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:284)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:301)
at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:444)
...

2. The corresponding DataNode log shows output like:

...
2017-04-18 12:23:42,275 WARN org.apache.hadoop.hdfs.server.namenode.NameNode: Registration IDs mismatched: the DatanodeRegistration ID is NS-1678684207-<CLUSTER_NAME>-1482490739387 but the expected ID is NS-1678684207-<CLUSTER_NAME>-1492511019085
2017-04-18 12:23:44,439 WARN org.apache.hadoop.hdfs.server.namenode.NameNode: Registration IDs mismatched: the DatanodeRegistration ID is NS-1678684207-<CLUSTER_NAME>-1482490739387 but the expected ID is NS-1678684207-<CLUSTER_NAME>-1492511019085
2017-04-18 12:23:45,945 WARN org.apache.hadoop.hdfs.server.namenode.NameNode: Registration IDs mismatched: the DatanodeRegistration ID is NS-1678684207-<CLUSTER_NAME>-1482490739387 but the expected ID is NS-1678684207-<CLUSTER_NAME>-1492511019085
...
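
To compare the two IDs in a mismatch warning field by field, a short script can help. The following is a minimal sketch, not an Oracle or Cloudera tool. It assumes the ID has the layout NS-<namespace ID>-<cluster ID>-<number> and that the trailing number is a timestamp in epoch milliseconds; both the layout and the helper names (split_registration_id, describe_mismatch, millis_to_utc) are assumptions made for illustration.

```python
import re
from datetime import datetime, timezone

# Sample line copied from the DataNode log excerpt in this note.
LINE = (
    "2017-04-18 12:23:42,275 WARN org.apache.hadoop.hdfs.server.namenode.NameNode: "
    "Registration IDs mismatched: the DatanodeRegistration ID is "
    "NS-1678684207-<CLUSTER_NAME>-1482490739387 but the expected ID is "
    "NS-1678684207-<CLUSTER_NAME>-1492511019085"
)

MISMATCH = re.compile(
    r"DatanodeRegistration ID is (?P<actual>\S+) "
    r"but the expected ID is (?P<expected>\S+)"
)


def split_registration_id(reg_id):
    """Split 'NS-<namespace>-<cluster>-<number>' into parts (assumed layout)."""
    _prefix, namespace_id, rest = reg_id.split("-", 2)
    # The cluster ID may itself contain hyphens, so peel the last field off the right.
    cluster_id, millis = rest.rsplit("-", 1)
    return {"namespace_id": namespace_id, "cluster_id": cluster_id, "millis": int(millis)}


def describe_mismatch(line):
    """Return (actual, expected, names of differing fields), or None if no match."""
    match = MISMATCH.search(line)
    if match is None:
        return None
    actual = split_registration_id(match.group("actual"))
    expected = split_registration_id(match.group("expected"))
    differing = [key for key in actual if actual[key] != expected[key]]
    return actual, expected, differing


def millis_to_utc(millis):
    """Interpret the trailing field as epoch milliseconds (an assumption)."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


if __name__ == "__main__":
    actual, expected, differing = describe_mismatch(LINE)
    print("fields that differ:", differing)
    print("DataNode value:", millis_to_utc(actual["millis"]).isoformat())
    print("expected value:", millis_to_utc(expected["millis"]).isoformat())
```

For the sample line, only the trailing field differs. Read as epoch milliseconds, the DataNode's value falls on 2016-12-23 and the expected value on 2017-04-18 (UTC), the same day as the log entries.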

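The block counts in the Safe Mode message can be checked with simple arithmetic. The sketch below parses the message and recomputes the "needs additional" figure. It assumes that the block threshold is the total multiplied by the threshold fraction, truncated to an integer, and that the message reports one more than the remaining difference. That reading reproduces the numbers in the log excerpt but is not confirmed by this note. The function names are hypothetical.

```python
import re

# Message text copied from the NameNode log excerpt in this note.
SAFE_MODE_MSG = (
    "The reported blocks 0 needs additional 19319 blocks to reach "
    "the threshold 0.9990 of total blocks 19338."
)

PATTERN = re.compile(
    r"reported blocks (?P<reported>\d+) needs additional (?P<needed>\d+) blocks "
    r"to reach the threshold (?P<threshold>\d+\.\d+) of total blocks (?P<total>\d+)"
)


def parse_safe_mode_message(message):
    """Extract the four numbers from the Safe Mode message, or None if absent."""
    match = PATTERN.search(message)
    if match is None:
        return None
    return {
        "reported": int(match.group("reported")),
        "needed": int(match.group("needed")),
        "threshold": float(match.group("threshold")),
        "total": int(match.group("total")),
    }


def additional_blocks_needed(reported, total, threshold):
    """Recompute the 'needs additional' figure (assumed: truncate, then add one)."""
    block_threshold = int(total * threshold)  # 19338 * 0.999 truncates to 19318
    if reported >= block_threshold:
        return 0
    return block_threshold - reported + 1


if __name__ == "__main__":
    info = parse_safe_mode_message(SAFE_MODE_MSG)
    recomputed = additional_blocks_needed(
        info["reported"], info["total"], info["threshold"]
    )
    print("logged:", info["needed"], "recomputed:", recomputed)
```

With 0 reported blocks and 0 live DataNodes, the count cannot rise toward the threshold until DataNodes register and send block reports, which matches the registration warnings in the DataNode log.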
 

Cause
