After Reboot NameNodes Fail to Come out of Safe Mode and the Corresponding DataNode Logs Report "NameNode: Registration IDs mismatched:" (Doc ID 2258646.1)

Last updated on APRIL 29, 2017

Applies to:

Big Data Appliance Integrated Software - Version 4.7.0 and later
Linux x86-64

Symptoms

After a reboot, the NameNodes fail to come out of Safe Mode. This can happen after any reboot, including the reboot step that takes place during a BDA upgrade.

1. The NameNode logs (hadoop-cmf-hdfs-NAMENODE-host.domain.com-log.out) show errors like:

a) Cannot create the canary file:

ModeException: Cannot create file/tmp/.cloudera_health_monitoring_canary_files/.canary_file_2017_04_18-20_11_18. Name node is in safe mode.
The reported blocks 0 needs additional 19319 blocks to reach the threshold 0.9990 of total blocks 19338.
The number of live datanodes 0 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.

The number of live datanodes 0 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1450)

b) Other Safe Mode related messaging shows output like:

2017-04-18 12:17:38,168 INFO org.apache.hadoop.hdfs.StateChange: STATE* Safe mode ON.
The reported blocks 0 needs additional 19319 blocks to reach the threshold 0.9990 of total blocks 19338.
The number of live datanodes 0 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.

2017-04-18 12:17:42,264 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started for standby state
2017-04-18 12:17:42,265 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer interrupted
java.lang.InterruptedException: sleep interrupted
at java.lang.Thread.sleep(Native Method)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:347)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:284)
at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:301)
at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:444)
...
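
To check quickly whether a NameNode log shows this symptom, the Safe Mode status message can be parsed for its block and DataNode counts. The following is a minimal sketch, not an Oracle or Cloudera tool: the function name parse_safe_mode_status and the returned field names are hypothetical, and the regular expressions assume the message wording matches the excerpts above exactly.

```python
import re

# Patterns assume the wording shown in the log excerpts above.
BLOCKS_RE = re.compile(
    r"The reported blocks (\d+) needs additional (\d+) blocks "
    r"to reach the threshold ([\d.]+) of total blocks (\d+)\."
)
LIVE_RE = re.compile(
    r"The number of live datanodes (\d+) has reached the minimum number (\d+)"
)


def parse_safe_mode_status(text):
    """Return the counts from a Safe Mode message, or None if none is found."""
    blocks = BLOCKS_RE.search(text)
    if blocks is None:
        return None
    reported, additional, threshold, total = blocks.groups()
    status = {
        "reported_blocks": int(reported),
        "additional_blocks_needed": int(additional),
        "threshold": float(threshold),
        "total_blocks": int(total),
        "live_datanodes": None,
    }
    live = LIVE_RE.search(text)
    if live is not None:
        status["live_datanodes"] = int(live.group(1))
    return status


if __name__ == "__main__":
    sample = (
        "2017-04-18 12:17:38,168 INFO org.apache.hadoop.hdfs.StateChange: "
        "STATE* Safe mode ON.\n"
        "The reported blocks 0 needs additional 19319 blocks to reach the "
        "threshold 0.9990 of total blocks 19338.\n"
        "The number of live datanodes 0 has reached the minimum number 0."
    )
    status = parse_safe_mode_status(sample)
    # Zero live DataNodes and zero reported blocks match the symptom above.
    print(status)
```

In the excerpts above both reported_blocks and live_datanodes are 0, which means no DataNode has registered and reported its blocks to the NameNode.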

2. The corresponding DataNode log shows output like:

...
2017-04-18 12:23:42,275 WARN org.apache.hadoop.hdfs.server.namenode.NameNode: Registration IDs mismatched: the DatanodeRegistration ID is NS-1678684207-cluster1-1482490739387 but the expected ID is NS-1678684207-cluster1-1492511019085
2017-04-18 12:23:44,439 WARN org.apache.hadoop.hdfs.server.namenode.NameNode: Registration IDs mismatched: the DatanodeRegistration ID is NS-1678684207-cluster1-1482490739387 but the expected ID is NS-1678684207-cluster1-1492511019085
2017-04-18 12:23:45,945 WARN org.apache.hadoop.hdfs.server.namenode.NameNode: Registration IDs mismatched: the DatanodeRegistration ID is NS-1678684207-cluster1-1482490739387 but the expected ID is NS-1678684207-cluster1-1492511019085
...
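
To see which part of the registration ID differs, the two IDs in a "Registration IDs mismatched" line can be split and compared field by field. The following is a minimal sketch under stated assumptions: the function and field names are hypothetical, and the ID layout NS-<namespace ID>-<cluster ID>-<trailing number> is inferred from the sample lines above rather than taken from product documentation.

```python
import re
from typing import List, NamedTuple, Optional

# Pattern assumes the warning wording shown in the log excerpt above.
MISMATCH_RE = re.compile(
    r"Registration IDs mismatched: the DatanodeRegistration ID is (\S+) "
    r"but the expected ID is (\S+)"
)
# Assumed layout: NS-<namespace ID>-<cluster ID>-<trailing number>.
# The cluster ID group is greedy so that it may itself contain hyphens.
ID_RE = re.compile(r"^NS-(\d+)-(.+)-(\d+)$")


class RegistrationId(NamedTuple):
    namespace_id: str
    cluster_id: str
    trailing_number: str


def parse_registration_id(raw: str) -> Optional[RegistrationId]:
    """Split a registration ID into its assumed fields, or return None."""
    match = ID_RE.match(raw)
    if match is None:
        return None
    return RegistrationId(*match.groups())


def differing_fields(line: str) -> Optional[List[str]]:
    """Return the names of the fields that differ, or None if not parseable."""
    match = MISMATCH_RE.search(line)
    if match is None:
        return None
    actual = parse_registration_id(match.group(1))
    expected = parse_registration_id(match.group(2))
    if actual is None or expected is None:
        return None
    return [
        name
        for name in RegistrationId._fields
        if getattr(actual, name) != getattr(expected, name)
    ]


if __name__ == "__main__":
    line = (
        "2017-04-18 12:23:42,275 WARN "
        "org.apache.hadoop.hdfs.server.namenode.NameNode: "
        "Registration IDs mismatched: the DatanodeRegistration ID is "
        "NS-1678684207-cluster1-1482490739387 but the expected ID is "
        "NS-1678684207-cluster1-1492511019085"
    )
    print(differing_fields(line))
```

For the sample lines above, the namespace ID and cluster name are identical in both IDs, and only the trailing number differs.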

 

Cause
