DataNode Health turns BAD after Reboot of a Node in CDH Cluster on Oracle Big Data Appliance V2.5 and higher with On-Disk Encryption Enabled (Doc ID 1641871.1)

Last updated on OCTOBER 11, 2016

Applies to:

Big Data Appliance Integrated Software - Version 2.5.0 and later
Linux x86-64

Symptoms

On an Oracle Big Data Appliance (BDA) V2.5.0 CDH cluster, one or more DataNodes (DN) are in BAD health. Attempts to restart the DN also fail with errors.

From /var/log/hadoop-hdfs/hadoop-cmf-hdfs-DATANODE-<BDANode>.log.out:

<Timestamp> INFO org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage /u03/hadoop/dfs. The directory is already locked
<Timestamp> WARN org.apache.hadoop.hdfs.server.common.Storage: Ignoring storage directory /u03/hadoop/dfs due to an exception
java.io.IOException: Cannot lock storage /u03/hadoop/dfs. The directory is already locked
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:634)
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:457)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:152)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:219)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:916)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:887)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:311)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:218)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)
        at java.lang.Thread.run(Thread.java:724)
<Timestamp> FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool <registering> (storage id unknown) service to <BDANode>/<NodeIP>:8022
java.io.IOException: All specified directories are not accessible or do not exist.
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:183)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:219)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:916)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:887)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:311)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:218)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)
        at java.lang.Thread.run(Thread.java:724)
<Timestamp> FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool <registering> (storage id unknown) service to <BDANode>/<NodeIP>:8022
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /u12/hadoop/dfs is in an inconsistent state: file VERSION has layoutVersion missing.
        at org.apache.hadoop.hdfs.server.common.Storage.getProperty(Storage.java:1035)
        at org.apache.hadoop.hdfs.server.common.Storage.setLayoutVersion(Storage.java:1075)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.setFieldsFromProperties(DataStorage.java:310)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.setFieldsFromProperties(DataStorage.java:302)
        at org.apache.hadoop.hdfs.server.common.Storage.readProperties(Storage.java:916)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:388)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:191)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:219)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:916)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:887)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:311)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:218)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)
        at java.lang.Thread.run(Thread.java:724)
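
To check whether a DataNode log shows the same signatures as the excerpt above, the log can be scanned for the three messages. The following is a minimal sketch, not an Oracle-provided tool: the function name scan_datanode_log and the result keys are hypothetical, and the patterns are taken only from the messages quoted in this note, so other Hadoop versions may word them differently.

```python
import re

# Message patterns copied from the log excerpt in this note (assumed stable
# for this Hadoop version only).
LOCKED = re.compile(r"Cannot lock storage (\S+)\. The directory is already locked")
INCONSISTENT = re.compile(r"Directory (\S+) is in an inconsistent state")
ALL_INACCESSIBLE = "All specified directories are not accessible or do not exist"


def scan_datanode_log(lines):
    """Collect the storage directories named in the known error messages.

    `lines` is any iterable of log lines (for example, an open file object).
    """
    locked = set()
    inconsistent = set()
    all_inaccessible = False
    for line in lines:
        match = LOCKED.search(line)
        if match:
            locked.add(match.group(1))
        match = INCONSISTENT.search(line)
        if match:
            inconsistent.add(match.group(1))
        if ALL_INACCESSIBLE in line:
            all_inaccessible = True
    return {
        "locked_dirs": sorted(locked),
        "inconsistent_dirs": sorted(inconsistent),
        "all_dirs_inaccessible": all_inaccessible,
    }
```

Passing an open handle on the DataNode log file to scan_datanode_log returns the directories reported as locked or inconsistent, which identifies the data mounts to inspect on the affected node.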



Changes

The BDA node hosting the affected DataNode was rebooted.

Cause
