
DataNode Health Fails to Start with 'Block pool ID needed' and 'HDFS Partitions already locked' Errors on a Cluster without On-Disk Encryption Enabled (Doc ID 1671000.1)

Last updated on NOVEMBER 08, 2022

Applies to:

Big Data Appliance Integrated Software - Version 2.5.0 and later
Linux x86-64

Symptoms

On an Oracle Big Data Appliance (BDA) V2.5.0/V3.0 CDH cluster, one or more DataNodes (DN) are in BAD health.

Attempts to restart the DN also fail with errors.

The DataNode log, /var/log/hadoop-hdfs/hadoop-cmf-hdfs-DATANODE-<BDANode>.log.out, shows errors such as:

<Timestamp> INFO org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage /u0*/hadoop/dfs. The directory is already locked
<Timestamp> WARN org.apache.hadoop.hdfs.server.common.Storage: Ignoring storage directory /u0*/hadoop/dfs due to an exception
java.io.IOException: Cannot lock storage /u03/hadoop/dfs. The directory is already locked
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:634)
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:457)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:152)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:219)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:916)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:887)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:311)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:218)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)
        at java.lang.Thread.run(Thread.java:724)
<Timestamp> FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool <registering> (storage id unknown) service to <BDANodex>/<NodeIP>:8022
java.io.IOException: All specified directories are not accessible or do not exist.
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:183)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:219)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:916)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:887)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:311)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:218)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)
        at java.lang.Thread.run(Thread.java:724)
........
org.apache.hadoop.hdfs.server.datanode.DataNode  Block pool ID needed, but service not yet registered with NN
java.lang.Exception: trace
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.getBlockPoolId(BPOfferService.java:143)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdownBlockPool(DataNode.java:856)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.shutdownActor(BPOfferService.java:350)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.cleanUp(BPServiceActor.java:617)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:835)
    at java.lang.Thread.run(Thread.java:724)
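
To confirm that a DataNode log shows the same symptoms, the three error messages quoted above can be counted with grep. The following is a minimal sketch, not an Oracle-provided tool: the function name scan_dn_log is hypothetical, and the log file path is passed in by the caller because the <BDANode> part of the file name differs per node.

```shell
# Hypothetical helper (not part of BDA or CDH): count how many lines of a
# DataNode log contain each of the three error messages quoted above.
scan_dn_log() {
    log_file="$1"
    for pattern in \
        "The directory is already locked" \
        "All specified directories are not accessible or do not exist" \
        "Block pool ID needed, but service not yet registered with NN"
    do
        # -F matches the pattern as a literal string, -c prints the count of
        # matching lines. grep exits non-zero on zero matches, hence "|| true".
        count=$(grep -cF -- "$pattern" "$log_file" || true)
        printf '%s\t%s\n' "$count" "$pattern"
    done
}
```

It could be called as scan_dn_log followed by the path of the DataNode log on the affected node. A non-zero count for all three messages suggests the log matches the symptoms described here.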


On-Disk Encryption is NOT enabled on the cluster.

# bdacli getinfo cluster_disk_encryption_enabled
false

Although On-Disk Encryption is not enabled, the output of the 'mount -l' command shows that the HDFS partitions are mounted as type ecryptfs:

 /u<nn>/hadoop on /u<nn>/hadoop type ecryptfs (rw,ecryptfs_sig=53be2d8148c3f5d4,ecryptfs_unlink_sigs,ecryptfs_cipher=aes,ecryptfs_key_bytes=16)
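
The mismatch described above (encryption reported as disabled while ecryptfs mounts are present) can be checked in one step. The following is a sketch under stated assumptions: the function names list_ecryptfs_mounts and report_mismatch are hypothetical, the mount listing is read from standard input in the 'mount -l' format shown above, and the flag argument is assumed to be the literal value printed by bdacli.

```shell
# Hypothetical helpers (not part of BDA): read "mount -l" output on stdin.

# Print the mount point of every entry whose file system type is ecryptfs.
# Expected line format: <source> on <mountpoint> type <fstype> (<options>)
list_ecryptfs_mounts() {
    awk '$2 == "on" && $4 == "type" && $5 == "ecryptfs" { print $3 }'
}

# report_mismatch FLAG
# FLAG is assumed to be the value printed by
# "bdacli getinfo cluster_disk_encryption_enabled" (for example: false).
# Returns 1 when encryption is reported disabled but ecryptfs mounts exist.
report_mismatch() {
    flag="$1"
    mounts=$(list_ecryptfs_mounts)
    if [ "$flag" = "false" ] && [ -n "$mounts" ]; then
        echo "MISMATCH: encryption reported disabled, but ecryptfs mounts exist:"
        echo "$mounts"
        return 1
    fi
    echo "OK: no mismatch detected"
}
```

One possible invocation is: mount -l | report_mismatch "$(bdacli getinfo cluster_disk_encryption_enabled)". The check only reports the condition; it does not change any mounts.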

Attempts to check the version details of the HDFS data also fail with errors:

 
