
Zookeeper Will Not Start on One BDA Node Due to Error Reading Snapshot Files (Doc ID 1923949.1)

Last updated on NOVEMBER 07, 2019

Applies to:

Big Data Appliance Integrated Software - Version 2.4.0 and later
Linux x86-64

Symptoms

NOTE: In the examples that follow, user details, cluster names, hostnames, directory paths, filenames, etc. represent a fictitious sample (and are used to provide an illustrative example only). Any similarity to actual persons, or entities, living or dead, is purely coincidental and not intended in any manner.

 

One of the ZooKeeper servers is reported as down in Cloudera Manager and will not start. The log file on the affected BDA node, located at:

/var/log/zookeeper/zookeeper-cmf-zookeeper-SERVER-bdanode0x.example.com.log

shows:

INFO org.apache.zookeeper.server.quorum.QuorumPeer: initLimit set to 10
INFO org.apache.zookeeper.server.persistence.FileSnap: Reading snapshot
     /var/lib/zookeeper/version-2/snapshot.<ID>
ERROR org.apache.zookeeper.server.persistence.FileTxnSnapLog:
     Parent /cloudera_manager_zookeeper_canary missing for /cloudera_manager_zookeeper_canary/zookeeper_ZS_bdanode0x
ERROR org.apache.zookeeper.server.quorum.QuorumPeer: Unable to load database on disk
java.io.IOException: Failed to process transaction type: 1 error: KeeperErrorCode = NoNode for /cloudera_manager_zookeeper_canary
        at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:188)
        at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
        at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:417)
        at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:409)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:156)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:79)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /cloudera_manager_zookeeper_canary
        at org.apache.zookeeper.server.persistence.FileTxnSnapLog.processTransaction(FileTxnSnapLog.java:250)
...
ERROR org.apache.zookeeper.server.quorum.QuorumPeerMain: Unexpected exception, exiting abnormally
java.lang.RuntimeException: Unable to run quorum server
        at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:454)
        at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:409)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:156)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:79)
Caused by: java.io.IOException: Failed to process transaction type: 1 error:
       KeeperErrorCode = NoNode for /cloudera_manager_zookeeper_canary
        at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:188)
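
To confirm that a node's log matches this symptom, the two signature messages in the excerpt above can be searched for programmatically. The following is a minimal sketch only, not an Oracle-supplied tool: the function name find_snapshot_load_failure and the regular expression are assumptions made for illustration, and the message strings are copied from the excerpt.

```python
import re
import sys

# Message texts copied from the log excerpt above.
LOAD_FAILURE = "Unable to load database on disk"
# The "Parent ... missing for ..." text may follow a line break after
# "FileTxnSnapLog:", so the pattern matches only the message itself.
MISSING_PARENT = re.compile(r"Parent (\S+) missing for (\S+)")


def find_snapshot_load_failure(log_text):
    """Return (parent_znode, child_znode) if the log shows this symptom.

    Hypothetical helper: returns None unless both the database load
    failure and a missing-parent message are present in the text.
    """
    if LOAD_FAILURE not in log_text:
        return None
    match = MISSING_PARENT.search(log_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: pass the path of the ZooKeeper server log as the only argument.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        result = find_snapshot_load_failure(handle.read())
    if result is None:
        print("Signature not found in log")
    else:
        print("Missing parent znode: %s (needed by %s)" % result)
```

The script only reads the log file; it does not touch the snapshot or transaction log files under /var/lib/zookeeper.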
 

 

Cause




