On Oracle Big Data Appliance Active NameNode Abruptly Goes Down

(Doc ID 2063191.1)

Last updated on October 11, 2016

Applies to:

Big Data Appliance Integrated Software - Version 4.0 and later
Linux x86-64

Symptoms

On Oracle Big Data Appliance, the Active NameNode (NN) on node01 stops abruptly and HDFS automatically fails over to the NameNode on node02.

The NameNode logs show the following error:

Error: flush failed for required journal (JournalAndStream(mgr=QJM to [192.***.8.1:8485, 192.***.8.2:8485, 192.***.8.3:8485], stream=QuorumOutputStream starting at txid 46228185))
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 1 successful responses:
192.***.8.3:8485: null [success]
2 exceptions thrown:
192.***.8.2:8485: Journal disabled until next roll
192.***.8.1:8485: End of File Exception between local host is: "192.****.8.1"; destination host is: "":8485; : java.io.EOFException; For more details see: http://wiki.apache.org/hadoop/EOFException
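
The "quorum size 2/3" in the message means that 2 of the 3 JournalNodes must acknowledge the write, and only 1 did. The sketch below illustrates that arithmetic only. It is not Hadoop's Quorum Journal Manager code: the function names `majority_quorum` and `flush_succeeds` are hypothetical, and the response map is transcribed by hand from the log excerpt above.

```python
def majority_quorum(journal_count: int) -> int:
    """Smallest number of acknowledgements that forms a strict majority."""
    return journal_count // 2 + 1


def flush_succeeds(responses: dict) -> bool:
    """Return True if enough JournalNodes acknowledged the write.

    `responses` maps a JournalNode address to True (success) or False (exception).
    """
    successes = sum(1 for ok in responses.values() if ok)
    return successes >= majority_quorum(len(responses))


# Outcomes as reported in the log excerpt (addresses are masked in the source).
responses = {
    "192.***.8.1:8485": False,  # EOFException
    "192.***.8.2:8485": False,  # Journal disabled until next roll
    "192.***.8.3:8485": True,   # null [success]
}

required = majority_quorum(len(responses))
print(f"required acknowledgements: {required} of {len(responses)}")
print(f"flush succeeds: {flush_succeeds(responses)}")
```

With one success against a required two, the check fails, which matches the "Got too many exceptions to achieve quorum size 2/3" line in the log.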

Cause
