On BDA A Resource Manager Goes Down with "Exception while executing a ZK operation." (Doc ID 2542137.1)

Last updated on NOVEMBER 06, 2019

Applies to:

Big Data Appliance Integrated Software - Version 4.9.0 and later
Linux x86-64

Symptoms

On BDA, a Resource Manager goes down with errors like the following:

From the Resource Manager logs:

FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received a org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type STATE_STORE_OP_FAILED. Cause:org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
...
<TIMESTAMP> INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server <HOSTNAME>.<DOMAINNAME>/<PRIVATE_IP>:<PORT>, sessionid = <ID>, negotiated timeout = 60000

<TIMESTAMP> INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid <ID>, likely server has closed socket, closing socket connection and attempting reconnect
<TIMESTAMP> INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Exception while executing a ZK operation. org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss

From the ZooKeeper logs:

<TIMESTAMP> WARN org.apache.zookeeper.server.NIOServerCnxn: Exception causing close of session <ID> due to java.io.IOException: Len error <LARGE_SIZE>
<TIMESTAMP> INFO org.apache.zookeeper.server.NIOServerCnxn: Closed socket connection for client /<PRIVATE_IP>:<PORT> which had sessionid <ID>
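
To check whether a given set of logs shows the same symptoms, the message signatures quoted above can be searched for mechanically. The following is a minimal sketch in Python, not an Oracle-provided tool. The signature names and the classify and scan helpers are hypothetical names chosen for illustration, and the regular expressions are taken only from the log excerpts in this note, so they may need adjusting for other log formats.

```python
import re

# Signatures copied from the log excerpts in this note.
# The dictionary keys are illustrative labels, not official names.
SIGNATURES = {
    # Resource Manager log: fatal event that takes the RM down.
    "rm_state_store_failed": re.compile(
        r"RMFatalEvent of type STATE_STORE_OP_FAILED"
    ),
    # Resource Manager log: ZooKeeper operation failure in the state store.
    "rm_zk_operation_exception": re.compile(
        r"ZKRMStateStore: Exception while executing a ZK operation"
    ),
    # ZooKeeper server log: session closed because of a "Len error".
    "zk_len_error": re.compile(
        r"NIOServerCnxn: .*java\.io\.IOException: Len error (\S+)"
    ),
}


def classify(line):
    """Return the labels of every signature that matches one log line."""
    return [name for name, pattern in SIGNATURES.items() if pattern.search(line)]


def scan(lines):
    """Count how many lines match each signature across an iterable of lines."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name in classify(line):
            counts[name] += 1
    return counts
```

One way to use it is to open a Resource Manager or ZooKeeper log file and pass the file object to scan, since a file object iterates line by line. Non-zero counts for all three labels would suggest the logs match the pattern described in this note.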

 

Cause

To view full details, sign in with your My Oracle Support account.


