
Oracle Coherence ECE Domain Cluster is Becoming Hung with OOME Reported in the Logs (Doc ID 2395358.1)

Last updated on MAY 11, 2018

Applies to:

Oracle Coherence - Version 12.2.1.0.0 to 12.2.1.3.0 [Release 12c]
Information in this document applies to any platform.

Symptoms

A customer reported that an Oracle ECE domain cluster hung: no process in the cluster responded. Around the same time, the server log recorded an exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler...

Example:

2018-04-10 14:54:59.122 AEST WARN - - - - Oracle Coherence 12.2.1.0.6 (thread=xxxFederatedCacheWorker:11, member=61): java.lang.OutOfMemoryError: Java heap space
2018-04-10 14:54:59.122 AEST ERROR - - - - Oracle Coherence 12.2.1.0.6 (thread=InvocationServiceWorker:63, member=61): Terminating InvocationService due to unhandled exception: java.lang.OutOfMemoryError
2018-04-10 14:54:59.123 AEST WARN - - - - Oracle Coherence 12.2.1.0.6 (thread=InvocationServiceWorker:63, member=61): java.lang.OutOfMemoryError: Java heap space
at java.util.concurrent.locks.ReentrantReadWriteLock$Sync$ThreadLocalHoldCounter.initialValue(ReentrantReadWriteLock.java:289)
at java.util.concurrent.locks.ReentrantReadWriteLock$Sync$ThreadLocalHoldCounter.initialValue(ReentrantReadWriteLock.java:286)
at java.lang.ThreadLocal.setInitialValue(ThreadLocal.java:180)
at java.lang.ThreadLocal.get(ThreadLocal.java:170)
at java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReadLock(ReentrantReadWriteLock.java:595)
at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.tryLock(ReentrantReadWriteLock.java:799)
at com.tangosol.util.ThreadGateLite.acquireLock(ThreadGateLite.java:246)
at com.tangosol.util.ThreadGateLite.enter(ThreadGateLite.java:84)
at com.tangosol.coherence.component.util.DaemonPool$Daemon.removeFromAnotherQueue(DaemonPool.CDB:18)
at com.tangosol.coherence.component.util.DaemonPool$Daemon

The log also contained messages such as the following:

2018-04-10 15:03:35.487 AEST WARN - - - - Oracle Coherence 12.2.1.0.6 (thread=Termination Thread, member=n/a): Terminating Connection {Peer=tmb://xxx:18094.63094, Service=TransportService,
Member=Member(Id=67, Timestamp=2018-03-30 20:04:52.018, Address=xxx:18094, MachineId=14231, Location=machine:xx25219,process:21195,member:xx28, Role=OracleCommunicationxxChargingServerChargingLauncher),
State=DISCONNECTED(java.io.IOException: fatal ack timeout after 10m)}

...

2018-04-10 15:03:35.495 AEST WARN - - - - Oracle Coherence 12.2.1.0.6 (thread=Termination Thread, member=n/a): Terminating Connection {Peer=tmb://xxx:18091.51100, Service=TransportService,
Member=Member(Id=50, Timestamp=2018-03-30 19:39:18.566, Address=xxx:18091, MachineId=14226, Location=machine:xx25214,process:22023,member:xx11, Role=OracleCommunicationxxChargingServerChargingLauncher),
State=DISCONNECTED(java.io.IOException: fatal ack timeout after 10m)}
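
To check whether another server log shows the same signatures, the three patterns visible in the excerpts above (the OutOfMemoryError text, the "Terminating ... due to unhandled exception" message, and the "fatal ack timeout" disconnect) can be searched for line by line. The following is a minimal sketch, not an Oracle-provided tool: the class name SymptomScanner and its method names are hypothetical, the regular expressions are derived only from the sample lines in this document, and the log file is assumed to be UTF-8 text. A match indicates only that the symptom text is present; it does not identify the cause.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: scans log lines for the symptom text shown in this document.
class SymptomScanner {
    // Patterns are assumptions based solely on the log excerpts above.
    private static final Pattern OOME =
            Pattern.compile("java\\.lang\\.OutOfMemoryError");
    private static final Pattern TERMINATED =
            Pattern.compile("Terminating (\\w+) due to unhandled exception");
    private static final Pattern ACK_TIMEOUT =
            Pattern.compile("fatal ack timeout after (\\w+)");

    // True if the line mentions java.lang.OutOfMemoryError.
    static boolean isOutOfMemory(String line) {
        return OOME.matcher(line).find();
    }

    // Returns the name of the service reported as terminated, or null if none.
    static String terminatedService(String line) {
        Matcher m = TERMINATED.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    // Returns the timeout value (for example "10m") from a fatal ack timeout, or null.
    static String ackTimeout(String line) {
        Matcher m = ACK_TIMEOUT.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SymptomScanner <log-file>");
            return;
        }
        // Assumes the log is UTF-8 text small enough to read into memory.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        int oomeCount = 0;
        int timeoutCount = 0;
        for (String line : lines) {
            if (isOutOfMemory(line)) {
                oomeCount++;
            }
            String service = terminatedService(line);
            if (service != null) {
                System.out.println("Service terminated: " + service);
            }
            if (ackTimeout(line) != null) {
                timeoutCount++;
            }
        }
        System.out.println("OutOfMemoryError lines: " + oomeCount);
        System.out.println("Fatal ack timeout lines: " + timeoutCount);
    }
}
```

Scanning one line at a time works for the excerpts shown here because the "fatal ack timeout" text sits on the State= line of each wrapped message, so wrapped entries do not need to be rejoined first.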



Cause




