Coherence Node Hangs Due to Task Backlog (Doc ID 2431483.1)

Last updated on AUGUST 03, 2018

Applies to:

Oracle Coherence - Version 3.7.1.12 and later
Information in this document applies to any platform.

Symptoms

The Coherence cluster has 12 storage nodes residing on 2 bare-metal servers (6 storage nodes per server).
WLS clients interact with the cache by joining the cluster with the localstorage=false option.

Problem Description:


1. One of the storage nodes starts accumulating a huge task backlog (~24k) for the DistributedCache service.
2. The whole Coherence cluster becomes nearly unresponsive.
3. CPU usage drops from ~80% to 20% or lower.
4. No events are logged around the time the issue appears.
5. To bring the cluster back to normal, it is sometimes enough to kill the bad storage node. At other times the whole cluster needs to be restarted.
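
Because nothing is logged when the backlog builds up, one way to spot the affected node is to poll the TaskBacklog attribute that Coherence exposes on its per-node Service MBean. The following is a minimal sketch and not part of the original note. It assumes that Coherence JMX management is enabled and that the service MBeans are registered in the default "Coherence" domain with type=Service and name=DistributedCache. The class name BacklogCheck and the threshold of 1000 are hypothetical choices for illustration.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import javax.management.JMException;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

// Hypothetical helper: lists the node IDs whose DistributedCache
// service reports a task backlog above a given threshold.
class BacklogCheck {

    static List<String> findBackloggedNodes(MBeanServerConnection conn, int threshold)
            throws IOException, JMException {
        // Assumed naming: default "Coherence" domain, one Service MBean per node.
        // The trailing ",*" tolerates extra key properties such as nodeId.
        ObjectName pattern =
                new ObjectName("Coherence:type=Service,name=DistributedCache,*");

        List<String> backlogged = new ArrayList<>();
        for (ObjectName name : conn.queryNames(pattern, null)) {
            Object value = conn.getAttribute(name, "TaskBacklog");
            if (value instanceof Number && ((Number) value).intValue() > threshold) {
                backlogged.add(name.getKeyProperty("nodeId"));
            }
        }
        return backlogged;
    }

    public static void main(String[] args) throws Exception {
        // Only meaningful when run inside a JVM that hosts the Coherence
        // management MBeans; elsewhere the query matches nothing.
        MBeanServerConnection conn = ManagementFactory.getPlatformMBeanServer();
        System.out.println("Nodes over threshold: " + findBackloggedNodes(conn, 1000));
    }
}
```

To check a remote management node, the same method could be passed an MBeanServerConnection obtained through javax.management.remote.JMXConnectorFactory instead of the platform MBean server. Running the check on a schedule and alerting when any node crosses the threshold would flag the problem node before the whole cluster stalls.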

Cause

To view full details, sign in with your My Oracle Support account.