Coherence Node Hangs Due to Task Backlog (Doc ID 2431483.1)

Last updated on OCTOBER 02, 2024

Applies to:

Oracle Coherence - Version 3.7.1.12 and later
Information in this document applies to any platform.

Symptoms

The Coherence cluster has 12 storage nodes running on 2 bare-metal servers (6 storage nodes per server).
WLS clients interact with the cache by joining the cluster with the localstorage=false option.

Problem Description:

1. One of the storage nodes starts to accumulate a huge task backlog (~24k) for the DistributedCache service.
2. The whole Coherence cluster becomes nearly unresponsive.
3. CPU usage drops from ~80% to 20% or lower.
4. No events are logged around the time the issue appears.
5. To bring the cluster back to normal, it is sometimes enough to kill the affected storage node. At other times the whole cluster needs to be restarted.
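
Because nothing is logged when the backlog builds up, one way to notice it is to poll the task backlog of each node over JMX. The following is a minimal sketch, not a supported tool. It assumes that Coherence management (JMX) is enabled and that the Service MBeans (Coherence:type=Service,name=...,nodeId=...) with their TaskBacklog attribute are visible in the MBean server the code connects to. The class name TaskBacklogCheck, the helper names, and the threshold of 1000 are hypothetical and chosen only for illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.TreeMap;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

// Hypothetical helper: reports nodes whose service task backlog exceeds a threshold.
class TaskBacklogCheck {

    // Pure filter: keeps only the nodes whose backlog is strictly above the threshold.
    static Map<String, Long> overThreshold(Map<String, Long> backlogByNode, long threshold) {
        Map<String, Long> result = new TreeMap<>();
        for (Map.Entry<String, Long> e : backlogByNode.entrySet()) {
            if (e.getValue() > threshold) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    // Reads the TaskBacklog attribute of every Service MBean registered for the given service.
    // Assumes the Coherence MBeans are reachable through the supplied connection.
    static Map<String, Long> readBacklogs(MBeanServerConnection conn, String serviceName)
            throws Exception {
        ObjectName pattern =
                new ObjectName("Coherence:type=Service,name=" + serviceName + ",*");
        Map<String, Long> backlogByNode = new TreeMap<>();
        for (ObjectName name : conn.queryNames(pattern, null)) {
            Number backlog = (Number) conn.getAttribute(name, "TaskBacklog");
            backlogByNode.put(String.valueOf(name.getKeyProperty("nodeId")), backlog.longValue());
        }
        return backlogByNode;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: this JVM hosts the Coherence MBeans. For a remote node,
        // obtain an MBeanServerConnection through a JMXConnector instead.
        MBeanServerConnection conn = ManagementFactory.getPlatformMBeanServer();
        long threshold = 1000L; // illustrative value, tune for the workload
        Map<String, Long> suspects =
                overThreshold(readBacklogs(conn, "DistributedCache"), threshold);
        for (Map.Entry<String, Long> e : suspects.entrySet()) {
            System.out.println("node " + e.getKey() + " task backlog " + e.getValue());
        }
    }
}
```

If no Coherence MBeans are registered in the MBean server, the query matches nothing and the program prints nothing. Run on a schedule, a check like this could flag a node approaching the ~24k backlog described above before the cluster becomes unresponsive.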

Changes

Cause
