
Hardware failure on a physical node causes Coherence cluster health to become unstable (Doc ID 2433996.1)

Last updated on AUGUST 16, 2018

Applies to:

Oracle Coherence - Version 12.2.1.0.0 to 12.2.1.3.0 [Release 12c]
Information in this document applies to any platform.

Symptoms

The health of a Coherence cluster became unstable after a hardware failure on one of its physical nodes. The logs show "soft timeout detected" messages, but the cluster did not recover until the entire cluster was restarted. As a workaround, the ip-timeout and packet-timeout settings were adjusted in the override XML file. The user asks why the death detection parameters need to be configured at all: with the default settings the IP timeout totals 15 seconds, so why was the failed member not removed from the Coherence cluster once that default timeout elapsed?
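For orientation, the death detection settings mentioned above live in the operational override file (commonly named tangosol-coherence-override.xml). The fragment below is a minimal sketch, not the configuration used in this case. It assumes that "ip-timeout" refers to the tcp-ring-listener settings and that "packet-timeout" refers to the packet-delivery timeout. The values shown are what are understood to be the 12.2.1 defaults, included only to show where the 15-second total comes from (ip-timeout multiplied by ip-attempts). The values actually applied in the workaround are not stated in this document.

```xml
<?xml version="1.0"?>
<!-- Sketch of an operational override file; values are assumed defaults, not recommendations. -->
<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
           xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-operational-config coherence-operational-config.xsd">
  <cluster-config>
    <tcp-ring-listener>
      <!-- Timeout for a single IP-level check of a remote machine. -->
      <ip-timeout>5s</ip-timeout>
      <!-- Number of checks before the machine is considered dead: 3 x 5s = 15s total. -->
      <ip-attempts>3</ip-attempts>
    </tcp-ring-listener>
    <packet-publisher>
      <packet-delivery>
        <!-- Time allowed for a packet to be acknowledged before the recipient is considered failed
             (assumed default of 300000 ms; verify against the documentation for your release). -->
        <timeout-milliseconds>300000</timeout-milliseconds>
      </packet-delivery>
    </packet-publisher>
  </cluster-config>
</coherence>
```

Lowering these values makes failure detection faster but raises the risk of removing a healthy member during a long garbage collection pause or a brief network interruption, so any change should be tested under realistic load.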

 

Changes

 

Cause

