Solaris Cluster 3.x/4.x: Resource Fails to Fail Over or Start Due to Allowable Number of Tries. Pingpong Avoidance Algorithms Explained
(Doc ID 1003759.1)
Last updated on OCTOBER 10, 2022
Applies to:
Solaris Cluster Geographic Edition - Version 3.0 to 4.3 [Release 3.0 to 4.3]
Solaris Cluster - Version 3.0 to 4.3 [Release 3.0 to 4.3]
Oracle Solaris on SPARC (64-bit)
Oracle Solaris on x86-64 (64-bit)
Goal
If a Solaris Cluster resource keeps failing on the nodes of a cluster and is allowed to restart each time, its resource group switches back and forth between nodes. The resource group is then said to be pingpong-ing.
To avoid this, the cluster framework implements pingpong avoidance algorithms that stop a resource group from being switched back to a node where it has already failed.
Solaris Cluster has two pingpong avoidance algorithms: rebalance and pingpong_check. The rebalance algorithm is triggered when the resource group fails to start, whereas pingpong_check is triggered when a fault monitor for one of the resources in the resource group detects a failure after the resource group has started successfully.
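The general idea behind pingpong avoidance can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the cluster framework's actual logic: the class name, the `pingpong_interval` field, the 3600-second default used here, and the rule "skip any node where the group failed within the window" are all hypothetical simplifications chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyResourceGroup:
    """Toy model of a failover resource group (NOT the real cluster framework)."""

    nodes: List[str]                    # candidate nodes, in preference order
    pingpong_interval: float = 3600.0   # assumed avoidance window, in seconds
    last_failure: Dict[str, float] = field(default_factory=dict)

    def record_failure(self, node: str, now: float) -> None:
        """Remember when the group last failed on the given node."""
        self.last_failure[node] = now

    def eligible_nodes(self, now: float) -> List[str]:
        """Nodes with no recorded failure inside the avoidance window."""
        return [
            n for n in self.nodes
            if n not in self.last_failure
            or now - self.last_failure[n] >= self.pingpong_interval
        ]

    def pick_target(self, current: Optional[str], now: float) -> Optional[str]:
        """Pick the first eligible node other than the current one, if any."""
        for n in self.eligible_nodes(now):
            if n != current:
                return n
        return None  # no safe target: stay put rather than pingpong


if __name__ == "__main__":
    rg = ToyResourceGroup(nodes=["node1", "node2"])
    rg.record_failure("node1", now=0)
    print(rg.pick_target(current="node1", now=10))    # node2
    rg.record_failure("node2", now=20)
    print(rg.pick_target(current="node2", now=30))    # None
    print(rg.pick_target(current="node2", now=3700))  # node1
```

In this model, once the group has failed on both nodes inside the window, no target is returned and the group does not move. After the window has elapsed for `node1`, that node becomes a candidate again.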
This document describes the algorithms and their associated resources, and gives a procedure to demonstrate the algorithms in action.
To troubleshoot such a situation, refer to <Document 1020354.1> Solaris Cluster Why Resource Group or Resource is pingpong-ing (failover repeatedly) Between Node/System/Server
Solution