Solaris Cluster With More Than 2 Nodes: Inappropriate Resource Group Switch on Node Shutdown/Evacuate Due to Strong RG_affinities (Doc ID 1370760.1)

Last updated on OCTOBER 27, 2016

Applies to:

Solaris Cluster - Version 3.2 12/06 to OSC 4.3 [Release 3.2 to 4.3]
Oracle Solaris on SPARC (64-bit)
Oracle Solaris on SPARC (32-bit)
Oracle Solaris on x86-64 (64-bit)
Oracle Solaris on x86 (32-bit)

Goal

The goal is to understand how the RGM (Resource Group Manager) node scoring factors handle strong RG affinities during an evacuate or node shutdown in a Solaris Cluster with more than two nodes.

Consider the following configuration:
root@node1 # clrg show -v | egrep "Resource Group|RG_dependencies|RG_affinities|Nodelist"|grep -v NULL

=== Resource Groups and Resources ===         
Resource Group:                                 trp-db-rg
  Nodelist:                                        node1 node2 node3
Resource Group:                                 trp-ci-rg
  RG_affinities:                                   ++trp-db-rg
  Nodelist:                                        node1 node2 node3
  RG_dependencies:                                 trp-db-rg
Resource Group:                                 clone-db-rg
  RG_affinities:                                   --trp-db-rg --trp-ci-rg --sap-di-fo-rg
  Nodelist:                                        node2:trpclone
Resource Group:                                 clone-ci-rg
  RG_affinities:                                   ++clone-db-rg
  Nodelist:                                        node2:trpclone
Resource Group:                                 sap-di-fo-rg
  RG_affinities:                                   --trp-ci-rg --trp-db-rg
  Nodelist:                                        node3 node2

root@node1 # clrg status

=== Cluster Resource Groups ===

Group Name      Node Name           Suspended   Status
----------      ---------           ---------   ------
trp-db-rg       node1               No          Online
                node2               No          Offline
                node3               No          Offline

trp-ci-rg       node1               No          Online
                node2               No          Offline
                node3               No          Offline

clone-db-rg     node2:trpclone      No          Online

clone-ci-rg     node2:trpclone      No          Online

sap-di-fo-rg    node3               No          Online
                node2               No          Offline
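
For reference, affinities and nodelists like the ones shown above would typically be configured with clresourcegroup (clrg) create or set commands along the following lines. This is only a sketch reconstructed from the properties in the clrg show output; the exact commands used on this cluster are not part of the article, and each group's resources and remaining properties are omitted:

root@node1 # clrg create -n node1,node2,node3 trp-db-rg
root@node1 # clrg create -n node1,node2,node3 -p RG_affinities=++trp-db-rg \
                 -p RG_dependencies=trp-db-rg trp-ci-rg
root@node1 # clrg create -n node2:trpclone \
                 -p RG_affinities=--trp-db-rg,--trp-ci-rg,--sap-di-fo-rg clone-db-rg
root@node1 # clrg create -n node2:trpclone -p RG_affinities=++clone-db-rg clone-ci-rg
root@node1 # clrg create -n node3,node2 -p RG_affinities=--trp-ci-rg,--trp-db-rg sap-di-fo-rg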

With this configuration, we would expect the following when node1 is evacuated or shut down:

1. trp-db-rg/trp-ci-rg are switched to node2 (the next node in the nodelist),
2. clone-db-rg/clone-ci-rg are taken offline (RG_affinities: --trp-db-rg --trp-ci-rg --sap-di-fo-rg, and node2:trpclone is the only entry in their Nodelist), and
3. sap-di-fo-rg is not switched and stays on node3.

What actually happens is:

1. trp-db-rg/trp-ci-rg are switched from node1 to node3,
2. sap-di-fo-rg is switched from node3 to node2, and
3. clone-db-rg/clone-ci-rg are taken offline.


root@node1 # clrg evacuate -n node1
root@node1 # clrg status

=== Cluster Resource Groups ===

Group Name      Node Name           Suspended   Status
----------      ---------           ---------   ------
trp-db-rg       node1               No          Offline
                node2               No          Offline
                node3               No          Online
 
trp-ci-rg       node1               No          Offline
                node2               No          Offline
                node3               No          Online

clone-db-rg     node2:trpclone      No          Offline

clone-ci-rg     node2:trpclone      No          Offline


sap-di-fo-rg    node3               No          Offline
                node2               No          Online


Solution

Explanation:
The reason for this behavior is that, when RG affinities are specified, they take precedence over the nodelist order. During an evacuate or node shutdown, the RGM computes a set of node scoring factors to decide where to bring trp-db-rg/trp-ci-rg online.

In this case the RGM chooses node3 over node2: on node2 it would have to force two other resource groups offline (clone-db-rg and clone-ci-rg), whereas on node3 only one resource group is affected (sap-di-fo-rg).
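
The trade-off the RGM evaluates can be tallied directly from the RG_affinities settings shown earlier (viewable with "clrg show -p RG_affinities"). The walkthrough below only restates that count; it is an illustration of the scoring, not the output of any cluster command:

  Candidate node2 for trp-db-rg/trp-ci-rg:
    clone-db-rg   must go offline (--trp-db-rg --trp-ci-rg; its only node is node2:trpclone)
    clone-ci-rg   must go offline (++clone-db-rg; its only node is node2:trpclone)
    => two other resource groups forced offline

  Candidate node3 for trp-db-rg/trp-ci-rg:
    sap-di-fo-rg  must go offline (--trp-ci-rg --trp-db-rg)
    => one other resource group forced offline

  node3 is the cheaper choice, so the RGM brings trp-db-rg/trp-ci-rg online there.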
