Solaris Cluster More Than 2-Nodes Inappropriate Resource Group Switch in Case of Server/Node shutdown/evacuate due to Strong RG_affinities (Doc ID 1370760.1)

Last updated on MAY 02, 2024

Applies to:

Solaris Cluster - Version 3.2 to 4.3 [Release 3.2 to 4.3]
Oracle Solaris on SPARC (64-bit)
Oracle Solaris on SPARC (32-bit)
Oracle Solaris on x86-64 (64-bit)
Oracle Solaris on x86 (32-bit)

Goal

The goal is to understand how the RGM (Resource Group Manager) applies node scoring factors for strong affinities when a node is evacuated or shut down in a Solaris Cluster with more than two nodes.

Consider the following configuration:
root@node1 # clrg show -v | egrep "Resource Group|RG_dependencies|RG_affinities|Nodelist"|grep -v NULL

=== Resource Groups and Resources ===         
Resource Group:                                 mydb-rg
  Nodelist:                                        node1 node2 node3
Resource Group:                                 my2-ci-rg
  RG_affinities:                                   ++mydb-rg
  Nodelist:                                        node1 node2 node3
  RG_dependencies:                                 mydb-rg
Resource Group:                                 my-db-rg
  RG_affinities:                                   --mydb-rg --my2-ci-rg --my02-rg
  Nodelist:                                        node2:my2my
Resource Group:                                 my-ci-rg
  RG_affinities:                                   ++my-db-rg
  Nodelist:                                        node2:my2my
Resource Group:                                 my02-rg
  RG_affinities:                                   --my2-ci-rg --mydb-rg
  Nodelist:                                        node3 node2

root@node1 # clrg status

=== Cluster Resource Groups ===

Group Name      Node Name           Suspended   Status
----------      ---------           ---------   ------
mydb-rg         node1               No          Online
                node2               No          Offline
                node3               No          Offline

my2-ci-rg       node1               No          Online
                node2               No          Offline
                node3               No          Offline

my-db-rg        node2:my2my         No          Online

my-ci-rg        node2:my2my         No          Online

my02-rg         node3               No          Online
                node2               No          Offline

Given this configuration, we would expect the following when node1 is evacuated or shut down:

1. mydb-rg/my2-ci-rg switch to node2 (the next node in the Nodelist).
2. my-db-rg/my-ci-rg are taken offline (RG_affinities: --mydb-rg --my2-ci-rg --my02-rg, and node2:my2my is the only entry in the Nodelist).
3. my02-rg is not switched and stays on node3.

What actually happens:

1. mydb-rg/my2-ci-rg are switched from node1 to node3.
2. my02-rg is switched from node3 to node2.
3. my-db-rg/my-ci-rg are taken offline.


root@node1 # clrg evacuate -n node1
root@node1 # clrg status

=== Cluster Resource Groups ===

Group Name      Node Name           Suspended   Status
----------      ---------           ---------   ------
mydb-rg         node1               No          Offline
                node2               No          Offline
                node3               No          Online

my2-ci-rg       node1               No          Offline
                node2               No          Offline
                node3               No          Online

my-db-rg        node2:my2my         No          Offline

my-ci-rg        node2:my2my         No          Offline

my02-rg         node3               No          Offline
                node2               No          Online


Explanation:
When RG affinities are specified, they take precedence over the Nodelist order. The RGM computes a set of node scoring factors to decide where to bring mydb-rg/my2-ci-rg online.

In this case, the RGM chooses node3 over node2 because bringing the groups online on node2 would force two other RGs offline (my-db-rg and my-ci-rg), whereas on node3 only one RG is affected (my02-rg).
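
The comparison described above can be illustrated with a small Python sketch. It is a simplified model written for this note, not the RGM's actual scoring implementation: it assumes the only factor is the number of RGs that would be forced offline on each candidate node, and that ties fall back to Nodelist order. The function name displaced and the dictionaries are hypothetical. The affinities and online states are taken from the clrg output shown earlier.

```python
# Simplified model of the node comparison. This is an assumption made for
# illustration, NOT the RGM's real algorithm or data structures.

# Strong negative affinities (--): RG -> RGs it must not share a node with.
STRONG_NEG = {
    "my-db-rg": {"mydb-rg", "my2-ci-rg", "my02-rg"},
    "my02-rg": {"my2-ci-rg", "mydb-rg"},
}

# Strong positive affinities (++): RG -> RGs it must run together with.
STRONG_POS = {
    "my2-ci-rg": {"mydb-rg"},
    "my-ci-rg": {"my-db-rg"},
}

# RGs online per physical node before node1 is evacuated.
ONLINE = {
    "node2": {"my-db-rg", "my-ci-rg"},
    "node3": {"my02-rg"},
}


def displaced(node, incoming):
    """Return the RGs on `node` that would go offline if `incoming` came online there."""
    local = ONLINE.get(node, set())
    # RGs that declare a strong negative affinity for any incoming RG.
    out = {rg for rg in local if STRONG_NEG.get(rg, set()) & incoming}
    # RGs that strongly depend (++) on a displaced RG go offline with it.
    changed = True
    while changed:
        changed = False
        for rg in local - out:
            if STRONG_POS.get(rg, set()) & out:
                out.add(rg)
                changed = True
    return out


incoming = {"mydb-rg", "my2-ci-rg"}
candidates = ["node2", "node3"]  # remaining Nodelist order once node1 is gone

counts = {node: len(displaced(node, incoming)) for node in candidates}
# min() keeps the first minimum, so a tie would fall back to Nodelist order.
best = min(candidates, key=lambda node: counts[node])

print(counts)  # {'node2': 2, 'node3': 1}
print(best)    # node3
```

Under these assumptions node2 scores worse (two RGs displaced) than node3 (one RG displaced), which matches the switch to node3 seen in the clrg status output.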

Solution
