Solaris Cluster With More Than 2 Nodes: Unexpected Resource Group Switch on Node Shutdown/Evacuate Due to Strong RG_affinities
(Doc ID 1370760.1)
Last updated on MAY 02, 2024
Applies to:
Solaris Cluster - Version 3.2 to 4.3 [Release 3.2 to 4.3]
Oracle Solaris on SPARC (64-bit)
Oracle Solaris on SPARC (32-bit)
Oracle Solaris on x86-64 (64-bit)
Oracle Solaris on x86 (32-bit)
Goal
The goal is to understand how the RGM (Resource Group Manager) node scoring factors treat strong affinities when a node is evacuated or shut down in a Solaris Cluster with more than two nodes.
The example uses the following configuration:
root@node1 # clrg show -v | egrep "Resource Group|RG_dependencies|RG_affinities|Nodelist"|grep -v NULL
=== Resource Groups and Resources ===
Resource Group: mydb-rg
Nodelist: node1 node2 node3
Resource Group: my2-ci-rg
RG_affinities: ++mydb-rg
Nodelist: node1 node2 node3
RG_dependencies: mydb-rg
Resource Group: my-db-rg
RG_affinities: --mydb-rg --my2-ci-rg --my02-rg
Nodelist: node2:my2my
Resource Group: my-ci-rg
RG_affinities: ++my-db-rg
Nodelist: node2:my2my
Resource Group: my02-rg
RG_affinities: --my2-ci-rg --mydb-rg
Nodelist: node3 node2
root@node1 # clrg status
=== Cluster Resource Groups ===
Group Name Node Name Suspended Status
---------- --------- --------- ------
mydb-rg node1 No Online
node2 No Offline
node3 No Offline
my2-ci-rg node1 No Online
node2 No Offline
node3 No Offline
my-db-rg node2:my2my No Online
my-ci-rg node2:my2my No Online
my02-rg node3 No Online
node2 No Offline
With this configuration, the expected result of evacuating or shutting down node1 is:
1. mydb-rg/my2-ci-rg are switched to node2 (the next node in the Nodelist).
2. my-db-rg/my-ci-rg are taken offline (RG_affinities: --mydb-rg --my2-ci-rg --my02-rg, and Nodelist: node2:my2my only).
3. my02-rg is not switched and stays on node3.
What actually happens is:
1. mydb-rg/my2-ci-rg are switched from node1 to node3.
2. my02-rg is switched from node3 to node2.
3. my-db-rg/my-ci-rg are taken offline.
root@node1 # clrg evacuate -n node1
root@node1 # clrg status
=== Cluster Resource Groups ===
Group Name Node Name Suspended Status
---------- --------- --------- ------
mydb-rg node1 No Offline
node2 No Offline
node3 No Online
my2-ci-rg node1 No Offline
node2 No Offline
node3 No Online
my-db-rg node2:my2my No Offline
my-ci-rg node2:my2my No Offline
my02-rg node3 No Offline
node2 No Online
Explanation:
When RG affinities are specified, they take precedence over the Nodelist order. The RGM computes a set of node scoring factors to decide where to bring mydb-rg/my2-ci-rg online.
In this case, the RGM chooses node3 over node2 because bringing the groups online on node2 would force two other RGs offline (my-db-rg and my-ci-rg), whereas on node3 only one RG is affected (my02-rg). The displaced my02-rg then moves to node2, the other node in its Nodelist. Because my-db-rg declares --my02-rg, my-db-rg is taken offline, and my-ci-rg (++my-db-rg) goes offline with it.
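The comparison between node2 and node3 can be illustrated with a small Python model. This is only a sketch of the reasoning given here, not the actual RGM algorithm: the real scoring factors are not listed in this note, and the names `NEG`, `POS`, `ONLINE`, `displaced` and `pick_node` are hypothetical. It also assumes that the zone node node2:my2my counts as physical node2, and that the only factor is the number of groups that would be displaced.

```python
# Simplified illustration of the behaviour described in this note.
# ASSUMPTIONS: not the real RGM scoring; zone node node2:my2my is
# treated as physical node2; only "groups displaced" is counted.

# Strong negative affinities (--): group -> groups it refuses to share a node with.
NEG = {
    "my-db-rg": {"mydb-rg", "my2-ci-rg", "my02-rg"},
    "my02-rg": {"my2-ci-rg", "mydb-rg"},
}

# Strong positive affinities (++): group -> group it must follow.
POS = {
    "my2-ci-rg": "mydb-rg",
    "my-ci-rg": "my-db-rg",
}

# Physical node -> groups online there before the evacuation.
ONLINE = {
    "node1": {"mydb-rg", "my2-ci-rg"},
    "node2": {"my-db-rg", "my-ci-rg"},
    "node3": {"my02-rg"},
}


def displaced(incoming, node):
    """Return the groups on `node` that would have to leave if `incoming` arrived."""
    arriving = set(incoming)
    # A group with a strong negative affinity for an arriving group must leave.
    out = {rg for rg in ONLINE[node] if NEG.get(rg, set()) & arriving}
    # A group with a strong positive affinity for a displaced group follows it.
    changed = True
    while changed:
        changed = False
        for rg in ONLINE[node] - out:
            if POS.get(rg) in out:
                out.add(rg)
                changed = True
    return out


def pick_node(incoming, candidates):
    """Pick the node that displaces the fewest groups; ties keep Nodelist order."""
    return min(
        candidates,
        key=lambda n: (len(displaced(incoming, n)), candidates.index(n)),
    )


moving = ["mydb-rg", "my2-ci-rg"]
candidates = ["node2", "node3"]  # Nodelist order once node1 is evacuated

for node in candidates:
    print(node, "would displace", sorted(displaced(moving, node)))
print("chosen:", pick_node(moving, candidates))
```

With the configuration from this note, the model reports two displaced groups on node2 (my-db-rg and my-ci-rg) and one on node3 (my02-rg), so it selects node3, matching the observed `clrg evacuate -n node1` result. Nodelist order only decides the outcome when the displacement counts are equal.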
Solution