Solaris Cluster Simultaneous Shutdown of Multiple Nodes can Panic All Nodes when Quorum Server is Used (Doc ID 2242555.1)

Last updated on APRIL 06, 2018

Applies to:

Solaris Cluster - Version 3.2 to 4.3 [Release 3.2 to 4.3]
Oracle Solaris on SPARC (64-bit)
Oracle Solaris on x86-64 (64-bit)

Symptoms

Three nodes of a five-node Solaris Cluster were shut down in preparation for a planned power outage affecting those nodes.
An "init 0" was issued simultaneously on node1, node2, and node3.

node4 and node5 were expected to stay up.

Result of this shutdown action:
- only node3 shut down cleanly, as expected
- all other nodes panicked with: "Cluster lost operational quorum; aborting"

This cluster uses a quorum server as its quorum device.
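The expectation that node4 and node5 would survive can be checked with simple vote arithmetic. The sketch below assumes the default Solaris Cluster vote assignment (one vote per node, and N-1 votes for a quorum device connected to N nodes). The actual vote counts of this cluster are not given in this note, and the function names are illustrative only. It shows the expected outcome, not the reason for the panic.

```python
def votes_needed(total_votes: int) -> int:
    """Majority of all configured votes: floor(total / 2) + 1."""
    return total_votes // 2 + 1


def has_quorum(node_votes_up: int, device_votes_held: int, total_votes: int) -> bool:
    """True if the surviving partition holds a majority of the configured votes."""
    return node_votes_up + device_votes_held >= votes_needed(total_votes)


NODES = 5
# Assumption: default vote counts (1 per node, N-1 for the quorum server).
QS_VOTES = NODES - 1
TOTAL = NODES + QS_VOTES

# node4 + node5 up and holding the quorum server votes
print(votes_needed(TOTAL))             # 5
print(has_quorum(2, QS_VOTES, TOTAL))  # True
# node4 + node5 up but without the quorum server votes
print(has_quorum(2, 0, TOTAL))         # False
```

Under these assumptions the two remaining nodes keep operational quorum only if they hold the quorum server's votes, so the quorum server is decisive for the outcome of this shutdown.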

Console node1:

...
svc.startd: The system is coming down. Please wait.
svc.startd: 231 system services are now being stopped.
NOTICE: clcomm: Path node1:hb2 - node3:hb2 being drained
NOTICE: clcomm: Path node1:hb1 - node3:hb1 being drained
TCP_IOC_ABORT_CONN: local = 000.000.000.000:0, remote = 172.016.050.131:0, start = -2, end = 6
TCP_IOC_ABORT_CONN: aborted 0 connection
WARNotifying cluster that this node is panicking
panic[cpu90]/thread=c4028922c640: CMM: Cluster lost operational quorum; aborting.

Console node2:

...
Sep 24 01:44:35 node2 cl_runtime: [ID - kern.notice] [ID 489438 kern.notice] NOTICE: clcomm: Path node2:hb1 - node3:hb1 being drained
Sep 24 01:44:35 node2 cl_runtime: [ID - kern.notice] [ID 489438 kern.notice] NOTICE: clcomm: Path node2:hb2 - node3:hb2 being drained
Sep 24 01:44:35 node2 ip: [ID - kern.notice] [ID 678092 kern.notice] TCP_IOC_ABORT_CONN: local = 000.000.000.000:0, remote = 172.016.050.131:0, start = -2, end = 6
Sep 24 01:44:35 node2 ip: [ID - kern.notice] [ID 302654 kern.notice] TCP_IOC_ABORT_CONN: aborted 0 connection
Notifying cluster that this node is panicking
panic[cpu236]/thread=c402897b0ac0: CMM: Cluster lost operational quorum; aborting.

Console node3:

...
svc.startd: The system is coming down. Please wait.
svc.startd: 231 system services are now being stopped.
syncing file systems... done
WARNING: CMM: Node being shut down.
Program terminated

Console node4:

...
Sep 24 01:44:30 node4 cl_runtime: [ID - kern.notice] [ID 273354 kern.notice] NOTICE: CMM: Node node3 (nodeid = 3) is dead.
Sep 24 01:44:35 node4 cl_runtime: [ID - kern.notice] [ID 489438 kern.notice] NOTICE: clcomm: Path node4:hb1 - node3:hb1 being drained
Sep 24 01:44:35 node4 cl_runtime: [ID - kern.notice] [ID 489438 kern.notice] NOTICE: clcomm: Path node4:hb2 - node3:hb2 being drained
Sep 24 01:44:35 node4 ip: [ID - kern.notice] [ID 678092 kern.notice] TCP_IOC_ABORT_CONN: local = 000.000.000.000:0, remote = 172.016.050.131:0, start = -2, end = 6
Sep 24 01:44:35 node4 ip: [ID - kern.notice] [ID 302654 kern.notice] TCP_IOC_ABORT_CONN: aborted 0 connection
Notifying cluster that this node is panicking
panic[cpu60]/thread=c4031814fdc0: CMM: Cluster lost operational quorum; aborting.

Console node5:

...
Sep 24 01:44:30 node5 cl_runtime: [ID - kern.notice] [ID 273354 kern.notice] NOTICE: CMM: Node node3 (nodeid = 3) is dead.
Sep 24 01:44:34 node5 cl_runtime: [ID - kern.notice] [ID 489438 kern.notice] NOTICE: clcomm: Path node5:hb2 - node3:hb2 being drained
Sep 24 01:44:34 node5 cl_runtime: [ID - kern.notice] [ID 489438 kern.notice] NOTICE: clcomm: Path node5:hb1 - node3:hb1 being drained
Sep 24 01:44:34 node5 ip: [ID - kern.notice] [ID 678092 kern.notice] TCP_IOC_ABORT_CONN: local = 000.000.000.000:0, remote = 172.016.050.131:0, start = -2, end = 6
Sep 24 01:44:34 node5 ip: [ID - kern.notice] [ID 302654 kern.notice] TCP_IOC_ABORT_CONN: aborted 0 connection
Notifying cluster that this node is panicking

Cause

To view full details, sign in with your My Oracle Support account.
