Solaris Cluster 3.X: Cluster node is unable to boot with Failfast: timeout - unit "failfast_now" due to rgmd error (Doc ID 1467074.1)

Last updated on JANUARY 22, 2016

Applies to:

Solaris Cluster - Version 3.2 12/06 and later
Information in this document applies to any platform.

Symptoms

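A cluster node fails to boot. The node joins the cluster, but rgmd then dumps core and the node panics with a failfast timeout, as shown in the following console messages: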

Rebooting with command: boot
Boot device: disk  File and args:
SunOS Release 5.10 Version Generic_147440-15 64-bit
Copyright (c) 1983, 2012, Oracle and/or its affiliates. All rights reserved.
NOTICE: nxge0: xcvr addr:0x0d - link is up 1000 Mbps full duplex
NOTICE: nxge1: xcvr addr:0x0c - link is up 1000 Mbps full duplex
Failed to configure IPv4 interface(s): nxge1
Hostname: node02
NOTICE: nxge4: xcvr addr:0x0d - link is up 1000 Mbps full duplex
NOTICE: nxge5: xcvr addr:0x0c - link is up 1000 Mbps full duplex
NOTICE: nxge3: xcvr addr:0x0a - link is up 1000 Mbps full duplex
NOTICE: nxge2: xcvr addr:0x0b - link is up 1000 Mbps full duplex
NOTICE: nxge6: xcvr addr:0x0b - link is up 1000 Mbps full duplex
NOTICE: nxge7: xcvr addr:0x0a - link is up 1000 Mbps full duplex
Booting in cluster mode
NOTICE: CMM: Node node01 (nodeid = 1) with votecount = 1 added.
NOTICE: CMM: Node node02 (nodeid = 2) with votecount = 1 added.
NOTICE: CMM: Quorum device 1 (/dev/did/rdsk/d2s2) added; votecount = 1, bitmask of nodes with configured paths = 0x3.

node02 console login: NOTICE: clcomm: Adapter nxge6 constructed
NOTICE: clcomm: Adapter nxge2 constructed
NOTICE: CMM: Node node02: attempting to join cluster.
NOTICE: nxge6: xcvr addr:0x0b - link is up 1000 Mbps full duplex
Jun  7 07:25:40 node02 nxge: NOTICE: nxge2: xcvr addr:0x0b - link is up 1000 Mbps full duplex
Jun  7 07:25:41 node02 cl_runtime: NOTICE: clcomm: Path node02:nxge2 - node01:nxge2 online
Jun  7 07:25:41 node02 cl_runtime: NOTICE: CMM: Node node01 (nodeid: 1, incarnation #: 1339048802) has become reachable.
Jun  7 07:25:41 node02 cl_runtime: NOTICE: CMM: Cluster has reached quorum.
Jun  7 07:25:41 node02 cl_runtime: NOTICE: CMM: Node node01 (nodeid = 1) is up; new incarnation number = 1339048802.
Jun  7 07:25:41 node02 cl_runtime: NOTICE: CMM: Node node02 (nodeid = 2) is up; new incarnation number = 1339050339.
Jun  7 07:25:41 node02 cl_runtime: NOTICE: CMM: Cluster members: node01 node02.
Jun  7 07:25:41 node02 cl_runtime: NOTICE: CMM: node reconfiguration #11 completed.
Jun  7 07:25:41 node02 cl_runtime: NOTICE: CMM: Node node02: joined cluster.
Jun  7 07:25:46 node02 cl_runtime: NOTICE: clcomm: Path node02:nxge6 - node01:nxge6 online
obtaining access to all attached disks
Jun  7 07:25:48 node02 savecore: Decompress the crash dump with
Jun  7 07:25:48 node02 'savecore -vf /var/crash/node02/vmdump.11'
Jun  7 07:25:49 node02 ntpdate[775]: no server suitable for synchronization found
Jun  7 07:25:50 node02 xntpd[885]: trusted key 0 unlikely
Jun  7 07:25:50 node02 xntpd[885]: 0 makes a poor request keyid
Jun  7 07:25:50 node02 xntpd[885]: 0 makes a poor control keyid
Jun  7 07:26:19 node02 genunix: NOTICE: core_log: rgmd[1680] core dumped: /var/crash/core-node02-1680-rgmd
Notifying cluster that this node is panicking
WARNING: Failfast: timeout - unit "failfast_now".
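
The two lines that characterize this failure are the rgmd core dump notice and the failfast timeout warning. As a rough sketch for checking saved console output for this signature, the Python snippet below scans text for both messages. The function name find_failfast_signature and the regular expressions are illustrative assumptions based only on the message formats in the log above; they are not part of Solaris Cluster or any Oracle tool.

```python
import re

# Patterns derived from the console messages shown above (assumed formats,
# not an official Solaris Cluster message specification).
CORE_DUMP_RE = re.compile(
    r'core_log: (?P<proc>\w+)\[(?P<pid>\d+)\] core dumped: (?P<path>\S+)'
)
FAILFAST_RE = re.compile(r'Failfast: timeout - unit "(?P<unit>[^"]+)"')


def find_failfast_signature(console_text):
    """Return (core_dumps, failfast_units) found in the console output.

    core_dumps is a list of (process name, pid, core file path) tuples;
    failfast_units is a list of unit names from failfast timeout warnings.
    """
    core_dumps = [
        (m.group("proc"), int(m.group("pid")), m.group("path"))
        for m in CORE_DUMP_RE.finditer(console_text)
    ]
    failfast_units = [m.group("unit") for m in FAILFAST_RE.finditer(console_text)]
    return core_dumps, failfast_units


# Sample input copied from the tail of the console log above.
sample = """\
Jun  7 07:26:19 node02 genunix: NOTICE: core_log: rgmd[1680] core dumped: /var/crash/core-node02-1680-rgmd
Notifying cluster that this node is panicking
WARNING: Failfast: timeout - unit "failfast_now"
"""

cores, units = find_failfast_signature(sample)
for proc, pid, path in cores:
    print(f"{proc} (pid {pid}) dumped core to {path}")
for unit in units:
    print(f"failfast timeout on unit {unit}")
```

An rgmd core dump followed by a failfast timeout on the same node matches the symptom described here. The sketch only detects the signature; it does not diagnose why rgmd failed.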

Cause
