
Grid Infrastructure 11gR2 ROOT.SH Fails on Second Node due to Firewall (Doc ID 1103313.1)

Last updated on June 16, 2021

Applies to:

Oracle Cloud Infrastructure - Database Service - Version N/A and later
Oracle Database Exadata Express Cloud Service - Version N/A and later
Oracle Database Backup Service - Version N/A and later
Oracle Database Cloud Exadata Service - Version N/A and later
Oracle Database Cloud Service - Version N/A and later
Information in this document applies to any platform.

Symptoms

During a fresh installation of 11gR2 Grid Infrastructure, runcluvfy.sh reports no errors and root.sh finishes fine on the first node, but it times out and fails on the other nodes:

$GRID_HOME/cfgtoollogs/crsconfig/rootcrs_$HOST.log on other nodes:

2010-04-23 13:45:11: CRS-2676: Start of 'ora.diskmon' on 'node2' succeeded
2010-04-23 13:45:11: CRS-2676: Start of 'ora.cssd' on 'node2' succeeded
2010-04-23 13:45:12: Start of resource "ora.ctssd -init -env USR_ORA_ENV=CTSS_REBOOT=TRUE" Succeeded
2010-04-23 13:50:14: Start of resource "ora.asm -init" Succeeded     >>>> Note the 5-minute gap
2010-04-23 13:50:16: Start of resource "ora.crsd -init" Succeeded
2010-04-23 13:50:17: Start of resource "ora.evmd -init" Succeeded
2010-04-23 13:50:17: Successfully started Oracle clusterware stack
2010-04-23 13:50:17: Waiting for Oracle CRSD and EVMD to start
2010-04-23 13:50:22: Waiting for Oracle CRSD and EVMD to start
..

$GRID_HOME/log/$HOST/ohasd/ohasd.log on other nodes:

2010-04-23 13:45:12.685: [   CRSPE][1347090752] CRS-2672: Attempting to start 'ora.asm' on 'node2'
..                              >>>> Note the 5-minute gap
2010-04-23 13:50:14.756: [   CRSPE][1347090752] CRS-2676: Start of 'ora.asm' on 'node2' succeeded
..
2010-04-23 13:50:14.914: [   CRSPE][1347090752] CRS-2672: Attempting to start 'ora.crsd' on 'node2'

$GRID_HOME/log/$HOST/crsd/crsd.log on other nodes:

2010-04-23 13:50:16.008: [  OCRASM][4169789936]proprasmo: Error in open/create file in dg [DATA]
[  OCRASM][4169789936]SLOS : SLOS: cat=7, opn=kgfoAl06, dep=15077, loc=kgfokge
ORA-15077: could not locate ASM instance serving a required diskgroup

2010-04-23 13:50:16.010: [  OCRASM][4169789936]proprasmo: kgfoCheckMount returned [7]
2010-04-23 13:50:16.010: [  OCRASM][4169789936]proprasmo: The ASM instance is down
2010-04-23 13:50:16.010: [  OCRRAW][4169789936]proprioo: Failed to open [+DATA]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
2010-04-23 13:50:16.010: [  OCRRAW][4169789936]proprioo: No OCR/OLR devices are usable
2010-04-23 13:50:16.010: [  OCRASM][4169789936]proprasmcl: asmhandle is NULL
2010-04-23 13:50:16.010: [  OCRRAW][4169789936]proprinit: Could not open raw device
2010-04-23 13:50:16.010: [  OCRASM][4169789936]proprasmcl: asmhandle is NULL
2010-04-23 13:50:16.010: [  OCRAPI][4169789936]a_init:16!: Backend init unsuccessful : [26]
2010-04-23 13:50:16.010: [  CRSOCR][4169789936] OCR context init failure.  Error: PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=7, opn=kgfoAl06, dep=15077, loc=kgfokge
ORA-15077: could not locate ASM instance serving a required diskgroup
] [7]
2010-04-23 13:50:16.010: [    CRSD][4169789936][PANIC] CRSD exiting: Could not init OCR, code: 26
2010-04-23 13:50:16.011: [    CRSD][4169789936] Done.

alert_+ASM1.log on first node:

Fri Apr 23 13:45:19 2010
Reconfiguration started (old inc 2, new inc 4)
List of instances:
 1 2 (myinst: 1)
 Global Resource Directory frozen
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
Fri Apr 23 13:50:40 2010
IPC Send timeout detected. Sender: ospid 20999 [oracle@node1 (PING)]
Receiver: inst 2 binc 462889467 ospid 13885
Fri Apr 23 13:50:57 2010
IPC Send timeout detected. Sender: ospid 21015 [oracle@node1 (LMD0)]
Receiver: inst 2 binc 462889757 ospid 13901
IPC Send timeout to 2.0 inc 4 for msg type 53 from opid 10
Fri Apr 23 13:50:59 2010
Communications reconfiguration: instance_number 2
Evicting instance 2 from cluster
Waiting for instances to leave:
2
Fri Apr 23 13:50:59 2010
Trace dumping is performing id=[cdmp_20100423135059]
Reconfiguration started (old inc 4, new inc 8)
List of instances:
 1 (myinst: 1)
 Nested reconfiguration detected.

alert_+ASMn.log on other nodes:

Fri Apr 23 13:45:18 2010
MMNL started with pid=21, OS id=13947
lmon registered with NM - instance number 2 (internal mem no 1)
Reconfiguration started (old inc 0, new inc 4)
ASM instance
List of instances:
 1 2 (myinst: 2)
 Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
 Communication channels reestablished
Fri Apr 23 13:50:53 2010
IPC Send timeout detected. Sender: ospid 13885 [oracle@node2 (PING)]
Receiver: inst 1 binc 462852653 ospid 20999
Fri Apr 23 13:50:59 2010
Received an instance abort message from instance 1
Please check instance 1 alert and LMON trace files for detail.
LMS0 (ospid: 13905): terminating the instance due to error 481
Fri Apr 23 13:50:59 2010
System state dump is made for local instance
Trace dumping is performing id=[cdmp_20100423135059]
Instance terminated by LMS0, pid = 13905
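
The IPC send timeouts above are what a packet-dropping firewall on the interconnect typically looks like from inside the instance: sends neither succeed nor are refused, they simply hang until the timeout fires. A quick way to distinguish a silently dropped (filtered) port from a merely closed one is a timed TCP probe. A minimal sketch — the host name and port below are placeholders, not values from this article:

```python
import socket

def probe(host, port, timeout=5):
    """Classify TCP reachability of host:port.

    A firewall that silently drops packets usually shows up as a
    timeout ("filtered"), while a reachable host with nothing
    listening answers immediately with a refusal ("closed").
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        return "filtered (possible firewall drop)"
    except ConnectionRefusedError:
        return "closed (host reachable, nothing listening)"
    except OSError as exc:
        return "error: %s" % exc

if __name__ == "__main__":
    # "node2-priv" and 1521 are placeholders; substitute the remote
    # node's private interconnect address and a port of interest.
    print(probe("node2-priv", 1521, timeout=3))
```

Run it from each node against the other node's private interconnect address; a "filtered" result where runcluvfy.sh reported no problem points at a host firewall rather than a cabling or routing fault.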

Cause

As the document title indicates, a firewall on the private interconnect blocks the traffic that ASM and the clusterware exchange between nodes; the IPC send timeouts and the eviction of instance 2 shown above are the visible result, and runcluvfy.sh does not detect this condition. To view full details, sign in with your My Oracle Support account.
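
The Solution text itself is gated, but the usual remedies for firewall interference on the interconnect (stated here as assumptions, not quotes from this article) are either to disable the firewall on every node — e.g. `service iptables stop` followed by `chkconfig iptables off` on RHEL/OL 5 and 6 — or to permit all traffic on the private interconnect interface. A hypothetical iptables-save style fragment for the latter; the interface name and subnet are placeholders:

```
# /etc/sysconfig/iptables (fragment) -- hypothetical example
# Accept everything arriving on the private interconnect interface
# eth1 from the interconnect subnet; adjust both to your environment.
-A INPUT -i eth1 -s 192.168.1.0/24 -j ACCEPT
```

If the firewall must stay enabled, such a rule should appear before any DROP or REJECT rules so interconnect packets are never filtered.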

