CRS will not start if the disk permission / ownership is set in rc.local instead of udev rules (Doc ID 1901622.1)

Last updated on OCTOBER 31, 2016

Applies to:

Oracle Database - Enterprise Edition - Version 11.2.0.3 and later
Information in this document applies to any platform.

Symptoms

In a RAC environment, the disk group holding the OCR may intermittently fail to mount.
This happens only occasionally, and only after a server reboot.
After the node reboots, "crsctl stat res -t -init" shows the OCR as not starting, and the crsd log contains the following:

2014-05-28 16:51:02.787: [ OCRASM][4225455904]ASM Error Stack :
2014-05-28 16:51:02.827: [ OCRASM][4225455904]proprasmo: kgfoCheckMount returned [6]
2014-05-28 16:51:02.827: [ OCRASM][4225455904]proprasmo: The ASM disk group DATA is not found or not mounted
2014-05-28 16:51:02.828: [ OCRRAW][4225455904]proprioo: Failed to open [+DATA]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
2014-05-28 16:51:02.828: [ OCRRAW][4225455904]proprioo: No OCR/OLR devices are usable
2014-05-28 16:51:02.828: [ OCRASM][4225455904]proprasmcl: asmhandle is NULL
2014-05-28 16:51:02.828: [ OCRRAW][4225455904]proprinit: Could not open raw device
2014-05-28 16:51:02.828: [ OCRASM][4225455904]proprasmcl: asmhandle is NULL
2014-05-28 16:51:02.828: [ default][4225455904]a_init:7!: Backend init unsuccessful : [26]
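
As a rough aid for spotting this symptom, the following minimal sketch scans crsd log lines for the "is not found or not mounted" message quoted above and lists the affected disk groups. The function name and the sample list are hypothetical and only illustrate the idea; the pattern is taken from the excerpt above and may need adjusting for other versions.

```python
import re

# Pattern copied from the proprasmo message in the crsd log excerpt above.
_UNMOUNTED = re.compile(
    r"proprasmo: The ASM disk group (\S+) is not found or not mounted"
)


def unmounted_ocr_diskgroups(log_lines):
    """Return disk group names reported as not found or not mounted.

    Names are returned once each, in the order they first appear.
    (Hypothetical helper, not part of any Oracle tool.)
    """
    seen = []
    for line in log_lines:
        match = _UNMOUNTED.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Sample lines taken from the crsd log excerpt above.
sample = [
    "2014-05-28 16:51:02.827: [ OCRASM][4225455904]proprasmo: kgfoCheckMount returned [6]",
    "2014-05-28 16:51:02.827: [ OCRASM][4225455904]proprasmo: The ASM disk group DATA is not found or not mounted",
]
print(unmounted_ocr_diskgroups(sample))
```

In practice the lines would be read from the crsd log file rather than from a hard-coded list.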

The disks are visible on both nodes, and both nodes can read from and write to the disks. Oddly, if the CRS stack is stopped with
"crsctl stop crs -f" and then started again with "crsctl start crs", the error disappears and the entire CRS stack starts without problems.


If we check the ASM alert log for the DATA disk group holding the OCR, it sometimes shows that some disks are missing when CRS fails to start.

alert__ASM1.log
====================
Mon May 26 15:36:33 2014
SQL> ALTER DISKGROUP ALL MOUNT /* asm agent call crs *//* {0:0:2} */
NOTE: Diskgroups listed in ASM_DISKGROUPS are
FRA
NOTE: Diskgroup used for Voting files is:
DATA
Diskgroup with spfile:DATA
Diskgroup used for OCR is:DATA
NOTE: cache began mount (not first) of group DATA number=1 incarn=0x6a791f20
.....
NOTE: cache opening disk 0 of grp 1: DATA_0000 path:/dev/dm-28
NOTE: cache opening disk 1 of grp 1: DATA_0001 path:/dev/dm-22
NOTE: cache opening disk 3 of grp 1: DATA_0002 path:/dev/dm-25
NOTE: cache opening disk 23 of grp 1: DATA_001 path:/dev/dm-4
NOTE: cache opening disk 24 of grp 1: DATA_002 path:/dev/dm-3
NOTE: cache opening disk 25 of grp 1: DATA_003 path:/dev/dm-5
NOTE: cache mounting (not first) external redundancy group 1/0x6A791F20 (DATA)
...
SUCCESS: diskgroup DATA was mounted


The next startup shows that some disks are missing:

NOTE: Assigning number (1,0) to disk (/dev/eql/eqlstgp11p1)
NOTE: Assigning number (1,3) to disk (/dev/eql/eqlstgp13p1)
NOTE: Assigning number (1,1) to disk (/dev/eql/eqlstgp10p1)
NOTE: Assigning number (1,23) to disk (/dev/mapper/disk3)
NOTE: Assigning number (2,0) to disk (/dev/eql/eqlstgp12p1)
NOTE: Assigning number (2,2) to disk (/dev/mapper/disk1)
GMON querying group 1 at 3 for pid 23, osid 7651
NOTE: Assigning number (1,24) to disk () <<<<<<<<<<<<
NOTE: Assigning number (1,25) to disk () <<<<<<<<<<<<
...
ERROR: diskgroup DATA was not mounted
...
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "25" is missing from group number "1"
ORA-15042: ASM disk "24" is missing from group number "1"
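
The missing disks can also be picked out of the ASM alert log programmatically. The sketch below is only an illustration: it assumes that the pair in "Assigning number (g,d)" is (group number, disk number), which is consistent with the ORA-15042 lines above, and it treats an empty path "()" as a missing disk. The function name and the sample list are hypothetical.

```python
import re

# A disk that was assigned a number but has an empty path, as in the excerpt above.
_EMPTY_PATH = re.compile(r"NOTE: Assigning number \((\d+),(\d+)\) to disk \(\s*\)")
# ORA-15042 reports the disk number first and the group number second.
_ORA_15042 = re.compile(
    r'ORA-15042: ASM disk "(\d+)" is missing from group number "(\d+)"'
)


def missing_disks(log_lines):
    """Return sorted (group, disk) pairs reported with no path or as missing.

    (Hypothetical helper; assumes the (group, disk) ordering described above.)
    """
    found = set()
    for line in log_lines:
        match = _EMPTY_PATH.search(line)
        if match:
            found.add((int(match.group(1)), int(match.group(2))))
        match = _ORA_15042.search(line)
        if match:
            # Swap the order so the result is always (group, disk).
            found.add((int(match.group(2)), int(match.group(1))))
    return sorted(found)


# Sample lines taken from the ASM alert log excerpt above.
sample = [
    "NOTE: Assigning number (1,23) to disk (/dev/mapper/disk3)",
    "NOTE: Assigning number (1,24) to disk () <<<<<<<<<<<<",
    "NOTE: Assigning number (1,25) to disk () <<<<<<<<<<<<",
    'ORA-15042: ASM disk "25" is missing from group number "1"',
]
print(missing_disks(sample))
```

Disks that appear with a device path, such as /dev/mapper/disk3 in the sample, are not reported.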



Cause
