ODA (Oracle Database Appliance) Different Disks Randomly Disappear After a Reboot
(Doc ID 1420126.1)
Last updated on AUGUST 02, 2022
Applies to:
Oracle Database Appliance - Version All Versions to All Versions [Release All Releases]
Oracle Database Appliance X4-2 - Version All Versions to All Versions [Release All Releases]
Oracle Database Appliance X3-2 - Version All Versions to All Versions [Release All Releases]
Oracle Database Appliance Software - Version 2.1.0.2 to 12.1.2.9 [Release 2.1 to 12.1]
Information in this document applies to any platform.
Symptoms
<Internal_Only> <strong>FOR SUPPORT ONLY - there are many flavors of this problem across several ODA versions. Please review the INTERNAL information at the bottom of this note for further discussion and information.</strong> </Internal_Only>
Because disks go missing at random, ASM disks or disk groups do not come up:
- ODA and ASM fail to identify disks during startup
- Different disks randomly show a status of failed, predictive failure, or missing
- The problem can prevent an entire node from starting up
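The distinguishing symptom is that the set of missing disks changes from one boot to the next. A minimal Python sketch of that check follows. The function names are hypothetical helpers, not part of any ODA tooling, and the sample `pd_` names are taken from the example later in this note.

```python
def compare_missing(before, after):
    """Compare the disks reported missing on two consecutive boots."""
    before, after = set(before), set(after)
    return {
        "still_missing": sorted(before & after),
        "recovered": sorted(before - after),
        "newly_missing": sorted(after - before),
    }


def looks_random(before, after):
    """True when disks are missing on both boots but the sets differ."""
    return bool(before) and bool(after) and set(before) != set(after)


# Sample values from the example in this note (hypothetical variable names).
boot1 = ["pd_04", "pd_11", "pd_13", "pd_19"]
boot2 = ["pd_16", "pd_17", "pd_21", "pd_22"]

report = compare_missing(boot1, boot2)
print(report)
print(looks_random(boot1, boot2))
```

If the same disks were missing on every boot, a genuine disk failure would be the more likely explanation than the behavior described in this note.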
NOTE: This problem was originally associated with an issue flagged in 2.1.0.0 and fixed in 2.1.0.3.1.
However, the symptoms are more generic and can occur on almost any ODA version.
This does not mean you are hitting the same bug that was fixed in 2.1.0.3.1.
It does mean that the same or similar symptoms can be addressed with the same corrective actions on almost all versions.
Example:
For 2.1.0, the symptoms were as follows:
The problematic node was rebooted several times and came back up with different disks missing each time:
DATA dg - missing disks
---------------
/dev/mapper/HDD_E1_S19_993871319p1
/dev/mapper/HDD_E1_S11_1196820151p1
/dev/mapper/HDD_E0_S13_1196881379p1
/dev/mapper/HDD_E0_S04_1196963151p1
RECO dg - missing disks
---------------------
/dev/mapper/HDD_E1_S19_993871319p2
/dev/mapper/HDD_E1_S11_1196820151p2
/dev/mapper/HDD_E0_S13_1196881379p2
/dev/mapper/HDD_E0_S04_1196963151p2
Missing disks before a reboot included pd_04; pd_11; pd_13; pd_19 (RECO and DATA)
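The device paths above can be reduced to slot numbers and matched against the `pd_` names. The sketch below assumes the path layout is `HDD_E<n>_S<slot>_<id>p<partition>` and that slot `S19` corresponds to `pd_19`. Both are inferred from this example only, and the key name `enclosure` for the `E` field is a guess. `parse_device` is a hypothetical helper.

```python
import re

# Assumed layout, inferred from the paths in this note:
#   /dev/mapper/HDD_E<n>_S<slot>_<id>p<partition>
DEVICE_RE = re.compile(r"^/dev/mapper/HDD_E(\d+)_S(\d+)_(\d+)p(\d+)$")


def parse_device(path):
    """Split a multipath device name into its fields, or return None."""
    m = DEVICE_RE.match(path.strip())
    if m is None:
        return None
    enclosure, slot, disk_id, partition = m.groups()
    return {
        "enclosure": int(enclosure),   # meaning of the E field is assumed
        "slot": int(slot),
        "disk_id": disk_id,
        "partition": int(partition),
        "pd_name": "pd_%02d" % int(slot),  # assumed slot-to-pd mapping
    }


missing_data = [
    "/dev/mapper/HDD_E1_S19_993871319p1",
    "/dev/mapper/HDD_E1_S11_1196820151p1",
    "/dev/mapper/HDD_E0_S13_1196881379p1",
    "/dev/mapper/HDD_E0_S04_1196963151p1",
]

pd_names = sorted(parse_device(p)["pd_name"] for p in missing_data)
print(pd_names)
```

For the DATA paths listed above, this yields pd_04, pd_11, pd_13, and pd_19, which matches the disks named in the text. In this example the `p1` partitions belong to DATA and the `p2` partitions to RECO.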
However, after rebooting the node, a different set of disks is missing:
ASMCMD> mount all
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "21" is missing from group number "3"
ORA-15042: ASM disk "20" is missing from group number "3"
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "17" is missing from group number "2"
ORA-15042: ASM disk "16" is missing from group number "2"
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "17" is missing from group number "1"
ORA-15042: ASM disk "16" is missing from group number "1" (DBD ERROR: OCIStmtExecute)
Missing disks after the reboot included pd_21; pd_22 (REDO); and pd_16; pd_17 (RECO and DATA)
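When the same check is repeated over several reboots, it can help to summarize the ORA-15042 lines by disk group. The following Python sketch does that for captured `mount all` output. `missing_by_group` is a hypothetical helper, and the regular expression assumes the message wording shown above.

```python
import re
from collections import defaultdict

# Matches the ORA-15042 wording shown in this note.
MISSING_RE = re.compile(
    r'ORA-15042: ASM disk "(\d+)" is missing from group number "(\d+)"'
)


def missing_by_group(output):
    """Map each disk group number to the sorted ASM disk numbers missing from it."""
    groups = defaultdict(set)
    for disk, group in MISSING_RE.findall(output):
        groups[int(group)].add(int(disk))
    return {g: sorted(disks) for g, disks in sorted(groups.items())}


sample = '''ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "21" is missing from group number "3"
ORA-15042: ASM disk "20" is missing from group number "3"
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "17" is missing from group number "2"
ORA-15042: ASM disk "16" is missing from group number "2"
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "17" is missing from group number "1"
ORA-15042: ASM disk "16" is missing from group number "1"
'''

summary = missing_by_group(sample)
print(summary)  # {1: [16, 17], 2: [16, 17], 3: [20, 21]}
```

Saving one such summary per reboot makes it easy to see whether the missing disks change between boots.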
Changes
This problem can occur after:
- Losing one ASM disk
- Replacing a disk
- Rebooting one server
Cause