Disk Dropped from ASM DiskGroup but corresponding Physical Disk Online (Doc ID 1499412.1)

Last updated on MAY 13, 2014

Applies to:

Oracle Appliance Kit - Version 2.1.0.1 to 2.3.0.0 [Release 2.1 to 2.3]
Information in this document applies to any platform.

Symptoms

The ASM alert log indicates that the disk is being forcibly dropped:

WARNING: Disk 8 (HDD_E0_S08_982514475P1) in group 1 will be dropped in: (966) secs on ASM inst 2
Fri Oct 05 12:52:16 2012
WARNING: Disk 8 (HDD_E0_S08_982514475P1) in group 1 will be dropped in: (781) secs on ASM inst 2
Fri Oct 05 12:55:20 2012
WARNING: Disk 8 (HDD_E0_S08_982514475P1) in group 1 will be dropped in: (598) secs on ASM inst 2
Fri Oct 05 12:58:24 2012
WARNING: Disk 8 (HDD_E0_S08_982514475P1) in group 1 will be dropped in: (413) secs on ASM inst 2
Fri Oct 05 13:01:28 2012
WARNING: Disk 8 (HDD_E0_S08_982514475P1) in group 1 will be dropped in: (229) secs on ASM inst 2
Fri Oct 05 13:04:32 2012
WARNING: Disk 8 (HDD_E0_S08_982514475P1) in group 1 will be dropped in: (46) secs on ASM inst 2
Fri Oct 05 13:07:36 2012
WARNING: PST-initiated drop of 1 disk(s) in group 1(.789102847))
SQL> alter diskgroup DATA drop disk HDD_E0_S08_982514475P1 force /* ASM SERVER */
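
The countdown in these warnings can be extracted from the alert log with a short script. The following is a minimal sketch in Python, assuming the alert log text has already been read into a string. The function name `parse_drop_warnings` and the regular expression are illustrative and are not part of any Oracle tooling.

```python
import re

# Pattern for the countdown warning shown in the alert log excerpt above.
DROP_RE = re.compile(
    r"WARNING: Disk (\d+) \((\S+)\) in group (\d+) "
    r"will be dropped in: \((\d+)\) secs on ASM inst (\d+)"
)

def parse_drop_warnings(text):
    """Return (disk_number, disk_name, group, seconds_left, instance) tuples."""
    rows = []
    for m in DROP_RE.finditer(text):
        num, name, group, secs, inst = m.groups()
        rows.append((int(num), name, int(group), int(secs), int(inst)))
    return rows

# Two warnings copied from the excerpt above.
sample = """\
WARNING: Disk 8 (HDD_E0_S08_982514475P1) in group 1 will be dropped in: (966) secs on ASM inst 2
Fri Oct 05 12:52:16 2012
WARNING: Disk 8 (HDD_E0_S08_982514475P1) in group 1 will be dropped in: (781) secs on ASM inst 2
"""

for row in parse_drop_warnings(sample):
    print(row)
```

A decreasing seconds value for the same disk name means the drop timer is still running for that disk.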

 

v$asm_disk shows one partition of the disk as MISSING while the other partition of the same disk is still available:

HDD_E0_S01_977589867P2    /dev/mapper/HDD_E0_S01_977589867p2       MEMBER       CACHED
HDD_E0_S04_982478879P1    /dev/mapper/HDD_E0_S04_982478879p1       MEMBER       CACHED
HDD_E0_S04_982478879P2    /dev/mapper/HDD_E0_S04_982478879p2       MEMBER       CACHED
HDD_E0_S05_982491883P1    /dev/mapper/HDD_E0_S05_982491883p1       MEMBER       CACHED
HDD_E0_S05_982491883P2    /dev/mapper/HDD_E0_S05_982491883p2       MEMBER       CACHED
HDD_E0_S08_982514475P1                                             UNKNOWN      MISSING                      <=======
HDD_E0_S08_982514475P2    /dev/mapper/HDD_E0_S08_982514475p2       MEMBER       CACHED <=======
HDD_E0_S09_982490391P1    /dev/mapper/HDD_E0_S09_982490391p1       MEMBER       CACHED
HDD_E0_S09_982490391P2    /dev/mapper/HDD_E0_S09_982490391p2       MEMBER       CACHED
HDD_E0_S12_977569771P1    /dev/mapper/HDD_E0_S12_977569771p1       MEMBER       CACHED
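
Spotting a disk with one partition missing and its sibling still present can be automated. The following is a minimal sketch in Python, assuming a whitespace-separated listing like the one above, in which the first column is the ASM disk name ending in P1 or P2 and the last column is the status. The function name `find_split_disks` is hypothetical, and the column layout is an assumption based on this excerpt only.

```python
def find_split_disks(listing):
    """Return {disk: {"missing": [...], "present": [...]}} for disks that have
    at least one MISSING partition and at least one other partition."""
    by_disk = {}
    for line in listing.splitlines():
        # Drop the "<=======" annotation arrows used in the listing.
        fields = [f for f in line.split() if not f.startswith("<=")]
        if len(fields) < 3:
            continue
        name, status = fields[0], fields[-1]
        # Split "HDD_E0_S08_982514475P1" into the disk name and partition number.
        disk, _, part = name.rpartition("P")
        if not disk:
            continue
        bucket = "missing" if status == "MISSING" else "present"
        entry = by_disk.setdefault(disk, {"missing": [], "present": []})
        entry[bucket].append("P" + part)
    return {d: v for d, v in by_disk.items() if v["missing"] and v["present"]}

# Rows copied from the listing above.
sample = """\
HDD_E0_S05_982491883P1    /dev/mapper/HDD_E0_S05_982491883p1       MEMBER       CACHED
HDD_E0_S05_982491883P2    /dev/mapper/HDD_E0_S05_982491883p2       MEMBER       CACHED
HDD_E0_S08_982514475P1                                             UNKNOWN      MISSING                      <=======
HDD_E0_S08_982514475P2    /dev/mapper/HDD_E0_S08_982514475p2       MEMBER       CACHED <=======
"""

print(find_split_disks(sample))
```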

 

The messages file indicates I/O errors on the disk, with both paths marked as failed and then reinstated:

Oct  5 09:27:25 aa-odadb-zzz-01 kernel: sd 6:0:2:0: [sde] Unhandled sense code
Oct  5 09:27:25 aa-odadb-zzz-01 kernel: sd 6:0:2:0: [sde] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct  5 09:27:25 aa-odadb-zzz-01 kernel: sd 6:0:2:0: [sde] Sense Key : Medium Error [current]
Oct  5 09:27:25 aa-odadb-zzz-01 kernel: Info fld=0x633c3
Oct  5 09:27:25 aa-odadb-zzz-01 kernel: sd 6:0:2:0: [sde] Add. Sense: Unrecovered read error
Oct  5 09:27:25 aa-odadb-zzz-01 kernel: sd 6:0:2:0: [sde] CDB: Read(10): 28 00 00 06 33 c0 00 00 40 00
Oct  5 09:27:25 aa-odadb-zzz-01 kernel: end_request: I/O error, dev sde, sector 406467
Oct  5 09:27:25 aa-odadb-zzz-01 kernel: device-mapper: multipath: Failing path 8:64.
Oct  5 09:27:25 aa-odadb-zzz-01 multipathd: 8:64: mark as failed <=============
Oct  5 09:27:25 aa-odadb-zzz-01 multipathd: HDD_E0_S08_982514475: remaining active paths: 1
Oct  5 09:27:25 aa-odadb-zzz-01 multipathd: dm-6: add map (uevent)
Oct  5 09:27:25 aa-odadb-zzz-01 multipathd: dm-6: devmap already registered
Oct  5 09:27:27 aa-odadb-zzz-01 kernel: sd 7:0:15:0: [sdao] Unhandled sense code
Oct  5 09:27:27 aa-odadb-zzz-01 kernel: sd 7:0:15:0: [sdao] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct  5 09:27:27 aa-odadb-zzz-01 kernel: sd 7:0:15:0: [sdao] Sense Key : Medium Error [current]
Oct  5 09:27:27 aa-odadb-zzz-01 kernel: Info fld=0x633c3
Oct  5 09:27:27 aa-odadb-zzz-01 kernel: sd 7:0:15:0: [sdao] Add. Sense: Unrecovered read error
Oct  5 09:27:27 aa-odadb-zzz-01 kernel: sd 7:0:15:0: [sdao] CDB: Read(10): 28 00 00 06 33 c0 00 00 40 00
Oct  5 09:27:27 aa-odadb-zzz-01 kernel: end_request: I/O error, dev sdao, sector 406467
Oct  5 09:27:27 aa-odadb-zzz-01 kernel: device-mapper: multipath: Failing path 66:128.
Oct  5 09:27:27 aa-odadb-zzz-01 kernel: end_request: I/O error, dev dm-6, sector 406464
Oct  5 09:27:27 aa-odadb-zzz-01 multipathd: 66:128: mark as failed
Oct  5 09:27:27 aa-odadb-zzz-01 multipathd: HDD_E0_S08_982514475: remaining active paths: 0
Oct  5 09:27:27 aa-odadb-zzz-01 multipathd: dm-6: add map (uevent)
Oct  5 09:27:27 aa-odadb-zzz-01 multipathd: dm-6: devmap already registered
Oct  5 09:27:27 aa-odadb-zzz-01 kernel: end_request: I/O error, dev dm-6, sector 4096
Oct  5 09:27:29 aa-odadb-zzz-01 multipathd: HDD_E0_S08_982514475: sde - tur checker reports path is up
Oct  5 09:27:29 aa-odadb-zzz-01 multipathd: 8:64: reinstated <=============
Oct  5 09:27:29 aa-odadb-zzz-01 multipathd: HDD_E0_S08_982514475: remaining active paths: 1
Oct  5 09:27:29 aa-odadb-zzz-01 multipathd: dm-6: add map (uevent)
Oct  5 09:27:29 aa-odadb-zzz-01 multipathd: dm-6: devmap already registered
Oct  5 09:27:31 aa-odadb-zzz-01 multipathd: HDD_E0_S08_982514475: sdao - tur checker reports path is up
Oct  5 09:27:31 aa-odadb-zzz-01 multipathd: 66:128: reinstated
Oct  5 09:27:31 aa-odadb-zzz-01 multipathd: HDD_E0_S08_982514475: remaining active paths: 2 <=============
Oct  5 09:27:31 aa-odadb-zzz-01 multipathd: dm-6: add map (uevent)
Oct  5 09:27:31 aa-odadb-zzz-01 multipathd: dm-6: devmap already registered
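
The sequence of path counts in the excerpt above (1, 0, 1, 2) shows that the multipath device briefly had no active path. The following is a minimal sketch in Python for pulling that sequence out of the messages file, assuming the multipathd message format shown above. The function names `path_counts` and `maps_that_lost_all_paths` are illustrative only.

```python
import re

# Matches multipathd lines such as
# "multipathd: HDD_E0_S08_982514475: remaining active paths: 0"
PATHS_RE = re.compile(r"multipathd: (\S+): remaining active paths: (\d+)")

def path_counts(log_text):
    """Return {map_name: [active path counts in log order]}."""
    counts = {}
    for m in PATHS_RE.finditer(log_text):
        counts.setdefault(m.group(1), []).append(int(m.group(2)))
    return counts

def maps_that_lost_all_paths(log_text):
    """Return the sorted names of multipath maps that reported 0 active paths."""
    return sorted(name for name, seq in path_counts(log_text).items() if 0 in seq)

# Lines copied from the messages excerpt above.
sample = """\
Oct  5 09:27:25 aa-odadb-zzz-01 multipathd: 8:64: mark as failed
Oct  5 09:27:25 aa-odadb-zzz-01 multipathd: HDD_E0_S08_982514475: remaining active paths: 1
Oct  5 09:27:27 aa-odadb-zzz-01 multipathd: 66:128: mark as failed
Oct  5 09:27:27 aa-odadb-zzz-01 multipathd: HDD_E0_S08_982514475: remaining active paths: 0
Oct  5 09:27:29 aa-odadb-zzz-01 multipathd: 8:64: reinstated
Oct  5 09:27:29 aa-odadb-zzz-01 multipathd: HDD_E0_S08_982514475: remaining active paths: 1
Oct  5 09:27:31 aa-odadb-zzz-01 multipathd: 66:128: reinstated
Oct  5 09:27:31 aa-odadb-zzz-01 multipathd: HDD_E0_S08_982514475: remaining active paths: 2
"""

print(path_counts(sample))
print(maps_that_lost_all_paths(sample))
```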

 

The ODA sundiag output shows the status of the disk as Good and Online

Disk: pd_08
ActionTimeout   : 600            
ActivePath      : /dev/sdao      
AsmDiskList     : |data_08||reco_08|
AutoDiscovery   : 1              
AutoDiscoveryHi : |data:43:HDD||reco:57:HDD||redo:100
CheckInterval   : 100            
ColNum          : 0              
DiskId          : 35000c5003a8ffb2b
DiskType        : HDD            
Enabled         : 0              
ExpNum          : 0              
IState          : 0              
Initialized     : 0              
MonitorFlag     : 0              
MultiPathList   : |/dev/sdao||/dev/sde|
Name            : pd_08          
NewPartAddr     : 0              
PrevState       : 0              
PrevUsrDevName  :              
SectorSize      : 512            
SerialNum       : 001120E12Y8R  
Size            : 600127266304  
SlotNum         : 8              
State           : Online        
StateChangeTs   : 1345188420    
StateDetails    : Good          
TotalSectors    : 1172123567    
TypeName        : 0              
UsrDevName      : HDD_E0_S08_982514475
gid             : 0              
mode            : 660            
uid             : 0

 
The multipath status of the disk (# multipath -l) is shown as active

HDD_E0_S08_982514475 (35000c5003a8ffb2b) dm-6 SEAGATE,ST360057SSUN600G
size=559G features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=-1 status=enabled
| `- 6:0:2:0  sde  8:64   active undef running
`-+- policy='round-robin 0' prio=-1 status=active
 `- 7:0:15:0 sdao 66:128 active undef running

 

The SMART Health Status (# smartctl -a /dev/sdao) of the disk is shown as OK

Device: SEAGATE  ST360057SSUN600G Version: 0B25
Serial number: 001120E12Y8R        6SL12Y8R
Device type: disk
Transport protocol: SAS
Local Time is: Wed Oct 10 08:06:58 2012 CEST
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK

Current Drive Temperature:     32 C
Drive Trip Temperature:        68 C
Manufactured in week 20 of year 2011
Recommended maximum start stop count:  10000 times
Current start stop count:      30 times
Elements in grown defect list: 0
Vendor (Seagate) cache information
 Blocks sent to initiator = 317041797
 Blocks received from initiator = 3732378198
 Blocks read from cache and sent to initiator = 269932414
 Number of read and write commands whose size <= segment size = 11370797
 Number of read and write commands whose size > segment size = 0
Vendor (Seagate/Hitachi) factory information
 number of hours powered up = 1856.20
 number of minutes until next internal SMART test = 58

Error counter log:
          Errors Corrected by           Total   Correction     Gigabytes    Total
              ECC          rereads/    errors   algorithm      processed    uncorrected
          fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:    8752552        0         0   8752552    8752556        162.325           4
write:         0        0         0         0          0       1914.656           0

Non-medium error count:        0
No self-tests have been logged
Long (extended) Self Test duration: 6400 seconds [106.7 minutes]
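
In the error counter log above, the read row reports 4 in the total uncorrected errors column even though the SMART Health Status is OK. The following is a minimal sketch in Python for extracting that column, assuming the SCSI error counter log layout shown above (a row label followed by seven numeric columns). The function name `uncorrected_errors` is hypothetical.

```python
def uncorrected_errors(smartctl_output):
    """Return {"read": n, "write": n, ...} from the last column of the
    error counter log rows in smartctl -a output for a SAS disk."""
    result = {}
    for line in smartctl_output.splitlines():
        parts = line.split()
        # A data row is the label plus seven numeric columns.
        if len(parts) == 8 and parts[0] in ("read:", "write:", "verify:"):
            result[parts[0].rstrip(":")] = int(parts[-1])
    return result

# Rows copied from the smartctl output above.
sample = """\
Error counter log:
read:    8752552        0         0   8752552    8752556        162.325           4
write:         0        0         0         0          0       1914.656           0

Non-medium error count:        0
"""

print(uncorrected_errors(sample))
```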

Cause
