Exadata cell has I/O errors and resync of some md devices never completes (Doc ID 1328727.1)

Last updated on MAY 06, 2015

Applies to:

Oracle Exadata Storage Server Software - Version to [Release 11.2]
Information in this document applies to any platform.


This issue is only known to affect storage cells running old Exadata Storage Server Software versions (prior to ). The information presented here may, however, apply in part to issues which do not share the same cause.

The /var/log/messages may be reporting recurring I/O errors similar to:

scsi 0:2:0:0: rejecting I/O to dead device
scsi 0:2:0:0: rejecting I/O to dead device
raid1: sda: unrecoverable I/O read error for block 128
scsi 0:2:0:0: rejecting I/O to dead device
raid1: sda: unrecoverable I/O read error for block 256
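
To gauge how often these messages recur, the log can be summarized per device. The following is a minimal sketch, not an Oracle-supplied tool: the function name `summarize_io_errors` and the two regular expressions are assumptions derived only from the sample lines above, and real messages may carry extra prefixes (timestamp, host name) that the patterns deliberately ignore.

```python
import re

# Patterns modeled on the sample messages shown above (an assumption;
# SCSI addresses and disk names will differ from system to system).
DEAD_DEVICE = re.compile(r"scsi (\d+:\d+:\d+:\d+): rejecting I/O to dead device")
READ_ERROR = re.compile(r"raid1: (\w+): unrecoverable I/O read error for block (\d+)")


def summarize_io_errors(lines):
    """Count dead-device rejections per SCSI address and read errors per disk."""
    dead = {}
    read_errors = {}
    for line in lines:
        m = DEAD_DEVICE.search(line)
        if m:
            dead[m.group(1)] = dead.get(m.group(1), 0) + 1
            continue
        m = READ_ERROR.search(line)
        if m:
            read_errors[m.group(1)] = read_errors.get(m.group(1), 0) + 1
    return dead, read_errors


# Sample input copied from the excerpt above.
SAMPLE_LOG = """\
scsi 0:2:0:0: rejecting I/O to dead device
scsi 0:2:0:0: rejecting I/O to dead device
raid1: sda: unrecoverable I/O read error for block 128
scsi 0:2:0:0: rejecting I/O to dead device
raid1: sda: unrecoverable I/O read error for block 256
"""

dead, read_errors = summarize_io_errors(SAMPLE_LOG.splitlines())
print("dead-device rejections:", dead)
print("unrecoverable read errors:", read_errors)
```

On a cell, the same function could be fed the lines of /var/log/messages instead of the inline sample; reading that file usually requires root privileges.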

The /proc/mdstat dynamic file, which gives an overview of the status of the RAID 1 arrays managed by the Multiple Device (MD) kernel driver, will exhibit the following characteristics. Refer to the numbered entries (marked with >>> and <<<), which are explained further below.

Personalities : [raid1]
1.>>> md4 : active raid1 sdad1[2] sda1[3](F) sdb1[1] <<<
120384 blocks [2/1] [_U]
2.>>>>> resync=DELAYED <<<

md5 : active raid1 sdad5[2] sda5[0] sdb5[3](F)
10482304 blocks [2/1] [U_]

3.>>> md6 : active raid1 sdae6[1] sdad6[0] sda6[2](F) sdb6[3](F)<<<
10482304 blocks [2/2] [UU]

md7 : active raid1 sdad7[2] sda7[3](F) sdb7[1]
2096384 blocks [2/1] [_U]
4.>>> [>....................] recovery = 0.0% (192/2096384) finish=174.6min speed=192K/sec  <<<

md8 : active raid1 sdae8[1] sdad8[0] sda8[2](F) sdb8[3](F)
2096384 blocks [2/2] [UU]

md1 : active raid1 sdad10[2] sda10[0] sdb10[3](F)
714752 blocks [2/1] [U_]

md11 : active raid1 sdae11[1] sdad11[0] sda11[2](F) sdb11[3](F)
2433728 blocks [2/2] [UU]

md2 : active raid1 sdad9[2] sda9[0] sdb9[3](F)
2096384 blocks [2/1] [U_]

unused devices: <none>

1. All existing md devices are made up of 3 or 4 member disks, such as sda, sdb, sdad, sdae. Having certain md devices with 4 member disks is a requirement for matching this issue.
2. For certain devices, there is a "resync=DELAYED" flag, indicating a degraded or faulty array.
3. The devices made up of 4 member disks do not have the "resync=DELAYED" flag above. Instead, "[2/2] [UU]" is present, indicating a healthy, clean, synced device.
4. An attempted recovery (resync) is ongoing for one of the devices. However, both the speed and the completed percentage are very low, and they remain constant over time.
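
The four characteristics above can be checked mechanically against a snapshot of /proc/mdstat. The sketch below is illustrative only: `parse_mdstat` and `matches_symptoms` are hypothetical helper names, the regular expressions are modeled on the excerpt in this note (with the numbered >>> <<< annotations removed), and other mdstat layouts may need adjustments.

```python
import re

ARRAY_RE = re.compile(r"^(md\d+)\s*:\s*active\s+(\S+)\s+(.*)$")
MEMBER_RE = re.compile(r"(\w+)\[(\d+)\](\(F\))?")       # e.g. sda1[3](F)
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")  # e.g. [2/1] [_U]


def parse_mdstat(text):
    """Return one dict per md array found in the given mdstat text."""
    arrays = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = ARRAY_RE.match(line)
        if m:
            members = MEMBER_RE.findall(m.group(3))
            current = {
                "name": m.group(1),
                "members": [name for name, _, _ in members],
                "failed": [name for name, _, flag in members if flag],
                "state": None,
                "resync_delayed": False,
                "recovering": False,
            }
            arrays.append(current)
        elif current is not None:
            s = STATUS_RE.search(line)
            if s:
                current["state"] = s.group(3)
            if "resync=DELAYED" in line:
                current["resync_delayed"] = True
            if "recovery =" in line or "resync =" in line:
                current["recovering"] = True
    return arrays


def matches_symptoms(arrays):
    """Check characteristics 1 to 4 against a single snapshot."""
    four_member = [a for a in arrays if len(a["members"]) == 4]
    return (
        all(len(a["members"]) in (3, 4) for a in arrays)           # 1
        and bool(four_member)                                      # 1
        and any(a["resync_delayed"] for a in arrays)               # 2
        and all(a["state"] == "UU" and not a["resync_delayed"]
                for a in four_member)                              # 3
        and any(a["recovering"] for a in arrays)                   # 4
    )


# Abridged copy of the excerpt above, annotations removed.
SAMPLE = """\
Personalities : [raid1]
md4 : active raid1 sdad1[2] sda1[3](F) sdb1[1]
      120384 blocks [2/1] [_U]
        resync=DELAYED

md6 : active raid1 sdae6[1] sdad6[0] sda6[2](F) sdb6[3](F)
      10482304 blocks [2/2] [UU]

md7 : active raid1 sdad7[2] sda7[3](F) sdb7[1]
      2096384 blocks [2/1] [_U]
      [>....................]  recovery =  0.0% (192/2096384) finish=174.6min speed=192K/sec

unused devices: <none>
"""

arrays = parse_mdstat(SAMPLE)
for a in arrays:
    print(a["name"], a["members"], "failed:", a["failed"], "state:", a["state"])
print("matches symptom pattern:", matches_symptoms(arrays))
```

A single snapshot cannot show that the recovery percentage stays constant over time; confirming characteristic 4 would require comparing two or more snapshots taken some minutes apart.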


Both system disks have been replaced, either in close succession or some time apart.

