Exadata Storage/Cell Node Hung and Rebooted Due to Temporary IO Stall Caused by Drive Medium Errors (Doc ID 1390664.1)

Last updated on AUGUST 21, 2016

Applies to:

Exadata Database Machine V2 - Version All Versions to All Versions [Release All Releases]
Information in this document applies to any platform.
***Checked for relevance on 15-Oct-2013***

Symptoms

The Exadata storage (cell) node hangs completely because of a temporary I/O stall caused by drive medium errors, and the system is then rebooted by a forced power cycle.

The command "cellcli -e list alerthistory" lists the following alert:

63 2011-12-29T01:42:10-02:00 info "IO hang detected on CD_09_cell06. Power cycle forced."
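
As a rough illustration, saved output of the command above could be scanned for this alert with a short script. This is only a sketch: the function name `find_io_hang_alerts` and the regular expression are hypothetical, and the four-column layout (name, time, severity, quoted message) is assumed from the single sample line shown here.

```python
import re

# Assumed layout, taken from the one sample line in this note:
#   <name> <ISO timestamp> <severity> "<message>"
ALERT_RE = re.compile(
    r'^\s*(?P<name>\S+)\s+(?P<time>\S+)\s+(?P<severity>\S+)\s+"(?P<message>.*)"\s*$'
)

def find_io_hang_alerts(alerthistory_text):
    """Return alert rows whose message reports an IO hang (hypothetical helper)."""
    hits = []
    for line in alerthistory_text.splitlines():
        match = ALERT_RE.match(line)
        if match and "IO hang detected" in match.group("message"):
            hits.append(match.groupdict())
    return hits

# Sample text copied from the alert shown above.
sample = '63 2011-12-29T01:42:10-02:00 info "IO hang detected on CD_09_cell06. Power cycle forced."'

for alert in find_io_hang_alerts(sample):
    print(alert["name"], alert["time"], alert["message"])
```

The script works on text that has already been captured, so it makes no assumptions about how cellcli is invoked beyond the command quoted above.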


The output of "ipmitool sel list" contains the following event (with the message "OEM record c0") at the time of the failure:

214 | 12/29/2011 | 01:42:10 | OEM record c0 | 004301 | 97cd9c999865
215 | 12/29/2011 | 01:42:11 | System Boot Initiated | System Restart | Asserted
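
The pattern in these two records, an "OEM record c0" entry followed almost immediately by "System Boot Initiated", can be searched for in saved SEL output. The following is a minimal sketch, not a supported tool: the helper names `parse_sel` and `find_forced_restarts` and the 60-second window are assumptions made for illustration, and the pipe-separated layout is taken from the two lines above.

```python
from datetime import datetime, timedelta

def parse_sel(text):
    """Parse pipe-separated SEL lines into (record_id, timestamp, event) tuples."""
    events = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 4:
            continue  # not a SEL record line
        try:
            when = datetime.strptime(fields[1] + " " + fields[2], "%m/%d/%Y %H:%M:%S")
        except ValueError:
            continue  # skip records without a parsable date and time
        events.append((fields[0], when, fields[3]))
    return events

def find_forced_restarts(events, window=timedelta(seconds=60)):
    """Find 'OEM record c0' entries directly followed by a system boot.

    The 60-second window is an arbitrary assumption, not a documented value.
    """
    pairs = []
    for prev, cur in zip(events, events[1:]):
        gap = cur[1] - prev[1]
        if (prev[2] == "OEM record c0"
                and cur[2] == "System Boot Initiated"
                and timedelta(0) <= gap <= window):
            pairs.append((prev, cur))
    return pairs

# Sample text copied from the SEL records shown above.
SAMPLE_SEL = """\
214 | 12/29/2011 | 01:42:10 | OEM record c0 | 004301 | 97cd9c999865
215 | 12/29/2011 | 01:42:11 | System Boot Initiated | System Restart | Asserted
"""

for oem, boot in find_forced_restarts(parse_sel(SAMPLE_SEL)):
    print("OEM record", oem[0], "at", oem[1], "followed by boot record", boot[0])
```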


The file "$CELLTRACE/ms-odl.trc" shows the following messages (timestamped after the cell node reboot):

[2011-12-29T09:50:04.539-02:00] [ossmgmt] [NOTIFICATION] [] [common.hwadapter.HardwareImpl] [tid: 15] [ecid: 180.128.211.98:90252:1325159256252:5,0] Adding alert: time: 1325130130000 msg: OEM Record :: IO hang detected. Force Power cycle. Detail: 43 0 - 97 cd - 9c 99 - 98 65
[2011-12-29T09:50:04.549-02:00] [ossmgmt] [NOTIFICATION] [] [ms.hwadapter.MSHardwareImpl] [tid: 15] [ecid: 180.128.211.98:90252:1325159256252:5,0] IO hang detected on CD_09_cell06. Power cycle forced. Detail: 43 0 - 97 cd - 9c 99 - 98 65
[2011-12-29T09:50:04.550-02:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSAlertHistory] [tid: 15] [ecid: 180.128.211.98:90252:1325159256252:5,0] AlertHistory 63 created. Severity: info. Message: IO hang detected on CD_09_dm02cel13. Power cycle forced.
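
Although these trace lines were written after the reboot, the "time:" value in the first line appears to be a Unix epoch timestamp in milliseconds that points back to the moment of the hang. The sketch below checks this under two assumptions: that the value is epoch milliseconds, and that the cell uses the -02:00 offset seen in the trace timestamps.

```python
from datetime import datetime, timedelta, timezone

# "time:" value from the first trace line; assumed to be epoch milliseconds.
alert_epoch_ms = 1325130130000

# UTC offset taken from the trace timestamps (-02:00).
cell_tz = timezone(timedelta(hours=-2))

hang_time = datetime.fromtimestamp(alert_epoch_ms / 1000, tz=cell_tz)
print(hang_time.isoformat())  # 2011-12-29T01:42:10-02:00

# Timestamp of the first trace line, i.e. when the alert was recorded after the reboot.
logged_at = datetime(2011, 12, 29, 9, 50, 4, 539000, tzinfo=cell_tz)
print("Recorded", logged_at - hang_time, "after the hang")
```

Interpreted this way, the value matches the 01:42:10 time of the alert history entry and of the "OEM record c0" event in the SEL, while the trace entry itself was written roughly eight hours later.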



Cause
