Exadata Database Or ASM Instance Hung On IO Request When Single Cell Is Having Problem (Doc ID 2002084.1)

Last updated on DECEMBER 18, 2015

Applies to:

Oracle Exadata Storage Server Software - Version 11.2.3.2.1 to 12.1.2.1.0 [Release 11.2 to 12.1]
Information in this document applies to any platform.

Symptoms

On Exadata, I/O-related waits cause the Database or ASM instance to hang or perform very slowly.

The problem lies with a single cell server: either all of its disks are offline, or the cellsrv process is hung, causing I/O to hang in the disk subsystem.

The cluster logs show typical errors such as the following:

Diskmon.log:

----------------

2015-02-26 14:49:26.033509: DISKMON:4119796032: dskm_bcast_oss6: oss_wait for device o/192.168.*.*(inc 0) did not complete in 5000 msec
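To find out which cell the timeouts point to, the diskmon log can be scanned for lines like the one above. The following is a minimal sketch, assuming the line format matches the sample shown here; the regular expression and the function name `parse_diskmon_timeouts` are illustrative and not part of any Oracle tool, and the exact message format may differ between releases.

```python
import re

# Pattern modeled on the single sample line in this note (an assumption);
# adjust it if your diskmon.log uses a different layout.
DISKMON_TIMEOUT = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): DISKMON:\d+: "
    r"dskm_bcast_oss6: oss_wait for device (?P<device>\S+?)\(inc (?P<inc>\d+)\) "
    r"did not complete in (?P<msec>\d+) msec"
)

def parse_diskmon_timeouts(lines):
    """Return (timestamp, device, timeout_msec) for each oss_wait timeout line."""
    hits = []
    for line in lines:
        m = DISKMON_TIMEOUT.match(line.strip())
        if m:
            hits.append((m.group("ts"), m.group("device"), int(m.group("msec"))))
    return hits

# Sample taken from the excerpt above (IP address masked as in the note).
sample = [
    "2015-02-26 14:49:26.033509: DISKMON:4119796032: dskm_bcast_oss6: "
    "oss_wait for device o/192.168.*.*(inc 0) did not complete in 5000 msec",
    "some unrelated diskmon line",
]

for ts, device, msec in parse_diskmon_timeouts(sample):
    print(ts, device, msec)
```

If every hit names the same device address, that is consistent with the single-cell problem described in this note.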

 

LGWR background trace files:

File *_lgwr_<>.trc

----------------------

First ossnet_wait_all timeout at :

***

ORA-27626: Exadata Error :2201 (IO cancelled due to slow/hung disk)

Exadata error : 'IO cancelled due to slow/hung disk'

IO elapsed time : 213596 usec Time waited on I/O: 213596 usec

***

ossnet_wait_all:WAITED TOO LONG for network request completion :60001

init_timeout:4294967295 remaining_timeout:4294967295

Dumping State information

--

As shown above, the timeout was preceded by an I/O cancel error from the cell, which indicates a problem with the disk subsystem.

After the dump, LGWR tries to reconnect, and the trace file shows:
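The ordering described here (I/O cancel first, network timeout second) can be checked mechanically in an LGWR trace. The sketch below is an illustration only: the function name `cancel_precedes_timeout` is hypothetical, and the two marker strings are copied from the trace excerpt in this note, so they are assumed to appear verbatim in your trace files.

```python
def cancel_precedes_timeout(trace_lines):
    """Return True if an ORA-27626 I/O cancel appears before the first
    'WAITED TOO LONG' network timeout in the given trace lines."""
    saw_cancel = False
    for line in trace_lines:
        if "ORA-27626" in line:
            saw_cancel = True
        elif "ossnet_wait_all:WAITED TOO LONG" in line:
            # Decide at the first timeout: was a cancel already seen?
            return saw_cancel
    # No timeout found at all, so the signature is not present.
    return False

# Lines copied from the trace excerpt in this note.
trace = [
    "ORA-27626: Exadata Error :2201 (IO cancelled due to slow/hung disk)",
    "IO elapsed time : 213596 usec Time waited on I/O: 213596 usec",
    "ossnet_wait_all:WAITED TOO LONG for network request completion :60001",
]

print(cancel_precedes_timeout(trace))
```

A `True` result only says the trace matches the pattern shown in this note; it does not by itself identify which cell is at fault.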

connect:sosstcread failed- 1923.168.*.*, return bytes: 0

OS system dependent operation:unexpected_size3 failed with status:115

OS failure message :Operation now in progress

failure occured at: sosstcpredt

connect retry:sleeping for 2 seconds, connect 2 out of maximum 7 attempts

 

--
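The reconnect attempts in the excerpt above can be extracted to see how far LGWR got through its retry budget. This is a rough sketch, assuming the retry message has exactly the wording shown in this note; the pattern and the name `parse_connect_retries` are illustrative and not taken from any Oracle utility.

```python
import re

# Pattern modeled on the retry line quoted in this note (an assumption).
CONNECT_RETRY = re.compile(
    r"connect retry:sleeping for (?P<sleep>\d+) seconds, "
    r"connect (?P<attempt>\d+) out of maximum (?P<max>\d+) attempts"
)

def parse_connect_retries(lines):
    """Return (sleep_seconds, attempt, max_attempts) for each retry line."""
    retries = []
    for line in lines:
        m = CONNECT_RETRY.search(line)
        if m:
            retries.append(
                (int(m.group("sleep")), int(m.group("attempt")), int(m.group("max")))
            )
    return retries

# Lines copied from the trace excerpt in this note.
excerpt = [
    "OS failure message :Operation now in progress",
    "connect retry:sleeping for 2 seconds, connect 2 out of maximum 7 attempts",
]

for sleep_s, attempt, max_attempts in parse_connect_retries(excerpt):
    print(f"attempt {attempt}/{max_attempts}, sleeping {sleep_s}s")
```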

It is also possible that the system disk on the cell server was read-only.

Cause
