Exadata: Clusterware CSSD may report voting file I/O failures and reboot the DB nodes while the storage cells are performing MD resync checks (Doc ID 1966103.1)

Last updated on JUNE 02, 2015

Applies to:

Oracle Database - Enterprise Edition - Version 11.2.0.3 to 11.2.0.4 [Release 11.2]
Information in this document applies to any platform.

Symptoms

Clusterware CSSD may experience voting file I/O failures and reboot the database nodes while the weekly MD resync check is running on the storage cells.

<GRID_HOME>/log/<hostname>/cssd/ocssd.log

2014-12-21 04:27:22.540: [ CSSD][314857792]clssnmvGetDiskHandle: Unable to open disk o/192.168.10.148;192.168.10.149/DATA_DG_CD_00_exacel02 <<<
2014-12-21 04:27:22.540: [ CSSD][314857792]clssnmvDiskOpen:failed to open o/192.168.10.148;192.168.10.149/DATA_DG_CD_00_exacel02

2014-12-21 04:27:23.548: [ CSSD][300665152]clssnmvGetDiskHandle: Unable to open disk o/192.168.10.154;192.168.10.155/DATA_DG_CD_00_exacel05 <<<
2014-12-21 04:27:23.548: [ CSSD][300665152]clssnmvDiskOpen:failed to open o/192.168.10.154;192.168.10.155/DATA_DG_CD_00_exacel05

2014-12-21 04:27:23.654: [ CSSD][319793472]clssnmvGetDiskHandle: Unable to open disk o/192.168.10.146;192.168.10.147/DATA_DG_CD_00_exacel01 <<<
2014-12-21 04:27:23.654: [ CSSD][319793472]clssnmvDiskOpen:failed to open o/192.168.10.146;192.168.10.147/DATA_DG_CD_00_exacel01

2014-12-21 04:27:26.541: [ CSSD][295934272](:CSSNM00018:)clssnmvDiskCheck: Aborting, 2 of 5 configured voting disks available, need 3 <<<

2014-12-21 04:27:26.541: [ CSSD][295934272]###################################
2014-12-21 04:27:26.541: [ CSSD][295934272]clssscExit: CSSD aborting from thread clssnmvDiskPingMonitorThread <<<<<
2014-12-21 04:27:26.541: [ CSSD][295934272]###################################

2014-12-21 04:27:26.541: [ CSSD][295934272](:CSSSC00012:)clssscExit: A fatal error occurred and the CSS daemon is terminating abnormally
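
The excerpt above can be scanned mechanically. The following is a minimal Python sketch, not an Oracle-supplied tool: the function names `summarize_cssd` and `majority` are hypothetical, and the regular expressions are modelled only on the log lines shown here, so other Clusterware versions may need different patterns. It collects the voting files that CSSD failed to open and the counts from the abort message. The majority check reflects that CSSD needs access to more than half of the configured voting files, which matches "2 of 5 ... need 3".

```python
import re

# Patterns modelled on the ocssd.log excerpt in this note (assumption:
# other releases may format these messages differently).
OPEN_FAIL = re.compile(r"clssnmvDiskOpen:\s*failed to open (\S+)")
ABORT = re.compile(
    r"clssnmvDiskCheck: Aborting, (\d+) of (\d+) "
    r"configured voting disks available, need (\d+)"
)


def summarize_cssd(lines):
    """Return (failed_disks, abort).

    failed_disks: voting file paths CSSD could not open, in first-seen order.
    abort: (available, configured, needed) from the last abort message, or None.
    """
    failed = []
    abort = None
    for line in lines:
        m = OPEN_FAIL.search(line)
        if m and m.group(1) not in failed:
            failed.append(m.group(1))
        m = ABORT.search(line)
        if m:
            abort = tuple(int(g) for g in m.groups())
    return failed, abort


def majority(configured):
    """Smallest number of voting files that is more than half of `configured`."""
    return configured // 2 + 1


# Lines copied from the excerpt above.
sample = [
    "2014-12-21 04:27:22.540: [ CSSD][314857792]clssnmvDiskOpen:failed to open o/192.168.10.148;192.168.10.149/DATA_DG_CD_00_exacel02",
    "2014-12-21 04:27:23.548: [ CSSD][300665152]clssnmvDiskOpen:failed to open o/192.168.10.154;192.168.10.155/DATA_DG_CD_00_exacel05",
    "2014-12-21 04:27:23.654: [ CSSD][319793472]clssnmvDiskOpen:failed to open o/192.168.10.146;192.168.10.147/DATA_DG_CD_00_exacel01",
    "2014-12-21 04:27:26.541: [ CSSD][295934272](:CSSNM00018:)clssnmvDiskCheck: Aborting, 2 of 5 configured voting disks available, need 3",
]

failed_disks, abort = summarize_cssd(sample)
for disk in failed_disks:
    print("open failed:", disk)
if abort:
    available, configured, needed = abort
    print(f"abort: {available}/{configured} available, need {needed}")
```

In practice the `lines` argument would be an open file handle on `ocssd.log` rather than the inline sample.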

During the same time frame, the OS messages file on the storage cell shows that MD resync checks (data-check) are in progress.

/var/log/messages:

Dec 21 04:22:01 exacel01 kernel: md: data-check of RAID array md4 <<<<<<
Dec 21 04:22:01 exacel01 kernel: md: minimum _guaranteed_ speed: 50000 KB/sec/disk.
Dec 21 04:22:01 exacel01 kernel: md: using maximum available idle IO bandwidth (but not more than 100000 KB/sec) for data-check.
Dec 21 04:22:01 exacel01 kernel: md: using 128k window, over a total of 120384k.
Dec 21 04:22:01 exacel01 kernel: md: delaying data-check of md5 until md4 has finished (they share one or more physical units)
Dec 21 04:22:01 exacel01 kernel: md: delaying data-check of md6 until md4 has finished (they share one or more physical units)
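
To confirm that a CSSD abort falls within an MD data-check, the two logs can be correlated by timestamp. The sketch below is illustrative only and rests on several assumptions: the helper names are hypothetical, the one-hour `window` is an arbitrary choice because the duration of the check is not stated in this note, the year must be supplied because syslog lines omit it, and the clocks on the database node and the cell are assumed to be synchronized.

```python
import re
from datetime import datetime, timedelta

# Patterns modelled on the log excerpts in this note (assumption).
MD_START = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) (\S+) kernel: md: data-check of RAID array (\S+)"
)
CSSD_ABORT = re.compile(
    r"^(\d{4}-\d\d-\d\d \d\d:\d\d:\d\d)\.\d+: \[\s*CSSD\].*clssnmvDiskCheck: Aborting"
)


def md_check_starts(lines, year):
    """Return [(start_time, host, array)] for each 'data-check of RAID array' line."""
    starts = []
    for line in lines:
        m = MD_START.match(line)
        if m:
            ts = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            starts.append((ts, m.group(2), m.group(3)))
    return starts


def aborts_near_md_check(cssd_lines, starts, window=timedelta(hours=1)):
    """Return [(abort_time, host, array, delay)] for aborts within `window` after a start."""
    hits = []
    for line in cssd_lines:
        m = CSSD_ABORT.match(line)
        if not m:
            continue
        abort_ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        for start_ts, host, array in starts:
            delay = abort_ts - start_ts
            if timedelta(0) <= delay <= window:
                hits.append((abort_ts, host, array, delay))
    return hits


# Lines copied from the excerpts in this note.
messages = [
    "Dec 21 04:22:01 exacel01 kernel: md: data-check of RAID array md4",
    "Dec 21 04:22:01 exacel01 kernel: md: delaying data-check of md5 until md4 has finished (they share one or more physical units)",
]
ocssd = [
    "2014-12-21 04:27:26.541: [ CSSD][295934272](:CSSNM00018:)clssnmvDiskCheck: Aborting, 2 of 5 configured voting disks available, need 3",
]

hits = aborts_near_md_check(ocssd, md_check_starts(messages, year=2014))
for abort_ts, host, array, delay in hits:
    print(f"{abort_ts}: CSSD abort {delay} after data-check of {array} started on {host}")
```

For the excerpts shown here, the abort at 04:27:26 follows the start of the md4 data-check on exacel01 at 04:22:01 by 5 minutes 25 seconds.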

  

Changes

The OCR and voting files were moved to, or configured on, the DATA or RECO disk groups.

Cause
