ORACLE PROCESSES ENCOUNTERING (OS 1117) ERRORS ON WINDOWS 2003 (Doc ID 444803.1)

Last updated on OCTOBER 19, 2010

Applies to:

Oracle Server - Enterprise Edition - Version: 9.2.0.1 to 10.2.0.1 - Release: 9.2 to 10.2
z*OBSOLETE: Microsoft Windows Server 2003
Microsoft Windows Itanium (64-bit)
Microsoft Windows x64 (64-bit)
Microsoft Windows Server 2003 (64-bit Itanium)Microsoft Windows Server 2003Microsoft Windows Server 2003 (64-bit AMD64 and Intel EM64T)

***Checked for relevance on 19-Oct-2010***

Symptoms

Oracle processes may encounter various (OS 1117) errors on a Windows 2003 Server.   The text of the (OS 1117) error can be seen as follows:

C:\>net helpmsg 1117
The request could not be performed because of an I/O device error.

This error may manifest itself in different ways, depending on which Oracle process encounters the error:

Oracle RDBMS Instance Encounters (OS 1117) error

1.  If an Oracle RDBMS instance encounters the error, you may see messages such as the following in the alert log for the RDBMS instance:

==========================================================================

Fri Jul 13 01:21:33 2007
Errors in file d:\oracle\db\product\admin\mydb\bdump\mydb1_lmon_4608.trc:
ORA-27091 : unable to queue I/O
ORA-27070 : async read/write failed
OSD-04006: ReadFile() failure, unable to read from file
O/S-Error: (OS 1117) The request could not be performed because of an I/O device error.
==========================================================================

 

Oracle ASM Instance Encounters (OS 1117) errors

2.  If an Oracle ASM instance encounters the error, you may see similar errors in the ASM instance's Alert log:

============================================
Fri Jul 13 01:22:10 2007
Errors in file d:\oracle\asm\product\admin\+asm\bdump\+asm1_gmon_3836.trc:
ORA-27091 : unable to queue I/O
ORA-27070 : async read/write failed
OSD-04016: Error queuing an asynchronous I/O request.
O/S-Error: (OS 1117) The request could not be performed because of an I/O device error.
============================================

CRS Daemon (crsd.exe) encounters 1117 errors


3.  If you are running in an Oracle Clusterware environment, then you may also see errors in the crsd.log and/or certain resource logs, indicating a problem accessing the OCR  (Oracle Cluster Registry).  An example of those errors would be:

================================================
2007-07-13 01:21:51.766: [ OCROSD][4272]utwrite:4: Problem writing the buffer phy offset 184320 and oserror 1117
2007-07-13 01:21:51.766: [ OCROSD][4352]utwrite:4: Problem writing the buffer phy offset 184320 and oserror 1117
2007-07-13 01:21:51.766: [ OCRRAW][4352]beginlog: problem 26 clearing the log metadata buffer
2007-07-13 01:21:51.766: [ OCRRAW][4352]proprdkey: Problem in begin log
2007-07-13 01:21:51.766: [ OCRRAW][4352]proprseterror: Error in accessing physical storage [26] Marking context invalid.
================================================

CSS Daemon (ocssd.exe) encounters 1117 errors

4.  Also, in an Oracle Clusterware environment, the Cluster Synchronization Services daemon (ocssd.exe)  may experience problems accessing the voting disk.  If this occurs, you will see an error in the ocssd.log similar to the following:

============================================
[ CSSD]2007-07-13 01:22:12.501 [4052] >ERROR: Internal Error Information:
Category: 1234
Operation: scls_block_write
Location: WriteFile
Other: unable to write block(s)
Dep: 1117

[ CSSD]2007-07-13 01:22:12.501 [4052] >ERROR: clssnmvReadBlocks: read failed 1 at offset 533 of \\.\votedsk2
[ CSSD]2007-07-13 01:22:12.501 [4052] >TRACE: clssnmDiskStateChange: state from 4 to 3 disk (1/\\.\votedsk2)
[ CSSD]2007-07-13 01:22:12.501 [2200] >TRACE: clssnmDiskPMT: disk offline (1/\\.\votedsk2)
[ CSSD]2007-07-13 01:22:12.501 [2200] >ERROR: clssnmDiskPMT: Aborting, 1 of 2 voting disks unavailable
[ CSSD]2007-07-13 01:22:12.501 [2200] >ERROR: ###################################
[ CSSD]2007-07-13 01:22:12.501 [2200] >ERROR: clssscExit: CSSD aborting
[ CSSD]2007-07-13 01:22:12.501 [2200] >ERROR: ###################################
==============================================

 

 5. When you are running in an Oracle Clusterware environment, if the ocssd process encounters an I/O error when accessing the Voting Disk, the CSS daemon will evict the node from the cluster.   This is done by signalling the Oracle Fence Driver  (OraFencedrv.sys)  to reboot the machine.   When the fence driver reboots the machine, this will be seen as a bugcheck with stop code 0x0000ffff.  You will be able to see this in the System Log with a message such as:

The computer has rebooted from a bugcheck. 
The bugcheck was: 0x0000ffff (0x0000000000000000, 0x0000000000000000, 
0x0000000000000000, 0x0000000000000000). 
A dump was saved in: C:\WINDOWS\MEMORY.DMP.

 

Note that the bugcheck is expected behavior when ocssd.exe (the Cluster Synchornization Services daemon) encounters an I/O error when accessing the voting disk. The node experiencing the I/O error is intentionally  rebooted to avoid a split-brain and possible data corruption when access to the voting disk is lost.

Changes

You may encounter this error after upgrading the Microsoft Storport driver to version 5.2.3790.4021 or later.

Cause

Sign In with your My Oracle Support account

Don't have a My Oracle Support account? Click to get started

My Oracle Support provides customers with access to over a
Million Knowledge Articles and hundreds of Community platforms