Resolving Instance Evictions on Windows Platforms (Doc ID 297498.1)

Last updated on AUGUST 11, 2015

Applies to:

Oracle Database - Enterprise Edition - Version 9.2.0.1 to 10.2.0.1 [Release 9.2 to 10.2]
Generic Windows

Symptoms

Instance evictions when running Real Application Clusters (RAC) in a Windows environment may become more frequent due to kernel memory exhaustion when the /3GB switch is used. These evictions may produce an error stack such as the following in the alert log of the evicted instance:

============
Wed Feb 02 08:29:26 2005
Errors in file e:\oracle\admin\racdb\bdump\racdb1_lmon_2416.trc:
ORA-27508 : IPC error sending a message
ORA-27300 : OS system dependent operation:IPCSOCK_Send failed with status: 10054
ORA-27301 : OS failure message: An existing connection was forcibly closed by the remote host.
ORA-27302 : failure occurred at: send_3
============

While these errors are usually the first errors noticed in the case of an instance eviction, they are generally a consequence of additional underlying errors, which may be visible in the CM log or CSS log of the evicting node.
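
As a rough illustration of how the operating system status code could be pulled out of an alert log for later triage, the sketch below scans for ORA-27300 lines and returns the failing operation together with the status value. The function name extract_os_status and the regular expression are assumptions made for this example; they are not part of any Oracle tool, and the pattern is only known to fit the excerpt shown above.

```python
import re

# Assumed pattern, fitted to the ORA-27300 line in the excerpt above:
# ORA-27300 : OS system dependent operation:IPCSOCK_Send failed with status: 10054
ORA_27300 = re.compile(
    r"ORA-27300\s*:.*?operation:(\S+) failed with status:\s*(\d+)"
)


def extract_os_status(alert_log_text):
    """Return a list of (operation, os_status) for each ORA-27300 line."""
    return [
        (match.group(1), int(match.group(2)))
        for match in ORA_27300.finditer(alert_log_text)
    ]


# Alert log excerpt copied from this note (raw string keeps the backslashes).
sample = r"""Wed Feb 02 08:29:26 2005
Errors in file e:\oracle\admin\racdb\bdump\racdb1_lmon_2416.trc:
ORA-27508 : IPC error sending a message
ORA-27300 : OS system dependent operation:IPCSOCK_Send failed with status: 10054
ORA-27301 : OS failure message: An existing connection was forcibly closed by the remote host.
ORA-27302 : failure occurred at: send_3
"""

print(extract_os_status(sample))
```

The status value returned here is the one to compare against the errors logged on the evicting node.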

In some cases an ORA-29740 error may also be observed:

ORA-29740 - Instance evicted by member %s, group incarnation %s

Note:
On Windows, in Oracle9i, the cmsrvr logfile can be found in C:\WINNT\SYSTEM32\osd9i.
In Oracle10g, you should check the CSS logfile, found in the ORA_CRS_HOME\css\log directory.

In this Oracle9i RAC example, the CM log for Node2 shows the first errors at 8:28, approximately 1 minute before the errors that were previously noted in the alert log on Node1:

>ERROR:    OemRecvMsg(): recvFrom failed size(-1) pktSize(31) - (10055), tid = ClusterListener:1884 file = oem.c, line = 782 {08:28:25 02/02/05}
>ERROR:    OemRecvMsg(): recvFrom failed size(-1) pktSize(31) - (10055), tid = ClusterListener:1884 file = oem.c, line = 782 {08:29:25 02/02/05}
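
To make the timing relationship concrete, the following sketch parses the two CM log lines above and computes how far the first one precedes the alert log timestamp on Node1. The regular expression, the function name parse_cm_errors, and the reading of the log date as month/day/year are assumptions for this example (the sample date 02/02/05 reads the same either way); the real CM log format may vary between versions.

```python
import re
from datetime import datetime

# Assumed pattern, fitted to the two CM log lines shown above.
CM_ERROR = re.compile(
    r"ERROR:\s+(\w+)\(\):.*?-\s*\((\d+)\),"
    r".*?\{(\d{2}:\d{2}:\d{2} \d{2}/\d{2}/\d{2})\}"
)


def parse_cm_errors(cm_log_text):
    """Return (timestamp, function, os_error) for each matching line."""
    events = []
    for match in CM_ERROR.finditer(cm_log_text):
        # Assumption: the date is month/day/two-digit year.
        when = datetime.strptime(match.group(3), "%H:%M:%S %m/%d/%y")
        events.append((when, match.group(1), int(match.group(2))))
    return events


cm_log = """\
>ERROR:    OemRecvMsg(): recvFrom failed size(-1) pktSize(31) - (10055), tid = ClusterListener:1884 file = oem.c, line = 782 {08:28:25 02/02/05}
>ERROR:    OemRecvMsg(): recvFrom failed size(-1) pktSize(31) - (10055), tid = ClusterListener:1884 file = oem.c, line = 782 {08:29:25 02/02/05}
"""

events = parse_cm_errors(cm_log)

# Timestamp of the alert log entry on Node1, copied from this note.
alert_time = datetime.strptime(
    "Wed Feb 02 08:29:26 2005", "%a %b %d %H:%M:%S %Y"
)
lead = alert_time - events[0][0]

print("first CM error code:", events[0][2])
print("seconds before alert log entry:", lead.total_seconds())
```

Comparing timestamps this way only makes sense if the clocks on both nodes are reasonably synchronized.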

In these cases, the two most meaningful errors are the operating system errors:


10054 - as seen in the alert log of the evicted instance.
10055 - as seen in the CM log of the evicting node.

The text of these error messages can be found using the ‘net helpmsg’ facility:

D:\>net helpmsg 10054
An existing connection was forcibly closed by the remote host.

D:\>net helpmsg 10055
An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full.

In addition, an OS 1450 error may be seen in some circumstances:

D:\>net helpmsg 1450
Insufficient system resources exist to complete the requested service.

Changes

32-Bit Windows Environments

These errors may become more prevalent if the system is under heavy load, whether from high CPU utilization or high I/O utilization. However, load is not always a factor: a slow increase in memory usage over time can lead to these errors even when there is no apparent load on the system.

These errors are more frequent on systems with the /3GB switch in place in the boot.ini.
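
One way to confirm whether a node boots with the switch is to inspect the [operating systems] entries in boot.ini. The sketch below does this on the file's text; the function name entries_with_3gb is hypothetical, and the sample boot.ini content is illustrative only, not taken from any system described in this note.

```python
import re

# Match /3GB as a standalone switch, in any letter case.
SWITCH_3GB = re.compile(r"(?<!\S)/3GB(?!\S)", re.IGNORECASE)


def entries_with_3gb(boot_ini_text):
    """Return the [operating systems] entries that carry the /3GB switch."""
    in_os_section = False
    hits = []
    for raw_line in boot_ini_text.splitlines():
        line = raw_line.strip()
        if line.startswith("["):
            # Track whether we are inside the [operating systems] section.
            in_os_section = line.lower() == "[operating systems]"
            continue
        if in_os_section and SWITCH_3GB.search(line):
            hits.append(line)
    return hits


# Illustrative boot.ini content (an assumption for this example).
sample_boot_ini = r"""[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003, Enterprise" /fastdetect /3GB
"""

for entry in entries_with_3gb(sample_boot_ini):
    print("/3GB is set on:", entry)
```

On a real node, the text would be read from the boot.ini file on the system partition before being passed to the function.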

In most cases, the 10054 error simply means that the connection was terminated.  The key errors are the OS 10055 error, or the OS 1450 error, which occur due to a lack of system resources.   Troubleshooting these instance evictions should begin by diagnosing why the OS 10055 or 1450 errors occur, and which ‘system resources’ are running low.
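
The distinction drawn above can be expressed as a small triage helper: 10054 is treated as a symptom, while 10055 and 1450 are treated as the resource errors to investigate. The helper name triage and the returned dictionary layout are assumptions for this sketch; the message texts are the ones returned by net helpmsg earlier in this note.

```python
# OS error codes discussed in this note, with their net helpmsg text.
MESSAGES = {
    10054: "An existing connection was forcibly closed by the remote host.",
    10055: ("An operation on a socket could not be performed because the "
            "system lacked sufficient buffer space or because a queue was full."),
    1450: "Insufficient system resources exist to complete the requested service.",
}

SYMPTOM_CODES = {10054}          # connection terminated; usually a consequence
RESOURCE_CODES = {10055, 1450}   # lack of system resources; start here


def triage(os_codes):
    """Split observed OS error codes into resource errors and symptoms."""
    seen = set(os_codes)
    return {
        "investigate": sorted(seen & RESOURCE_CODES),
        "symptoms": sorted(seen & SYMPTOM_CODES),
        "other": sorted(seen - RESOURCE_CODES - SYMPTOM_CODES),
    }


result = triage([10054, 10055, 10055])
for code in result["investigate"]:
    print("investigate", code, "-", MESSAGES[code])
for code in result["symptoms"]:
    print("symptom", code, "-", MESSAGES[code])
```

If only symptom codes are found on one node, the logs of the other nodes are the next place to look for 10055 or 1450.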

64-Bit Windows Environments

Some versions of the 10gR2 Release Notes for 64-Bit Windows reference this note incorrectly. This note is specifically geared towards troubleshooting resource problems that occur in a 32-Bit Windows environment. The issues, diagnostics and troubleshooting methods listed in this note are not applicable to 64-Bit Windows environments, because the address space in 64-Bit environments is in the terabytes. Therefore, user address space and kernel address space are only limited by the amount of physical memory in the machine.

Cause
