11gR2 Grid Infrastructure Node May not Join the Cluster After Evicted With Error sgipcnUdpSend "No buffer space available (74)" (Doc ID 1352887.1)

Last updated on SEPTEMBER 06, 2012

Applies to:

Oracle Server - Enterprise Edition - Version 11.2.0.2 to 11.2.0.3 [Release 11.2]
Information in this document applies to any platform.

Symptoms

An 11gR2 Grid Infrastructure node is evicted and then cannot rejoin the cluster.

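In this case, it's a 2-node cluster (racnode1 and racnode2). The excerpts below are taken from the Clusterware alert log and ocssd.log of racnode1, followed by the same two logs from racnode2: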

..
2011-06-30 23:16:15.853
[cssd(4522550)]CRS-1610:Network communication with node racnode2 (2) missing for 90% of timeout interval.  Removal of this node from cluster in 2.215 seconds
2011-06-30 23:16:18.069
[cssd(4522550)]CRS-1607:Node racnode2 is being evicted in cluster incarnation 200686974; details at (:CSSNM00007:) in /ocw/grid/log/racnode1/cssd/ocssd.log
..
2011-06-30 23:15:48.806: [ GIPCNET][1543] gipcmodNetworkProcessSend: [network]  failed send attempt endp 11150d5f0 [0000000000000353] { gipcEndpoint : localAddr 'udp://192.168.1.31:33166', remoteAddr '', numPend 5, numReady 1, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, flags 0x2, usrFlags 0x4000 }, req 113b254d0 [000000000040009b] { gipcSendRequest : addr 'udp://192.168.1.33:33063', data 1139954d8, len 228, olen 0, parentEndp 11150d5f0, ret gipcretFail (1), objFlags 0x0, reqFlags 0x2 }
2011-06-30 23:15:48.806: [ GIPCNET][1543] gipcmodNetworkProcessSend: slos op  :  sgipcnUdpSend
2011-06-30 23:15:48.806: [ GIPCNET][1543] gipcmodNetworkProcessSend: slos dep :  No buffer space available (74)   ==>> key rediscovery error
2011-06-30 23:15:48.806: [ GIPCNET][1543] gipcmodNetworkProcessSend: slos loc :  sendto
2011-06-30 23:15:48.806: [ GIPCNET][1543] gipcmodNetworkProcessSend: slos info:  dwRet 4294967295, addr '192.168.1.33:33063', len 228, buf 1139954d8, cookie 113b254d0
2011-06-30 23:15:48.807: [GIPCXCPT][1543] gipcInternalSendSync: failed sync request, ret gipcretFail (1)
2011-06-30 23:15:48.807: [GIPCXCPT][1543] gipcSendSyncF [gipchaLowerInternalSend : gipchaLower.c : 776]: EXCEPTION[ ret gipcretFail (1) ]  failed to send on endp 11150d5f0 [0000000000000353] { gipcEndpoint : localAddr 'udp://192.168.1.31:33166', remoteAddr '', numPend 5, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, flags 0x2, usrFlags 0x4000 }, addr 111bd2210 [00000000000005a4] { gipcAddress : name 'udp://192.168.1.33:33063', objFlags 0x0, addrFlags 0x1 }, buf 1139954d8, len 228, flags 0x0
2011-06-30 23:15:48.807: [GIPCHGEN][1543] gipchaInterfaceFail: marking interface failing 11155c990 { host 'racnode2', haName 'CSS_fcrprd', local 111501470, ip '192.168.1.33:33063', subnet '192.168.1.0', mask '255.255.255.0', numRef 0, numFail 0, idxBoot 1, flags 0x6 }
2011-06-30 23:15:48.807: [GIPCHALO][1543] gipchaLowerInternalSend: failed to initiate send on interface 11155c990 { host 'racnode2', haName 'CSS_fcrprd', local 111501470, ip '192.168.1.33:33063', subnet '192.168.1.0', mask '255.255.255.0', numRef 0, numFail 0, idxBoot 1, flags 0x86 }, hctx 1106dc7f0 [0000000000000010] { gipchaContext : host 'racnode1', name 'CSS_fcrprd', luid '9f9bc4e8-00000000', numNode 1, numInf 1, usrFlags 0x0, flags 0x7 }
2011-06-30 23:15:48.818: [GIPCHGEN][1543] gipchaInterfaceDisable: disabling interface 111501470 { host '', haName 'CSS_fcrprd', local 0, ip '192.168.1.31', subnet '192.168.1.0', mask '255.255.255.0', numRef 0, numFail 1, idxBoot 0, flags 0x14d }
2011-06-30 23:15:48.819: [GIPCHGEN][1543] gipchaInterfaceDisable: disabling interface 11155c990 { host 'racnode2', haName 'CSS_fcrprd', local 111501470, ip '192.168.1.33:33063', subnet '192.168.1.0', mask '255.255.255.0', numRef 0, numFail 0, idxBoot 1, flags 0x86 }
..
..
2011-06-30 23:16:15.853: [    CSSD][5157]clssnmPollingThread: node racnode2 (2) at 90% heartbeat fatal, removal in 2.215 seconds, seedhbimpd 1
..
2011-06-30 23:25:44.580: [GIPCHALO][1543] gipchaLowerProcessNode: no valid interfaces found to node for 595773 ms, node 111c093d0 { host 'racnode2', haName 'CSS_fcrprd', srcLuid 9f9bc4e8-26101e05, dstLuid 3559ec4f-06cd6c73 numInf 0, contigSeq 124472, lastAck 124393, lastValidAck 124471, sendSeq [125035 : 125035], createTime 1729131589, flags 0x2408 }
..
2011-06-30 23:25:48.071: [GIPCHALO][1543] gipchaLowerProcessNode: bootstrap node considered dead because of idle connection time 600001 ms, node 111c093d0 { host 'racnode2', haName 'CSS_fcrprd', srcLuid 9f9bc4e8-26101e05, dstLuid 3559ec4f-06cd6c73 numInf 0, contigSeq 124472, lastAck 124393, lastValidAck 124471, sendSeq [125038 : 125038], createTime 1729131589, flags 0x2408 }
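
The send failure in the racnode1 trace above is reported as a group of four "slos" lines (op, dep, loc, info). As a minimal sketch of how such groups could be picked out of a longer ocssd.log, the Python below parses the line layout visible in the excerpt and returns the groups whose "dep" text contains "No buffer space available". The function names, the regular expressions, and the reading of dwRet as a 32-bit value are assumptions made for illustration; they are not part of any Oracle tool, and the line layout is inferred only from the excerpt.

```python
import re
from datetime import datetime

# Layout inferred from the excerpt: timestamp, [component][thread], message.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"\[\s*(?P<component>\w+)\]\[(?P<thread>\d+)\]\s*(?P<msg>.*)$"
)
SLOS_RE = re.compile(r"slos (?P<field>op|dep|loc|info)\s*:\s*(?P<value>.*)$")
DWRET_RE = re.compile(r"dwRet (\d+)")


def as_signed32(value):
    """Reinterpret an unsigned 32-bit integer as signed (two's complement)."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


def find_send_failures(lines, needle="No buffer space available"):
    """Return one dict per op/dep/loc/info group whose 'dep' contains needle."""
    failures = []
    current = {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not a trace line in the assumed layout
        slos = SLOS_RE.search(m["msg"])
        if not slos:
            continue
        field = slos["field"]
        if field == "op":
            current = {}  # an 'op' line starts a new group
        current.setdefault(
            "time", datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S.%f")
        )
        current[field] = slos["value"].strip()
        if field == "info":  # an 'info' line closes the group
            dwret = DWRET_RE.search(current["info"])
            if dwret:
                # Assumption: dwRet is a 32-bit return code printed unsigned.
                current["dwret_signed"] = as_signed32(int(dwret.group(1)))
            if needle in current.get("dep", ""):
                failures.append(current)
            current = {}
    return failures


# The four lines below are copied from the racnode1 excerpt above.
SAMPLE = [
    "2011-06-30 23:15:48.806: [ GIPCNET][1543] gipcmodNetworkProcessSend: slos op  :  sgipcnUdpSend",
    "2011-06-30 23:15:48.806: [ GIPCNET][1543] gipcmodNetworkProcessSend: slos dep :  No buffer space available (74)   ==>> key rediscovery error",
    "2011-06-30 23:15:48.806: [ GIPCNET][1543] gipcmodNetworkProcessSend: slos loc :  sendto",
    "2011-06-30 23:15:48.806: [ GIPCNET][1543] gipcmodNetworkProcessSend: slos info:  dwRet 4294967295, addr '192.168.1.33:33063', len 228, buf 1139954d8, cookie 113b254d0",
]

for failure in find_send_failures(SAMPLE):
    print(failure["time"], failure["op"], failure["loc"], failure["dwret_signed"])
```

Read as a signed 32-bit number, the dwRet value 4294967295 is -1, which would be consistent with a failed sendto call if that reading of the field is correct.
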
2011-06-30 23:16:15.184
[cssd(3736340)]CRS-1610:Network communication with node racnode1 (1) missing for 90% of timeout interval.  Removal of this node from cluster in 2.623 seconds
2011-06-30 23:16:17.817
[cssd(3736340)]CRS-1609:This node is unable to communicate with other nodes in the cluster and is going down to preserve cluster integrity; details at (:CSSNM00008:) in /ocw/grid/log/racnode2/cssd/ocssd.log.
2011-06-30 23:16:17.824
[cssd(3736340)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /ocw/grid/log/racnode2/cssd/ocssd.log
..
2011-06-30 23:16:28.515
[cssd(3932432)]CRS-1713:CSSD daemon is started in clustered mode
..
2011-06-30 23:26:26.844
[/ocw/grid/bin/cssdagent(2556182)]CRS-5818:Aborted command 'start for resource: ora.cssd 1 1' for resource 'ora.cssd'. Details at (:CRSAGF00113:) {0:8:3} in /ocw/grid/log/racnode2/agent/ohasd/oracssdagent_root/oracssdagent_root.log.
2011-06-30 23:26:26.845
[cssd(3932432)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /ocw/grid/log/racnode2/cssd/ocssd.log
2011-06-30 23:16:15.184: [    CSSD][5157]clssnmPollingThread: node racnode1 (1) at 90% heartbeat fatal, removal in 2.623 seconds, seedhbimpd 1
..
2011-06-30 23:16:17.824: [    CSSD][5671]clssscExit: CSSD aborting from thread clssnmRcfgMgrThread
..
2011-06-30 23:16:28.513: [    CSSD][1]clssscmain: Starting CSS daemon, version 11.2.0.2.0, in (clustered) mode with uniqueness value 1309446988
..
2011-06-30 23:26:26.802: [    CSSD][3358]clssnmvDHBValidateNCopy: node 1, racnode1, has a disk HB, but no network HB, DHB has rcfg 200686975, wrtcnt, 12048422, LATS 1817715500, lastSeqNo 12048419, uniqueness 1309362657, timestamp 1309447586/1813860276
2011-06-30 23:26:26.845: [    CSSD][1029]clssgmExecuteClientRequest: MAINT recvd from proc 2 (11126ef10)
2011-06-30 23:26:26.845: [    CSSD][1029]clssgmShutDown: Received abortive shutdown request from client.
2011-06-30 23:26:26.845: [    CSSD][1029]###################################
2011-06-30 23:26:26.845: [    CSSD][1029]clssscExit: CSSD aborting from thread GMClientListener
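
The timestamps and millisecond counters in the excerpts above can be cross-checked against each other. The following sketch, which uses only values copied from the logs, subtracts each reported duration from the time it was logged and measures how long the restarted CSS daemon on racnode2 ran before it was aborted. The helper names are hypothetical, and the interpretation of the counters as "time since the interface was lost" is an assumption based on the message text.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S.%f"


def ts(text):
    """Parse a timestamp in the format used by the log excerpts."""
    return datetime.strptime(text, FMT)


def started_at(reported_at, elapsed_ms):
    """Time at which a condition began, given when it was reported and for how long."""
    return ts(reported_at) - timedelta(milliseconds=elapsed_ms)


# racnode1: "no valid interfaces found to node for 595773 ms"
no_interface_since = started_at("2011-06-30 23:25:44.580", 595773)

# racnode1: "idle connection time 600001 ms"
idle_since = started_at("2011-06-30 23:25:48.071", 600001)

# racnode2: CRS-1713 (CSSD started) to CRS-5818 (start command aborted)
restart_window = ts("2011-06-30 23:26:26.844") - ts("2011-06-30 23:16:28.515")

print(no_interface_since)  # 2011-06-30 23:15:48.807000
print(idle_since)          # 2011-06-30 23:15:48.070000
print(restart_window)      # 0:09:58.329000
```

The first result coincides with the 23:15:48.807 gipchaInterfaceFail entry on racnode1, so the interface on racnode1 appears to have stayed unusable from the first sendto failure until the node was declared dead. The restarted CSS daemon on racnode2 ran for just under ten minutes without joining before its start command was aborted.
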


Cause
