
11gR2 Grid Infrastructure Node May not Join the Cluster After Evicted With Error sgipcnUdpSend "No buffer space available (74)" (Doc ID 1352887.1)

Last updated on FEBRUARY 03, 2019

Applies to:

Oracle Server - Enterprise Edition - Version 11.2.0.2 to 11.2.0.3 [Release 11.2]
Information in this document applies to any platform.

Symptoms

An 11gR2 Grid Infrastructure node is evicted and then fails to rejoin the cluster.

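In this case, it's a 2-node cluster. Judging by the log paths and host names they contain, the excerpts below are, in order, the clusterware alert log and ocssd.log of racnode1, followed by the clusterware alert log and ocssd.log of racnode2: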

..
2011-06-30 23:16:15.853
[cssd(4522550)]CRS-1610:Network communication with node racnode2 (2) missing for 90% of timeout interval.  Removal of this node from cluster in 2.215 seconds
2011-06-30 23:16:18.069
[cssd(4522550)]CRS-1607:Node racnode2 is being evicted in cluster incarnation 200686974; details at (:CSSNM00007:) in /ocw/grid/log/racnode1/cssd/ocssd.log
.
2011-06-30 23:15:48.806: [ GIPCNET][1543] gipcmodNetworkProcessSend: [network]  failed send attempt endp 11150d5f0 [0000000000000353] { gipcEndpoint : localAddr 'udp://192.168.1.31:33166', remoteAddr '', numPend 5, numReady 1, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, flags 0x2, usrFlags 0x4000 }, req 113b254d0 [000000000040009b] { gipcSendRequest : addr 'udp://192.168.1.33:33063', data 1139954d8, len 228, olen 0, parentEndp 11150d5f0, ret gipcretFail (1), objFlags 0x0, reqFlags 0x2 }
2011-06-30 23:15:48.806: [ GIPCNET][1543] gipcmodNetworkProcessSend: slos op  :  sgipcnUdpSend
2011-06-30 23:15:48.806: [ GIPCNET][1543] gipcmodNetworkProcessSend: slos dep :  No buffer space available (74)   ==>> key rediscovery error
2011-06-30 23:15:48.806: [ GIPCNET][1543] gipcmodNetworkProcessSend: slos loc :  sendto
2011-06-30 23:15:48.806: [ GIPCNET][1543] gipcmodNetworkProcessSend: slos info:  dwRet 4294967295, addr '192.168.1.33:33063', len 228, buf 1139954d8, cookie 113b254d0
2011-06-30 23:15:48.807: [GIPCXCPT][1543] gipcInternalSendSync: failed sync request, ret gipcretFail (1)
2011-06-30 23:15:48.807: [GIPCXCPT][1543] gipcSendSyncF [gipchaLowerInternalSend : gipchaLower.c : 776]: EXCEPTION[ ret gipcretFail (1) ]  failed to send on endp 11150d5f0 [0000000000000353] { gipcEndpoint : localAddr 'udp://192.168.1.31:33166', remoteAddr '', numPend 5, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, flags 0x2, usrFlags 0x4000 }, addr 111bd2210 [00000000000005a4] { gipcAddress : name 'udp://192.168.1.33:33063', objFlags 0x0, addrFlags 0x1 }, buf 1139954d8, len 228, flags 0x0
2011-06-30 23:15:48.807: [GIPCHGEN][1543] gipchaInterfaceFail: marking interface failing 11155c990 { host 'racnode2', haName 'CSS_fcrprd', local 111501470, ip '192.168.1.33:33063', subnet '192.168.1.0', mask '255.255.255.0', numRef 0, numFail 0, idxBoot 1, flags 0x6 }
2011-06-30 23:15:48.807: [GIPCHALO][1543] gipchaLowerInternalSend: failed to initiate send on interface 11155c990 { host 'racnode2', haName 'CSS_fcrprd', local 111501470, ip '192.168.1.33:33063', subnet '192.168.1.0', mask '255.255.255.0', numRef 0, numFail 0, idxBoot 1, flags 0x86 }, hctx 1106dc7f0 [0000000000000010] { gipchaContext : host 'racnode1', name 'CSS_fcrprd', luid '9f9bc4e8-00000000', numNode 1, numInf 1, usrFlags 0x0, flags 0x7 }
2011-06-30 23:15:48.818: [GIPCHGEN][1543] gipchaInterfaceDisable: disabling interface 111501470 { host '', haName 'CSS_fcrprd', local 0, ip '192.168.1.31', subnet '192.168.1.0', mask '255.255.255.0', numRef 0, numFail 1, idxBoot 0, flags 0x14d }
2011-06-30 23:15:48.819: [GIPCHGEN][1543] gipchaInterfaceDisable: disabling interface 11155c990 { host 'racnode2', haName 'CSS_fcrprd', local 111501470, ip '192.168.1.33:33063', subnet '192.168.1.0', mask '255.255.255.0', numRef 0, numFail 0, idxBoot 1, flags 0x86 }
..
..
2011-06-30 23:16:15.853: [    CSSD][5157]clssnmPollingThread: node racnode2 (2) at 90% heartbeat fatal, removal in 2.215 seconds, seedhbimpd 1
..
2011-06-30 23:25:44.580: [GIPCHALO][1543] gipchaLowerProcessNode: no valid interfaces found to node for 595773 ms, node 111c093d0 { host 'racnode2', haName 'CSS_fcrprd', srcLuid 9f9bc4e8-26101e05, dstLuid 3559ec4f-06cd6c73 numInf 0, contigSeq 124472, lastAck 124393, lastValidAck 124471, sendSeq [125035 : 125035], createTime 1729131589, flags 0x2408 }
..
2011-06-30 23:25:48.071: [GIPCHALO][1543] gipchaLowerProcessNode: bootstrap node considered dead because of idle connection time 600001 ms, node 111c093d0 { host 'racnode2', haName 'CSS_fcrprd', srcLuid 9f9bc4e8-26101e05, dstLuid 3559ec4f-06cd6c73 numInf 0, contigSeq 124472, lastAck 124393, lastValidAck 124471, sendSeq [125038 : 125038], createTime 1729131589, flags 0x2408 }
2011-06-30 23:16:15.184
[cssd(3736340)]CRS-1610:Network communication with node racnode1 (1) missing for 90% of timeout interval.  Removal of this node from cluster in 2.623 seconds
2011-06-30 23:16:17.817
[cssd(3736340)]CRS-1609:This node is unable to communicate with other nodes in the cluster and is going down to preserve cluster integrity; details at (:CSSNM00008:) in /ocw/grid/log/racnode2/cssd/ocssd.log.
2011-06-30 23:16:17.824
[cssd(3736340)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /ocw/grid/log/racnode2/cssd/ocssd.log
..
2011-06-30 23:16:28.515
[cssd(3932432)]CRS-1713:CSSD daemon is started in clustered mode
..
2011-06-30 23:26:26.844
[/ocw/grid/bin/cssdagent(2556182)]CRS-5818:Aborted command 'start for resource: ora.cssd 1 1' for resource 'ora.cssd'. Details at (:CRSAGF00113:) {0:8:3} in /ocw/grid/log/racnode2/agent/ohasd/oracssdagent_root/oracssdagent_root.log.
2011-06-30 23:26:26.845
[cssd(3932432)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /ocw/grid/log/racnode2/cssd/ocssd.log
2011-06-30 23:16:15.184: [    CSSD][5157]clssnmPollingThread: node racnode1 (1) at 90% heartbeat fatal, removal in 2.623 seconds, seedhbimpd 1
..
2011-06-30 23:16:17.824: [    CSSD][5671]clssscExit: CSSD aborting from thread clssnmRcfgMgrThread
..
2011-06-30 23:16:28.513: [    CSSD][1]clssscmain: Starting CSS daemon, version 11.2.0.2.0, in (clustered) mode with uniqueness value 1309446988
..
2011-06-30 23:26:26.802: [    CSSD][3358]clssnmvDHBValidateNCopy: node 1, racnode1, has a disk HB, but no network HB, DHB has rcfg 200686975, wrtcnt, 12048422, LATS 1817715500, lastSeqNo 12048419, uniqueness 1309362657, timestamp 1309447586/1813860276
2011-06-30 23:26:26.845: [    CSSD][1029]clssgmExecuteClientRequest: MAINT recvd from proc 2 (11126ef10)
2011-06-30 23:26:26.845: [    CSSD][1029]clssgmShutDown: Received abortive shutdown request from client.
2011-06-30 23:26:26.845: [    CSSD][1029]###################################
2011-06-30 23:26:26.845: [    CSSD][1029]clssscExit: CSSD aborting from thread GMClientListener
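
The key signature in the ocssd.log excerpt above is the group of "slos" lines reporting that sgipcnUdpSend failed in sendto with "No buffer space available (74)". As a rough aid for checking whether another ocssd.log shows the same pattern, the following is a minimal Python sketch. The function name find_send_failures and the regular expression are illustrative assumptions derived only from the line layout shown above; they are not part of any Oracle tooling, and other releases may format these lines differently.

```python
import re

# Assumed line layout, taken from the excerpt above, e.g.:
# 2011-06-30 23:15:48.806: [ GIPCNET][1543] gipcmodNetworkProcessSend: slos dep :  No buffer space available (74)
SLOS_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"\[\s*GIPCNET\]\[\d+\]\s*gipcmodNetworkProcessSend: "
    r"slos (?P<key>op|dep|loc|info)\s*:\s*(?P<value>.*)$"
)


def find_send_failures(lines):
    """Return (timestamp, fields) pairs for UDP send failures.

    'slos' lines are grouped by their timestamp string, so several
    failures logged within the same millisecond collapse into one entry.
    """
    groups = {}
    for line in lines:
        m = SLOS_RE.match(line.strip())
        if m:
            fields = groups.setdefault(m.group("ts"), {})
            fields[m.group("key")] = m.group("value").strip()
    return [
        (ts, fields)
        for ts, fields in sorted(groups.items())
        if fields.get("op") == "sgipcnUdpSend"
        and "No buffer space available" in fields.get("dep", "")
    ]


# Sample input copied from the excerpt above.
SAMPLE = [
    "2011-06-30 23:15:48.806: [ GIPCNET][1543] gipcmodNetworkProcessSend: slos op  :  sgipcnUdpSend",
    "2011-06-30 23:15:48.806: [ GIPCNET][1543] gipcmodNetworkProcessSend: slos dep :  No buffer space available (74)",
    "2011-06-30 23:15:48.806: [ GIPCNET][1543] gipcmodNetworkProcessSend: slos loc :  sendto",
    "2011-06-30 23:15:48.807: [GIPCXCPT][1543] gipcInternalSendSync: failed sync request, ret gipcretFail (1)",
]

for ts, fields in find_send_failures(SAMPLE):
    print(ts, fields["op"], fields["dep"], fields.get("loc", ""))
```

To scan a real log, pass an open file object instead of SAMPLE, for example the ocssd.log named in the CRS-1607 message. Matching on the text "No buffer space available" rather than on the number 74 is a deliberate choice, since errno values differ between platforms.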


Cause


