Exadata X8M: Restarting the Network Service Interrupts CRS (Doc ID 2699671.1)

Last updated on OCTOBER 15, 2020

Applies to:

Exadata Database Machine X8-2/X8M-2 Hardware - Version All Versions to All Versions [Release All Releases]
Oracle Database - Enterprise Edition - Version 19.0.0.0 to 19.8.0.0.0 [Release 19]
Information in this document applies to any platform.

Symptoms

--OS message --
Jun 30 10:58:03 <hostname> systemd: Stopping LSB: Bring up/down networking...
Jun 30 10:58:04 <hostname> systemd: Starting Network Service...
Jun 30 10:58:04 <hostname> kernel: bondeth0: Releasing backup interface eth1
Jun 30 10:58:04 <hostname> kernel: bondeth0: the permanent HWaddr of eth1 - 00:10:e0:f1:d5:1f - is still in use by bondeth0 - set the HWaddr of eth1 to a different address to avoid conflicts

Jun 30 10:58:24 <hostname> network: + Buffers are configured as 32768,229120,0,0,0,0,0,0
Jun 30 10:58:24 <hostname> network: Finished configuring "re0"  <--------------------- re0 is back up at 10:58:24; bouncing re0 took about 20s
Jun 30 10:58:24 <hostname> network: /sbin/ifup-local: + Non-RoCE Configuration...
Jun 30 10:58:24 <hostname> network: /sbin/ifup-local: Non-RoCE Configuration: Nothing to do for re0.

Jun 30 10:58:29 <hostname> network: + Buffers are configured as 32768,229120,0,0,0,0,0,0
Jun 30 10:58:29 <hostname> network: Finished configuring "re1"  <----------------------- re1 is back up at 10:58:29; bouncing re1 took about 25s
Jun 30 10:58:29 <hostname> network: /sbin/ifup-local: + Non-RoCE Configuration...
Jun 30 10:58:29 <hostname> network: /sbin/ifup-local: Non-RoCE Configuration: Nothing to do for re1.
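
The 20s and 25s figures in the annotations above can be checked directly from the syslog timestamps. The following is a minimal sketch, not an Oracle-provided tool: the function names `parse_ts` and `bounce_durations` are hypothetical, the year 2020 is an assumption taken from the CRS log (syslog lines carry no year), and the start of the bounce is assumed to be the "Starting Network Service" message.

```python
import re
from datetime import datetime

# Excerpt of the OS messages above (annotations removed).
LOG = """\
Jun 30 10:58:04 <hostname> systemd: Starting Network Service...
Jun 30 10:58:24 <hostname> network: Finished configuring "re0"
Jun 30 10:58:29 <hostname> network: Finished configuring "re1"
"""

# %b parses English month abbreviations in the default C/English locale.
STAMP = "%Y %b %d %H:%M:%S"


def parse_ts(line, year=2020):
    """Parse the 15-character syslog timestamp; the year is an assumption."""
    return datetime.strptime(f"{year} {line[:15]}", STAMP)


def bounce_durations(log):
    """Return seconds from 'Starting Network Service' to each interface coming up."""
    start = None
    durations = {}
    for line in log.splitlines():
        if "Starting Network Service" in line:
            start = parse_ts(line)
        match = re.search(r'Finished configuring "(\w+)"', line)
        if match and start is not None:
            durations[match.group(1)] = (parse_ts(line) - start).total_seconds()
    return durations


if __name__ == "__main__":
    for iface, secs in bounce_durations(LOG).items():
        print(f"{iface}: back up {secs:.0f}s after the network service started")
```

Both private interconnect interfaces are therefore down at the same time for roughly 20 seconds during the restart.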

--history ---
2020-06-30.10:58:03 systemctl restart network

--CRS log file of the failed node --
2020-06-30 10:58:09.822 [GIPCD(36503)]CRS-42216: No interfaces are configured on the local node for interface definition re0(:.*)?:192.168.X.0: available interface definitions are [bondeth0(:.*)?:X.X.X.0].
2020-06-30 10:58:09.822 [GIPCD(36503)]CRS-42216: No interfaces are configured on the local node for interface definition re1(:.*)?:192.168.X.0: available interface definitions are [bondeth0(:.*)?:X.X.X.0].

2020-06-30 10:58:29.151 [OCSSD(48827)]CRS-1608: This node was evicted by node 1, xxxx ; details at (:CSSNM00005:) in /u01/app/grid/diag/crs/xxxxx/crs/trace/ocssd.trc.

-- gipc log of the surviving node --

2020-06-30 10:58:06.963 : GIPCLIB:2585757440: s0skgcbt: skgzibr_check_node_reachability_by_node: [g/192.168.x.x;192.168.x.x;] Unreachable due to last active path coming down and data exchange connection up
2020-06-30 10:58:06.963 : GIPCD:2585757440: gipcdDetectFailNodes: SKGZFNDD retcode 3
2020-06-30 10:58:06.963 : GIPCD:2585757440: gipcdDetectFailNodes: Node is UN-REACHABLE
2020-06-30 10:58:06.963 : GIPCD:2585757440: gipcdDetectFailNodes: Failing, node '<hostname>', endp 0000000000000000, 0000000001789161 <--- FNDD found the node unreachable at 10:58:06, which led to the eviction.
2020-06-30 10:58:06.963 :GIPCDCLT:2585757440: gipcdEnqueueMsgForNode: Enqueuing for NodeThread (gipcdReqTypeFailNode)
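
To see how the events from both nodes relate, the timestamps quoted in the excerpts above can be merged into a single timeline. This is a rough sketch under stated assumptions: the clocks of the two nodes are assumed to be synchronized, syslog and shell-history entries only have one-second resolution (padded with `.000` here), and the names `EVENTS` and `timeline` are illustrative only.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# (timestamp, source node, event) copied from the excerpts in this document.
EVENTS = [
    ("2020-06-30 10:58:03.000", "failed node", "systemctl restart network"),
    ("2020-06-30 10:58:06.963", "surviving node", "GIPCD: node is UN-REACHABLE"),
    ("2020-06-30 10:58:09.822", "failed node", "CRS-42216: no interfaces for re0/re1"),
    ("2020-06-30 10:58:24.000", "failed node", 'Finished configuring "re0"'),
    ("2020-06-30 10:58:29.000", "failed node", 'Finished configuring "re1"'),
    ("2020-06-30 10:58:29.151", "failed node", "CRS-1608: node evicted"),
]


def timeline(events):
    """Sort events by time and return (seconds since first event, node, event)."""
    parsed = sorted(
        (datetime.strptime(ts, FMT), node, what) for ts, node, what in events
    )
    t0 = parsed[0][0]
    return [((t - t0).total_seconds(), node, what) for t, node, what in parsed]


if __name__ == "__main__":
    for offset, node, what in timeline(EVENTS):
        print(f"+{offset:6.3f}s  {node:<15} {what}")
```

Read this way, the surviving node flagged the restarted node as unreachable about 4 seconds after the `systemctl restart network` command, well before re0 finished coming back up roughly 21 seconds after the command. Because the syslog entries are rounded to whole seconds, the relative order of the last two events cannot be determined from these logs.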



Changes

 Null

Cause


