Exadata X8M: Restarting the Network Service Interrupts CRS
(Doc ID 2699671.1)
Last updated on OCTOBER 15, 2020
Applies to:
Exadata Database Machine X8-2/X8M-2 Hardware - Version All Versions to All Versions [Release All Releases]
Oracle Database - Enterprise Edition - Version 19.0.0.0 to 19.8.0.0.0 [Release 19]
Information in this document applies to any platform.
Symptoms
-- OS messages --
Jun 30 10:58:03 <hostname> systemd: Stopping LSB: Bring up/down networking...
Jun 30 10:58:04 <hostname> systemd: Starting Network Service...
Jun 30 10:58:04 <hostname> kernel: bondeth0: Releasing backup interface eth1
Jun 30 10:58:04 <hostname> kernel: bondeth0: the permanent HWaddr of eth1 - 00:10:e0:f1:d5:1f - is still in use by bondeth0 - set the HWaddr of eth1 to a different address to avoid conflicts
Jun 30 10:58:24 <hostname> network: + Buffers are configured as 32768,229120,0,0,0,0,0,0
Jun 30 10:58:24 <hostname> network: Finished configuring "re0" <--- re0 is back up at 10:58:24, about 20s after the network service started to restart
Jun 30 10:58:24 <hostname> network: /sbin/ifup-local: + Non-RoCE Configuration...
Jun 30 10:58:24 <hostname> network: /sbin/ifup-local: Non-RoCE Configuration: Nothing to do for re0.
Jun 30 10:58:29 <hostname> network: + Buffers are configured as 32768,229120,0,0,0,0,0,0
Jun 30 10:58:29 <hostname> network: Finished configuring "re1" <--- re1 is back up at 10:58:29, about 25s after the network service started to restart
Jun 30 10:58:29 <hostname> network: /sbin/ifup-local: + Non-RoCE Configuration...
Jun 30 10:58:29 <hostname> network: /sbin/ifup-local: Non-RoCE Configuration: Nothing to do for re1.
-- shell history --
2020-06-30.10:58:03 systemctl restart network
-- CRS log file of failed node --
2020-06-30 10:58:09.822 [GIPCD(36503)]CRS-42216: No interfaces are configured on the local node for interface definition re0(:.*)?:192.168.X.0: available interface definitions are [bondeth0(:.*)?:X.X.X.0].
2020-06-30 10:58:09.822 [GIPCD(36503)]CRS-42216: No interfaces are configured on the local node for interface definition re1(:.*)?:192.168.X.0: available interface definitions are [bondeth0(:.*)?:X.X.X.0].
2020-06-30 10:58:29.151 [OCSSD(48827)]CRS-1608: This node was evicted by node 1, xxxx ; details at (:CSSNM00005:) in /u01/app/grid/diag/crs/xxxxx/crs/trace/ocssd.trc.
-- GIPC log of surviving node --
2020-06-30 10:58:06.963 : GIPCLIB:2585757440: s0skgcbt: skgzibr_check_node_reachability_by_node: [g/192.168.x.x;192.168.x.x;] Unreachable due to last active path coming down and data exchange connection up
2020-06-30 10:58:06.963 : GIPCD:2585757440: gipcdDetectFailNodes: SKGZFNDD retcode 3
2020-06-30 10:58:06.963 : GIPCD:2585757440: gipcdDetectFailNodes: Node is UN-REACHABLE
2020-06-30 10:58:06.963 : GIPCD:2585757440: gipcdDetectFailNodes: Failing, node '<hostname>', endp 0000000000000000, 0000000001789161 <--- FNDD found the node unreachable at 10:58:06, which led to the eviction.
2020-06-30 10:58:06.963 :GIPCDCLT:2585757440: gipcdEnqueueMsgForNode: Enqueuing for NodeThread (gipcdReqTypeFailNode)
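The timing in the excerpts above can be laid out relative to the restart command. The following is a minimal sketch, not an Oracle tool: the timestamps are copied by hand from the logs above, and the names `EVENTS` and `offsets` are illustrative only.

```python
from datetime import datetime

# Timestamps copied by hand from the log excerpts above (all on 2020-06-30).
# Labels are paraphrased for illustration; they are not log output.
EVENTS = [
    ("10:58:03.000", "failed node: systemctl restart network issued"),
    ("10:58:06.963", "surviving node: GIPCD reports node UN-REACHABLE"),
    ("10:58:09.822", "failed node: CRS-42216 for re0 and re1"),
    ("10:58:24.000", "failed node: re0 configured"),
    ("10:58:29.000", "failed node: re1 configured"),
    ("10:58:29.151", "failed node: CRS-1608 eviction reported"),
]


def offsets(events):
    """Return (seconds since the first event, label) for each event."""
    fmt = "%H:%M:%S.%f"
    start = datetime.strptime(events[0][0], fmt)
    return [
        ((datetime.strptime(ts, fmt) - start).total_seconds(), label)
        for ts, label in events
    ]


if __name__ == "__main__":
    for seconds, label in offsets(EVENTS):
        print(f"+{seconds:7.3f}s  {label}")
```

Laid out this way, the surviving node marks the peer unreachable about 4 seconds after the restart command, while re0 and re1 are not configured again until about 21 and 26 seconds after it.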
Changes
None.
Cause