RAC: GC BLOCKS LOST continues to increase when NIC configured with load-balancing
(Doc ID 2541818.1)
Last updated on APRIL 17, 2023
Applies to:
Oracle Database - Enterprise Edition - Version 12.1.0.2 and later
Information in this document applies to any platform.
Symptoms
GC BLOCKS LOST continues to increase, but network diagnostics show no errors.
  INSTANCE GC BLOCKS LOST GC CUR BLOCKS SERVED GC CR BLOCKS SERVED      RATIO
---------- -------------- -------------------- ------------------- ----------
         1         628053              6375554             5792311 .051615711
         2         737706             13895256             6086116 .036919687
         3           1105               852349              556890 .000784111
         4          24277             11589436             7694720 .001258909
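Output of this shape is typically produced by joining GV$SYSSTAT against itself on the lost and served statistics; a sketch for reference (the column aliases are illustrative, not from this note):

SELECT a.inst_id                     "INSTANCE",
       a.value                       "GC BLOCKS LOST",
       b.value                       "GC CUR BLOCKS SERVED",
       c.value                       "GC CR BLOCKS SERVED",
       a.value / (b.value + c.value) "RATIO"
FROM   gv$sysstat a, gv$sysstat b, gv$sysstat c
WHERE  a.name = 'gc blocks lost'
AND    b.name = 'gc current blocks served'
AND    c.name = 'gc cr blocks served'
AND    b.inst_id = a.inst_id
AND    c.inst_id = a.inst_id;

Instance 1's lost-to-served ratio is about 5%, where a healthy interconnect is expected to stay near zero.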
In the AWR report, the "gc blocks lost" statistic totals 20,411, and "gc cr block lost" appears in the Top Timed Events with an average wait time of 139 ms.
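To confirm that the counter is climbing continuously rather than spiking once, the cumulative statistic can be compared across AWR snapshots. A minimal sketch, assuming the standard DBA_HIST_SYSSTAT and DBA_HIST_SNAPSHOT views:

SELECT s.instance_number,
       sn.end_interval_time,
       s.value - LAG(s.value) OVER (PARTITION BY s.dbid, s.instance_number
                                    ORDER BY s.snap_id) AS lost_in_interval
FROM   dba_hist_sysstat  s,
       dba_hist_snapshot sn
WHERE  s.stat_name        = 'gc blocks lost'
AND    sn.snap_id         = s.snap_id
AND    sn.dbid            = s.dbid
AND    sn.instance_number = s.instance_number
ORDER  BY s.instance_number, s.snap_id;

A LOST_IN_INTERVAL that stays positive in every snapshot matches the symptom described here.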
The underlying network problem also caused a node eviction:
/var/log/messages
Mar 31 17:16:49 <RACNODE1> kernel: pcieport 0000:80:02.0: <Interface1>: Link is down
Mar 31 17:16:49 <RACNODE1> kernel: bonding: bond1: link status definitely down for interface <Interface1>, disabling it
Mar 31 17:16:49 <RACNODE1> kernel: bonding: bond1: now running without any active interface !
Mar 31 17:17:08 <RACNODE1> init.tfa: Checking/Starting TFA..
Mar 31 17:17:20 <RACNODE1> ntpd[9344]: Deleting interface #11 bond0:3, <IP=##.#.###.123>#123, interface stats: received=0, sent=0, dropped=0, active_time=323042 secs
Mar 31 17:17:20 <RACNODE1> ntpd[9344]: Deleting interface #9 bond0:1, <IP=##.#.###.124>#123, interface stats: received=0, sent=0, dropped=0, active_time=326240 secs
Mar 31 17:17:23 <RACNODE1> abrt-hook-ccpp: invalid number '<RACNODE1>'
$ORACLE_BASE/diag/crs/<RACNODE1>/crs/trace/alert.log
2019-03-31 17:17:15.641 [OCSSD(16708)]CRS-1610: Network communication with node <RACNODE3> (3) missing for 90% of timeout interval. Removal of this node from cluster in 2.610 seconds
2019-03-31 17:17:15.641 [OCSSD(16708)]CRS-1610: Network communication with node <RACNODE2> (2) missing for 90% of timeout interval. Removal of this node from cluster in 2.530 seconds
2019-03-31 17:17:15.641 [OCSSD(16708)]CRS-1610: Network communication with node <RACNODE4> (4) missing for 90% of timeout interval. Removal of this node from cluster in 2.300 seconds
2019-03-31 17:17:18.254 [OCSSD(16708)]CRS-1609: This node is unable to communicate with other nodes in the cluster and is going down to preserve cluster integrity; details at (:CSSNM00008:) in $ORACLE_BASE/diag/crs/<RACNODE1>/crs/trace/ocssd.trc.
2019-03-31 17:17:18.254 [OCSSD(16708)]CRS-1656: The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in $ORACLE_BASE/diag/crs/<RACNODE1>/crs/trace/ocssd.trc
2019-03-31 17:17:18.388 [OCSSD(16708)]CRS-1652: Starting clean up of CRSD resources.
2019-03-31 17:17:18.770 [ORAAGENT(23120)]CRS-5016: Process "$GI_HOME/opmn/bin/onsctli" spawned by agent "ORAAGENT" for action "check" failed: details at "(:CLSN00010:)" in "$ORACLE_BASE/diag/crs/<RACNODE1>/crs/trace/crsd_oraagent_oracle.trc"
2019-03-31 17:17:19.015 [OCSSD(16708)]CRS-1608: This node was evicted by node 2, <RACNODE2>; details at (:CSSNM00005:) in $ORACLE_BASE/diag/crs/<RACNODE1>/crs/trace/ocssd.trc.
alert_<instance1>.log
Sun Mar 31 17:17:21 2019
Errors in file $ORACLE_BASE/diag/rdbms/<database>/<instance1>/trace/<instance1>_asmb_34283.trc:
ORA-15064: communication failure with ASM instance
ORA-03113: end-of-file on communication channel
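Because the /var/log/messages excerpt shows bond1 losing its last active slave, the bonding configuration and NIC health of the private interconnect are worth capturing alongside the database symptoms. A sketch using standard Linux tools (bond and interface names below are taken from the log excerpts above):

# Configured bonding mode and per-slave MII link status
cat /proc/net/bonding/bond1

# Link state and error/drop counters for each slave NIC
ethtool <Interface1>
ethtool -S <Interface1> | grep -iE 'err|drop|discard'

# IP fragment reassembly and packet error statistics on the host
netstat -s | grep -iE 'reassembl|fragment|error'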
Changes
Cause