
RAC: GC BLOCKS LOST continues to increase when NIC configured with load-balancing (Doc ID 2541818.1)

Last updated on APRIL 17, 2023

Applies to:

Oracle Database - Enterprise Edition - Version 12.1.0.2 and later
Information in this document applies to any platform.

Symptoms

GC BLOCKS LOST continues to increase, but network diagnostics show no errors.

  INSTANCE GC BLOCKS LOST GC CUR BLOCKS SERVED GC CR BLOCKS SERVED      RATIO
---------- -------------- -------------------- ------------------- ----------
         1         628053              6375554             5792311 .051615711
         2         737706             13895256             6086116 .036919687
         3           1105               852349              556890 .000784111
         4          24277             11589436             7694720 .001258909
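
The listing above can be reproduced with a query against GV$SYSSTAT along the following lines (a sketch, not necessarily the exact query used for this output; 'gc blocks lost', 'gc current blocks served' and 'gc cr blocks served' are the standard statistic names, and RATIO is lost blocks divided by total blocks served):

SELECT a.inst_id                      "INSTANCE",
       a.value                        "GC BLOCKS LOST",
       b.value                        "GC CUR BLOCKS SERVED",
       c.value                        "GC CR BLOCKS SERVED",
       a.value / (b.value + c.value)  "RATIO"   -- lost blocks / total blocks served
  FROM gv$sysstat a, gv$sysstat b, gv$sysstat c
 WHERE a.name = 'gc blocks lost'
   AND b.name = 'gc current blocks served'
   AND c.name = 'gc cr blocks served'
   AND b.inst_id = a.inst_id
   AND c.inst_id = a.inst_id
 ORDER BY a.inst_id;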

 

AWR - "gc blocks lost" (Statistic) totals 20,411 and "gc cr block lost" in Top Timed Events with Avg Wait Time of 139 ms.

The block loss eventually led to a node eviction:

/var/log/messages

Mar 31 17:16:49 <RACNODE1> kernel: pcieport 0000:80:02.0: <Interface1>: Link is down
Mar 31 17:16:49 <RACNODE1> kernel: bonding: bond1: link status definitely down for interface <Interface1>, disabling it
Mar 31 17:16:49 <RACNODE1> kernel: bonding: bond1: now running without any active interface !
Mar 31 17:17:08 <RACNODE1> init.tfa: Checking/Starting TFA..
Mar 31 17:17:20 <RACNODE1> ntpd[9344]: Deleting interface #11 bond0:3, <IP=##.#.###.123>#123, interface stats: received=0, sent=0, dropped=0, active_time=323042 secs
Mar 31 17:17:20 <RACNODE1> ntpd[9344]: Deleting interface #9 bond0:1, <IP=##.#.###.124>#123, interface stats: received=0, sent=0, dropped=0, active_time=326240 secs
Mar 31 17:17:23 <RACNODE1> abrt-hook-ccpp: invalid number '<RACNODE1>'
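
The bond1 messages above show the interconnect NIC failing and the bond being left with no active interface. When a load-balancing bond is suspected on the private interconnect, the bonding mode and slave state can be confirmed from the bonding driver (a generic Linux check, not taken from this note; bond1 is the interconnect bond seen in the messages above):

# Show the bonding mode (e.g. load balancing (round-robin) vs. fault-tolerance (active-backup))
# and the MII status of each slave interface
grep -E 'Bonding Mode|Slave Interface|MII Status' /proc/net/bonding/bond1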

$ORACLE_BASE/diag/crs/<RACNODE1>/crs/trace/alert.log

2019-03-31 17:17:15.641 [OCSSD(16708)]CRS-1610: Network communication with node <RACNODE3> (3) missing for 90% of timeout interval. Removal of this node from cluster in 2.610 seconds
2019-03-31 17:17:15.641 [OCSSD(16708)]CRS-1610: Network communication with node <RACNODE2> (2) missing for 90% of timeout interval. Removal of this node from cluster in 2.530 seconds
2019-03-31 17:17:15.641 [OCSSD(16708)]CRS-1610: Network communication with node <RACNODE4> (4) missing for 90% of timeout interval. Removal of this node from cluster in 2.300 seconds
2019-03-31 17:17:18.254 [OCSSD(16708)]CRS-1609: This node is unable to communicate with other nodes in the cluster and is going down to preserve cluster integrity; details at (:CSSNM00008:) in $ORACLE_BASE/diag/crs/<RACNODE1>/crs/trace/ocssd.trc.
2019-03-31 17:17:18.254 [OCSSD(16708)]CRS-1656: The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in $ORACLE_BASE/diag/crs/<RACNODE1>/crs/trace/ocssd.trc
2019-03-31 17:17:18.388 [OCSSD(16708)]CRS-1652: Starting clean up of CRSD resources.
2019-03-31 17:17:18.770 [ORAAGENT(23120)]CRS-5016: Process "$GI_HOME/opmn/bin/onsctli" spawned by agent "ORAAGENT" for action "check" failed: details at "(:CLSN00010:)" in "$ORACLE_BASE/diag/crs/<RACNODE1>/crs/trace/crsd_oraagent_oracle.trc"
2019-03-31 17:17:19.015 [OCSSD(16708)]CRS-1608: This node was evicted by node 2, <RACNODE2>; details at (:CSSNM00005:) in $ORACLE_BASE/diag/crs/<RACNODE1>/crs/trace/ocssd.trc.
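
The "timeout interval" referenced by the CRS-1610 messages is the CSS misscount (the network heartbeat timeout, 30 seconds by default on Linux). It can be confirmed on any cluster node with crsctl, run from the Grid Infrastructure home:

# Report the current CSS misscount (network heartbeat timeout) in seconds
crsctl get css misscount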

alert_<instance1>.log

Sun Mar 31 17:17:21 2019
Errors in file $ORACLE_BASE/diag/rdbms/<database>/<instance1>/trace/<instance1>_asmb_34283.trc:
ORA-15064: communication failure with ASM instance
ORA-03113: end-of-file on communication channel

 

 

Changes

 

Cause




