Coherence Cluster Performance Problem Introduced After Migration To G1 GC Algorithm (Doc ID 2313384.1)

Last updated on AUGUST 22, 2023

Applies to:

Oracle Coherence - Version 12.1.2.0.0 and later
Information in this document applies to any platform.

Symptoms

A performance problem was introduced after migrating to the G1 GC algorithm in the Coherence 12c data grid. We are seeing extended drops in throughput during our Coherence cache reload process, which works by invoking read-through for large numbers of keys. The process runs for some hours at good throughput, then throughput falls off a cliff for an extended period before recovering to normal levels. We have captured a number of thread dumps that show the worker thread stack traces in the low-throughput state. Many of the worker threads are blocked while calling com.tangosol.util.SimpleMapIndex.insertInternal, and on every occasion one thread holds the lock while in a call to com.tangosol.util.SegmentedHashMap.grow(SegmentedHashMap.java), itself called from com.tangosol.util.SimpleMapIndex.insertInternal.

It appears to us that we have encountered some condition where increasing the size of the SegmentedHashMap takes an excessive amount of time, slowing the processing in that thread while blocking all the others. This does not occur when running the identical load process under CMS. Thread dumps taken during normal processing typically show worker threads blocked waiting for the database, indicating that database latency is the limiting factor under normal circumstances. We cannot proceed with the release of G1 into our production environment until the cause of this behavior is understood and a solution is found to prevent it.
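The pattern described above (one thread holding a lock while many others block on it) can be spotted programmatically as well as by reading thread dumps. The following is a minimal sketch using the standard java.lang.management API. It does not touch Coherence internals: the class name BlockedThreadSummary, the thread names, and the simulateContention method are hypothetical and exist only for illustration, with a plain Java monitor standing in for the lock seen in the stack traces.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CountDownLatch;

public class BlockedThreadSummary {

    /** Counts BLOCKED threads, grouped by the name of the thread owning the contended monitor. */
    public static Map<String, Integer> blockedByOwner(ThreadInfo[] infos) {
        Map<String, Integer> counts = new TreeMap<>();
        for (ThreadInfo info : infos) {
            if (info == null || info.getThreadState() != Thread.State.BLOCKED) {
                continue;
            }
            String owner = info.getLockOwnerName();
            counts.merge(owner == null ? "<unknown>" : owner, 1, Integer::sum);
        }
        return counts;
    }

    /**
     * Simulation only: one thread holds a monitor while the given number of
     * threads block on it. Returns how many threads the dump reports as
     * blocked by the holder.
     */
    public static int simulateContention(int waiters) {
        final Object lock = new Object();
        final CountDownLatch held = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (lock) {
                held.countDown();
                try {
                    release.await(); // stands in for a slow operation under the lock
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "demo-holder");
        List<Thread> blocked = new ArrayList<>();
        try {
            holder.start();
            held.await(); // wait until the holder owns the monitor
            for (int i = 0; i < waiters; i++) {
                Thread t = new Thread(() -> {
                    synchronized (lock) {
                        // acquire and release immediately once the holder lets go
                    }
                }, "demo-waiter-" + i);
                blocked.add(t);
                t.start();
            }
            for (Thread t : blocked) {
                while (t.getState() != Thread.State.BLOCKED) {
                    Thread.sleep(5);
                }
            }
            ThreadInfo[] dump = ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
            return blockedByOwner(dump).getOrDefault("demo-holder", 0);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown(); // let the holder and the waiters finish
        }
    }

    public static void main(String[] args) {
        System.out.println("threads blocked by demo-holder: " + simulateContention(4));
    }
}
```

Run inside a real cluster member, blockedByOwner applied to a periodic dump would show whether the blocked worker threads consistently point at a single lock owner, which is the signature reported in the low-throughput state.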

Changes

The JVM options in use following the move from CMS to G1 GC are as follows:

JVM Version: Java HotSpot(TM) 64-Bit Server VM (25.121-b36) for linux-amd64
JRE (1.8.0_121-b36), built on Mar 21 2017 23:26:03 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8)

-XX:OnOutOfMemoryError=kill -9 %p
-XX:InitiatingHeapOccupancyPercent=75
-XX:+PrintReferenceGC
-XX:+PrintAdaptiveSizePolicy
-XX:+UnlockCommercialFeatures
-XX:+FlightRecorder
-Xmx24g
-Xms24g
-XX:+UseG1GC
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=<path>
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=3
-XX:GCLogFileSize=10M
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:-UseLoopPredicate
-Xloggc:<filename>
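
When comparing runs under CMS and G1, it can help to confirm from inside the JVM which collector and options are actually in effect. The sketch below uses only the standard java.lang.management API. The class name GcSettingsReport and its methods are hypothetical, and the check on collector names assumes HotSpot's naming (for example "G1 Young Generation"), which is an implementation detail rather than a guaranteed contract.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcSettingsReport {

    /** Returns the names of the garbage collectors registered in this JVM. */
    public static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** Assumption: HotSpot prefixes its G1 collector bean names with "G1 ". */
    public static boolean usesG1(List<String> names) {
        for (String name : names) {
            if (name.startsWith("G1 ")) {
                return true;
            }
        }
        return false;
    }

    /** Returns the JVM arguments as passed on the command line (for example -XX:+UseG1GC). */
    public static List<String> inputArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        List<String> names = collectorNames();
        System.out.println("Collectors: " + names);
        System.out.println("G1 in use (by name): " + usesG1(names));
        System.out.println("JVM arguments: " + inputArguments());
    }
}
```

Logging this once at startup on each cluster member gives a quick way to rule out a node that is still running with the old collector settings.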

Cause
