Coherence Cluster Performance Problem Introduced After Migration To G1 GC Algorithm

(Doc ID 2313384.1)

Last updated on OCTOBER 11, 2017

Applies to:

Oracle Coherence - Version 12.1.2.0.0 and later
Information in this document applies to any platform.

Symptoms

Performance problem introduced after migration to the G1 GC algorithm in the 12c Coherence data grid. We are seeing extended drops in throughput during our Coherence cache reload process, which works by invoking read-through for large numbers of keys. The process runs for some hours at good throughput, then throughput falls off a cliff for an extended period before recovering to normal levels. We have captured a number of thread dumps that show the worker thread stack traces in the low-throughput state. Many of the threads are blocked while calling com.tangosol.util.SimpleMapIndex.insertInternal, and on every occasion one thread holds the lock while in a call to com.tangosol.util.SegmentedHashMap.grow(SegmentedHashMap.java), itself reached from com.tangosol.util.SimpleMapIndex.insertInternal.

It appears to us that we have encountered some condition where increasing the size of the SegmentedHashMap takes an excessive time, slowing the processing in that thread while blocking all others. This does not occur when running the identical load process under CMS. Thread dumps taken during normal processing typically show worker threads blocked waiting for the database, indicating that database latency is the limiting factor under normal circumstances. We cannot proceed with the release of G1 into our production environment until the cause of this behavior is understood and a solution is found to prevent it.
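
The pattern described above (many workers blocked in insertInternal while a single thread is inside grow) can be checked across a series of captured dumps with a small script. The following is a minimal sketch, not an Oracle tool: the class name ThreadDumpScan and its methods are hypothetical, and it assumes HotSpot jstack-style text dumps in which each thread block starts with a quoted thread name and blocks are separated by blank lines.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: scans the text of a jstack-style thread dump. */
class ThreadDumpScan {
    // Frame names taken from the stack traces described in this note.
    static final String INSERT_FRAME = "com.tangosol.util.SimpleMapIndex.insertInternal";
    static final String GROW_FRAME = "com.tangosol.util.SegmentedHashMap.grow";

    /** Splits a dump into per-thread blocks (assumes blank-line separators). */
    static List<String> splitThreads(String dump) {
        List<String> blocks = new ArrayList<>();
        for (String block : dump.split("\\R\\s*\\R")) {
            // Thread headers begin with the quoted thread name; skip preamble text.
            if (block.trim().startsWith("\"")) {
                blocks.add(block);
            }
        }
        return blocks;
    }

    /** Counts threads in state BLOCKED whose stack contains the given frame. */
    static long countBlockedIn(String dump, String frame) {
        return splitThreads(dump).stream()
                .filter(b -> b.contains("java.lang.Thread.State: BLOCKED"))
                .filter(b -> b.contains(frame))
                .count();
    }

    /** Counts threads that are not BLOCKED and whose stack contains the given frame. */
    static long countNotBlockedIn(String dump, String frame) {
        return splitThreads(dump).stream()
                .filter(b -> !b.contains("java.lang.Thread.State: BLOCKED"))
                .filter(b -> b.contains(frame))
                .count();
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to one captured thread dump file.
        String dump = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.println("threads in dump:            " + splitThreads(dump).size());
        System.out.println("blocked in insertInternal:  " + countBlockedIn(dump, INSERT_FRAME));
        System.out.println("running inside grow:        " + countNotBlockedIn(dump, GROW_FRAME));
    }
}
```

Running this over dumps taken in both the normal and the low-throughput state gives a quick comparison: a high blocked count together with exactly one thread inside grow matches the behavior reported here, while dumps from normal processing would be expected to show workers waiting on the database instead.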

Changes

The garbage collector was changed from CMS to G1GC. The JVM version and options in use are as follows:

JVM Version: Java HotSpot(TM) 64-Bit Server VM (25.121-b36) for linux-amd64
JRE (1.8.0_121-b36), built on Mar 21 2017 23:26:03 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8)

-XX:OnOutOfMemoryError=kill -9 %p
-XX:InitiatingHeapOccupancyPercent=75
-XX:+PrintReferenceGC
-XX:+PrintAdaptiveSizePolicy
-XX:+UnlockCommercialFeatures
-XX:+FlightRecorder
-Xmx24g
-Xms24g
-XX:+UseG1GC
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=<path>
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=3
-XX:GCLogFileSize=10M
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:-UseLoopPredicate
-Xloggc:<filename>
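
When comparing runs under CMS and G1, it can help to confirm which collector a given JVM actually started with, rather than relying on the launch script. The sketch below uses the standard java.lang.management API to list the collector beans. The class name ActiveCollector and the classify helper are hypothetical, and the bean names it matches ("G1 Young Generation", "G1 Old Generation", "ConcurrentMarkSweep") are the ones HotSpot reports, which is an implementation detail rather than a documented contract.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reports which collector the running JVM is using. */
class ActiveCollector {
    /** Maps HotSpot collector bean names to a short label (assumed naming). */
    static String classify(List<String> beanNames) {
        for (String name : beanNames) {
            if (name.startsWith("G1 ")) {
                return "G1";
            }
        }
        for (String name : beanNames) {
            if (name.equals("ConcurrentMarkSweep")) {
                return "CMS";
            }
        }
        return "other";
    }

    /** Reads the names of the collector beans registered in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> names = collectorNames();
        System.out.println("collector: " + classify(names) + " " + names);
    }
}
```

The check has to run inside the JVM being examined, for example from a small diagnostic class started with the same options as the cache server.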

Cause
