
EM 13.4 RU14 / RU15 or EM 13.5 RU4 / RU5: OMS Restarts / Crashes / Hangs due to Deadlocks Caused by [deadlocked thread] GCLoader[data] (Doc ID 2853556.1)

Last updated on JANUARY 17, 2023

Applies to:

Enterprise Manager Base Platform - Version 13.4.0.0.0 and later
Information in this document applies to any platform.

Symptoms

After applying RU14 or RU15 on EM 13.4, or RU4 or RU5 on EM 13.5, the OMS frequently crashes, restarts, or hangs.

Some of the observations:

The $<EM_INSTANCE_HOME>/em/EMGC_OMS1/sysman/log/emctl.msg file may record the following restart errors:

HealthMonitor Mar 4, 2022 3:29:15 AM OMS Heartbeat Recorder error: Heartbeat recorder timed out. OMS id=48
Critical error err=3 detected in module OMS Heartbeat Recorder
OMS will be restarted. A full thread dump will be generated in the log file
<EM_INSTANCE_HOME>/user_projects/domains/GCDomain/servers/EMGC_OMS1/logs/EMGC_OMS1.out
to help Oracle Support analyse the problem.
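
To see how often these restarts are recorded, the HealthMonitor lines can be counted with a short script. The following is a minimal sketch, not an Oracle-supplied tool: the function name find_heartbeat_timeouts and the regular expression are assumptions inferred only from the sample line shown above.

```python
import re
from datetime import datetime

# Pattern inferred from the sample emctl.msg line in this note
# (an assumption, not a documented log format).
HEARTBEAT_RE = re.compile(
    r"HealthMonitor (?P<ts>\w{3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) "
    r"OMS Heartbeat Recorder error: Heartbeat recorder timed out\. "
    r"OMS id=(?P<oms_id>\d+)"
)


def find_heartbeat_timeouts(lines):
    """Return a list of (timestamp, oms_id) for each heartbeat-timeout line."""
    hits = []
    for line in lines:
        match = HEARTBEAT_RE.search(line)
        if match:
            # Timestamps look like "Mar 4, 2022 3:29:15 AM" in the sample.
            ts = datetime.strptime(match.group("ts"), "%b %d, %Y %I:%M:%S %p")
            hits.append((ts, int(match.group("oms_id"))))
    return hits
```

Passing an open file handle for emctl.msg as the lines argument returns one entry per recorded timeout, which makes repeated restarts easy to spot.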

Large nextgen log files, some exceeding 1 GB, are generated under the Central Agent running on the OMS servers:

-rw-r----- 1 <user> <group> 2329491144 Mar 22 14:47 nextgen_2022-03-22_13-46-41PM.log
-rw-r----- 1 <user> <group> 0 Mar 23 09:57 nextgen_2022-03-23_09-57-53AM.log.lck
-rw-r----- 1 <user> <group> 27158163 Mar 23 09:58 nextgen_2022-03-23_09-57-53AM.log
-rw-r----- 1 <user> <group> 0 Mar 23 11:08 nextgen_2022-03-23_11-08-11AM.log.lck
-rw-r----- 1 <user> <group> 3905656311 Mar 23 12:55 nextgen_2022-03-23_11-08-11AM.log
-rw-r----- 1 <user> <group> 0 Mar 23 15:15 nextgen_2022-03-23_15-15-18PM.log.lck
-rw-r----- 1 <user> <group> 1414261939 Mar 23 15:57 nextgen_2022-03-23_15-15-18PM.log
-rw-r----- 1 <user> <group> 0 Mar 24 09:33 nextgen_2022-03-24_09-33-29AM.log.lck
-rw-r----- 1 <user> <group> 1713305652 Mar 24 10:09 nextgen_2022-03-24_09-33-29AM.log
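
A listing like the one above can be reproduced with a small script. This is only a sketch under stated assumptions: the function name large_nextgen_logs and the 1 GB default threshold are illustrative, and the log directory is passed in as a parameter because this note does not spell out its full path.

```python
from pathlib import Path


def large_nextgen_logs(log_dir, min_bytes=1_000_000_000):
    """Return (file name, size in bytes) for nextgen_*.log files of at
    least min_bytes, largest first. The .log.lck lock files are skipped
    because the glob pattern requires the name to end in .log."""
    found = []
    for path in Path(log_dir).glob("nextgen_*.log"):
        size = path.stat().st_size
        if size >= min_bytes:
            found.append((path.name, size))
    return sorted(found, key=lambda item: item[1], reverse=True)
```

Calling large_nextgen_logs with the Central Agent log directory returns the oversized files in descending order of size.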


The following errors are observed in <EM_INSTANCE_HOME>/em/EMGC_OMS1/sysman/log/oms_diag_info_2022_03_04_03_29_39.msg:

HealthMonitor Mar 4, 2022 3:29:15 AM OMS Heartbeat Recorder error: Heartbeat recorder timed out. OMS id=48
Critical error err=3 detected in module OMS Heartbeat Recorder
OMS will be restarted. A full thread dump will be generated in the log file
<EM_INSTANCE_HOME>/user_projects/domains/GCDomain/servers/EMGC_OMS1/logs/EMGC_OMS1.out
to help Oracle Support analyse the problem.

HealthMonitor determined that below OMS Heartbeat Recorder thread caused restart

Errant Task Thread Dump below:

#EM_SYSTEM_POOL#:OMSHeartbeatThread prio=10 id=349 state=BLOCKED
java.util.logging.FileHandler.publish(FileHandler.java:698)
oracle.core.ojdl.logging.ODLLogger.doLog(ODLLogger.java:914)
oracle.core.ojdl.logging.ODLLogger.log(ODLLogger.java:694)
oracle.sysman.util.logging.JavaSdkLogOperations.logp(JavaSdkLogOperations.java:244)
oracle.sysman.util.logging.DualModeLogOperations.logp(DualModeLogOperations.java:271)
oracle.sysman.emSDK.util.logging.Logger.logp(Logger.java:1033)
oracle.sysman.emSDK.util.logging.Logger.log(Logger.java:829)
oracle.sysman.emSDK.util.logging.Logger.fine(Logger.java:1503)
oracle.sysman.core.common.workmanager.Work.handleScheduledWork(Work.java:426)
oracle.sysman.core.common.workmanager.Work.call(Work.java:284)
oracle.sysman.core.common.workmanager.Work.call(Work.java:50)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)

Found one Java-level deadlock:
=============================
"Job Step 68447 Job Long-System Pool:JobWorker":
waiting to lock monitor 0x00007f902c515938 (object 0x00000000c426a5b0, a oracle.security.jps.service.credstore.CredentialAccessPermissionCollection),
which is held by "Job Step 68405 Job Long-System Pool:JobWorker"
"Job Step 68405 Job Long-System Pool:JobWorker":
waiting to lock monitor 0x00007f90080b0638 (object 0x00000000dda4fec8, a java.util.logging.FileHandler),
which is held by "Job Step 68447 Job Long-System Pool:JobWorker"
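
The wait-for relationships in a "Found one Java-level deadlock" section can be pulled out and checked for a cycle programmatically. The sketch below assumes the three-line layout shown above (thread name, "waiting to lock monitor", "which is held by"); the names wait_edges and find_cycle are hypothetical helpers, not part of any Oracle tooling.

```python
import re

# Matches one waiter/lock/holder triple in the layout shown in this note.
WAIT_RE = re.compile(
    r'"(?P<waiter>[^"]+)":\s*'
    r'waiting to lock monitor \S+ \(object \S+ a (?P<lock>[^)]+)\),\s*'
    r'which is held by "(?P<holder>[^"]+)"'
)


def wait_edges(dump_text):
    """Return (waiting thread, lock class, holding thread) triples."""
    return [
        (m.group("waiter"), m.group("lock"), m.group("holder"))
        for m in WAIT_RE.finditer(dump_text)
    ]


def find_cycle(edges):
    """Follow waiter -> holder links; return the threads in the first
    cycle found, or an empty list if no thread waits on itself indirectly."""
    holder_of = {waiter: holder for waiter, _lock, holder in edges}
    for start in holder_of:
        seen = [start]
        current = holder_of[start]
        while current in holder_of and current not in seen:
            seen.append(current)
            current = holder_of[current]
        if current == start:
            return seen
    return []
```

For the excerpt above, the two JobWorker threads form a cycle: one holds the FileHandler monitor and waits for the CredentialAccessPermissionCollection monitor, while the other holds the reverse.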


$<EM_INSTANCE_HOME>/user_projects/domains/GCDomain/servers/EMGC_OMS1/logs/EMGC_OMS1.out reports the following errors, which are the main indicator of this issue:

<Mar 4, 2022 3:29:56,891 AM CET> <Critical> <WebLogicServer> <BEA-000394> <Deadlock detected:
[deadlocked thread] #EM_SYSTEM_POOL#:oracle.sysman.core.common.ngtrace.fwk.BackgroundTraceAggregator:
----------------------------------------------------------------------------------------------------
Thread '#EM_SYSTEM_POOL#:oracle.sysman.core.common.ngtrace.fwk.BackgroundTraceAggregator' is waiting to acquire lock 'java.util.logging.FileHandler@687bc07b' that is held by thread 'GCLoader[data] - https://<HOSTNAME>:<AGENT_PORT>/emd/main/'

Stack trace:
------------
java.util.logging.FileHandler.publish(FileHandler.java:698)
oracle.core.ojdl.logging.ODLLogger.doLog(ODLLogger.java:914)
oracle.core.ojdl.logging.ODLLogger.log(ODLLogger.java:694)
oracle.sysman.util.logging.JavaSdkLogOperations.logp(JavaSdkLogOperations.java:244)
oracle.sysman.util.logging.DualModeLogOperations.logp(DualModeLogOperations.java:271)
oracle.sysman.emSDK.util.logging.Logger.logp(Logger.java:1033)
oracle.sysman.emSDK.util.logging.Logger.log(Logger.java:829)
oracle.sysman.emSDK.util.logging.Logger.fine(Logger.java:1503)
oracle.sysman.core.common.workmanager.Work.handleScheduledWork(Work.java:426)
oracle.sysman.core.common.workmanager.Work.call(Work.java:284)
oracle.sysman.core.common.workmanager.Work.call(Work.java:50)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)

[deadlocked thread] GCLoader[data] - https://<HOSTNAME>:<AGENT_PORT>/emd/main/:
------------------------------------------------------------------------
Thread 'GCLoader[data] - https://<HOSTNAME>:<AGENT_PORT>/emd/main/' is waiting to acquire lock 'oracle.security.jps.service.credstore.CredentialAccessPermissionCollection@6b44d6b7' that is held by thread 'GCLoader[data] - https://<HOSTNAME>:<AGENT_PORT>/emd/main/'

Stack trace:
------------
oracle.security.jps.service.credstore.CredentialAccessPermissionCollection.elements(CredentialAccessPermission.java:472)
oracle.security.jps.az.common.info.JpsPermissionsEnumerator.getNextEnumWithMore(JpsPermissions.java:603)
oracle.security.jps.az.common.info.JpsPermissionsEnumerator.hasMoreElements(JpsPermissions.java:580)
java.security.PermissionCollection.toString(PermissionCollection.java:183)
java.text.MessageFormat.subformat(MessageFormat.java:1280)
java.text.MessageFormat.format(MessageFormat.java:865)
java.text.Format.format(Format.java:157)
java.text.MessageFormat.format(MessageFormat.java:841)
java.util.logging.Formatter.formatMessage(Formatter.java:138)
com.oracle.cie.common.util.logging.DefaultFormatter.format(DefaultFormatter.java:78)
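
To confirm that an EMGC_OMS1.out file shows this signature, the "is waiting to acquire lock" lines of the BEA-000394 message can be extracted and the lock classes compared with the two seen here (java.util.logging.FileHandler and CredentialAccessPermissionCollection). This is a rough sketch based only on the wording of the excerpt above; the function name weblogic_wait_edges is hypothetical, and the pattern assumes thread names contain no single quotes.

```python
import re

# Wording taken from the BEA-000394 excerpt in this note (an assumption,
# not a documented message format).
WLS_WAIT_RE = re.compile(
    r"Thread '(?P<waiter>[^']+)' is waiting to acquire lock "
    r"'(?P<lock>[^']+)' that is held by thread '(?P<holder>[^']+)'"
)


def weblogic_wait_edges(out_text):
    """Return (waiting thread, lock class, holding thread) triples.
    The '@<hash>' suffix is dropped so that lock classes can be compared
    across restarts, where the identity hash differs."""
    edges = []
    for m in WLS_WAIT_RE.finditer(out_text):
        lock_class = m.group("lock").split("@", 1)[0]
        edges.append((m.group("waiter"), lock_class, m.group("holder")))
    return edges
```

Because thread names are redacted in this note, distinct GCLoader threads may appear under the same name, so the lock classes are the more reliable thing to compare.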

Cause

