EM 13.4 RU14 / RU15 or EM 13.5 RU4 / RU5: OMS Restarts / Crashes / Hangs due to Deadlocks Caused by [deadlocked thread] GCLoader[data]
(Doc ID 2853556.1)
Last updated on JANUARY 17, 2023
Applies to:
Enterprise Manager Base Platform - Version 13.4.0.0.0 and later
Information in this document applies to any platform.
Symptoms
After applying RU14 or RU15 on EM 13.4, or RU4 or RU5 on EM 13.5, the OMS frequently crashes, restarts, or hangs.
Some of the observations:
The <EM_INSTANCE_HOME>/em/EMGC_OMS1/sysman/log/emctl.msg file may be updated with restart errors such as:
Critical error err=3 detected in module OMS Heartbeat Recorder
OMS will be restarted. A full thread dump will be generated in the log file
<EM_INSTANCE_HOME>/user_projects/domains/GCDomain/servers/EMGC_OMS1/logs/EMGC_OMS1.out
to help Oracle Support analyse the problem.
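To check quickly whether a log file contains this restart message, the lines can be scanned with a short script. The following is a minimal sketch, not an Oracle-supplied tool: the function name `find_restart_errors` is hypothetical, and the pattern is derived only from the excerpt above.

```python
import re

# Pattern derived from the emctl.msg excerpt above (an assumption, not an
# official message format specification).
RESTART_RE = re.compile(r"Critical error err=(\d+) detected in module (.+)")

def find_restart_errors(lines):
    """Return (line_number, err_code, module) for each matching line."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        m = RESTART_RE.search(line)
        if m:
            hits.append((lineno, int(m.group(1)), m.group(2).strip()))
    return hits

# Sample lines copied from the excerpt above.
sample = [
    "Critical error err=3 detected in module OMS Heartbeat Recorder",
    "OMS will be restarted. A full thread dump will be generated in the log file",
]
print(find_restart_errors(sample))
```

To scan a real file, pass an open file object, for example `find_restart_errors(open(path, errors="replace"))`, where `path` points to the emctl.msg file of the affected OMS.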
The Central Agent running on the OMS servers generates very large nextgen log files, for example:
-rw-r----- 1 <user> <group> 0 Mar 23 09:57 nextgen_2022-03-23_09-57-53AM.log.lck
-rw-r----- 1 <user> <group> 27158163 Mar 23 09:58 nextgen_2022-03-23_09-57-53AM.log
-rw-r----- 1 <user> <group> 0 Mar 23 11:08 nextgen_2022-03-23_11-08-11AM.log.lck
-rw-r----- 1 <user> <group> 3905656311 Mar 23 12:55 nextgen_2022-03-23_11-08-11AM.log
-rw-r----- 1 <user> <group> 0 Mar 23 15:15 nextgen_2022-03-23_15-15-18PM.log.lck
-rw-r----- 1 <user> <group> 1414261939 Mar 23 15:57 nextgen_2022-03-23_15-15-18PM.log
-rw-r----- 1 <user> <group> 0 Mar 24 09:33 nextgen_2022-03-24_09-33-29AM.log.lck
-rw-r----- 1 <user> <group> 1713305652 Mar 24 10:09 nextgen_2022-03-24_09-33-29AM.log
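A directory can be checked for oversized nextgen logs as sketched below. This is an illustrative helper only: the function name `large_nextgen_logs` and the 1 GB default threshold are assumptions, and the log directory must be supplied by the caller because this note does not state its exact path.

```python
from pathlib import Path

def large_nextgen_logs(log_dir, min_bytes=1_000_000_000):
    """Return (name, size) for nextgen_*.log files of at least min_bytes,
    largest first. The .log.lck lock files are not matched by the pattern."""
    found = []
    for p in Path(log_dir).glob("nextgen_*.log"):
        size = p.stat().st_size
        if size >= min_bytes:
            found.append((p.name, size))
    return sorted(found, key=lambda item: item[1], reverse=True)
```

With the default threshold, three of the four log files in the listing above would be reported; the 27158163-byte file would not.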
The following errors appear in <EM_INSTANCE_HOME>/em/EMGC_OMS1/sysman/log/oms_diag_info_2022_03_04_03_29_39.msg:
HealthMonitor Mar 4, 2022 3:29:15 AM OMS Heartbeat Recorder error: Heartbeat recorder timed out. OMS id=48
Critical error err=3 detected in module OMS Heartbeat Recorder
OMS will be restarted. A full thread dump will be generated in the log file
<EM_INSTANCE_HOME>/user_projects/domains/GCDomain/servers/EMGC_OMS1/logs/EMGC_OMS1.out
to help Oracle Support analyse the problem.
HealthMonitor determined that below OMS Heartbeat Recorder thread caused restart
Errant Task Thread Dump below:
#EM_SYSTEM_POOL#:OMSHeartbeatThread prio=10 id=349 state=BLOCKED
java.util.logging.FileHandler.publish(FileHandler.java:698)
oracle.core.ojdl.logging.ODLLogger.doLog(ODLLogger.java:914)
oracle.core.ojdl.logging.ODLLogger.log(ODLLogger.java:694)
oracle.sysman.util.logging.JavaSdkLogOperations.logp(JavaSdkLogOperations.java:244)
oracle.sysman.util.logging.DualModeLogOperations.logp(DualModeLogOperations.java:271)
oracle.sysman.emSDK.util.logging.Logger.logp(Logger.java:1033)
oracle.sysman.emSDK.util.logging.Logger.log(Logger.java:829)
oracle.sysman.emSDK.util.logging.Logger.fine(Logger.java:1503)
oracle.sysman.core.common.workmanager.Work.handleScheduledWork(Work.java:426)
oracle.sysman.core.common.workmanager.Work.call(Work.java:284)
oracle.sysman.core.common.workmanager.Work.call(Work.java:50)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
Found one Java-level deadlock:
=============================
"Job Step 68447 Job Long-System Pool:JobWorker":
waiting to lock monitor 0x00007f902c515938 (object 0x00000000c426a5b0, a oracle.security.jps.service.credstore.CredentialAccessPermissionCollection),
which is held by "Job Step 68405 Job Long-System Pool:JobWorker"
"Job Step 68405 Job Long-System Pool:JobWorker":
waiting to lock monitor 0x00007f90080b0638 (object 0x00000000dda4fec8, a java.util.logging.FileHandler),
which is held by "Job Step 68447 Job Long-System Pool:JobWorker"
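The "Java-level deadlock" section above can be reduced to a list of wait-for edges to make the lock cycle easier to see. The sketch below assumes the three-line layout shown in the excerpt (thread name, "waiting to lock monitor", "which is held by"); the function names `parse_jvm_deadlock` and `is_cycle` are hypothetical.

```python
import re

THREAD_RE = re.compile(r'^"(.+)":\s*$')
WAIT_RE = re.compile(r"waiting to lock monitor \S+ \(object \S+, an? ([\w.$]+)\)")
HELD_RE = re.compile(r'which is held by "(.+)"')

def parse_jvm_deadlock(lines):
    """Return a list of (waiting_thread, lock_class, holding_thread) edges."""
    edges = []
    waiter = lock = None
    for raw in lines:
        line = raw.strip()
        m = THREAD_RE.match(line)
        if m:
            waiter, lock = m.group(1), None
            continue
        m = WAIT_RE.search(line)
        if m:
            lock = m.group(1)
            continue
        m = HELD_RE.search(line)
        if m and waiter and lock:
            edges.append((waiter, lock, m.group(1)))
            waiter = lock = None
    return edges

def is_cycle(edges):
    """True if following 'held by' links from the first waiting thread
    leads back to that same thread."""
    waits_on = {w: h for w, _, h in edges}
    start = next(iter(waits_on), None)
    seen, cur = set(), start
    while cur in waits_on and cur not in seen:
        seen.add(cur)
        cur = waits_on[cur]
    return start is not None and cur == start

# Sample lines copied from the excerpt above.
sample = [
    '"Job Step 68447 Job Long-System Pool:JobWorker":',
    'waiting to lock monitor 0x00007f902c515938 (object 0x00000000c426a5b0, a oracle.security.jps.service.credstore.CredentialAccessPermissionCollection),',
    'which is held by "Job Step 68405 Job Long-System Pool:JobWorker"',
    '"Job Step 68405 Job Long-System Pool:JobWorker":',
    'waiting to lock monitor 0x00007f90080b0638 (object 0x00000000dda4fec8, a java.util.logging.FileHandler),',
    'which is held by "Job Step 68447 Job Long-System Pool:JobWorker"',
]
edges = parse_jvm_deadlock(sample)
for waiter, lock, holder in edges:
    print(f"{waiter} -> {lock} (held by {holder})")
print("cycle:", is_cycle(edges))
```

For the excerpt above, the two edges point at each other: one job worker waits for the CredentialAccessPermissionCollection monitor while the other waits for the FileHandler monitor.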
<EM_INSTANCE_HOME>/user_projects/domains/GCDomain/servers/EMGC_OMS1/logs/EMGC_OMS1.out reports the following deadlock, which is the main error for this issue:
<Mar 4, 2022 3:29:56,891 AM CET> <Critical> <WebLogicServer> <BEA-000394> <Deadlock detected:
[deadlocked thread] #EM_SYSTEM_POOL#:oracle.sysman.core.common.ngtrace.fwk.BackgroundTraceAggregator:
----------------------------------------------------------------------------------------------------
Thread '#EM_SYSTEM_POOL#:oracle.sysman.core.common.ngtrace.fwk.BackgroundTraceAggregator' is waiting to acquire lock 'java.util.logging.FileHandler@687bc07b' that is held by thread 'GCLoader[data] - https://<HOSTNAME>:<AGENT_PORT>/emd/main/'
Stack trace:
------------
java.util.logging.FileHandler.publish(FileHandler.java:698)
oracle.core.ojdl.logging.ODLLogger.doLog(ODLLogger.java:914)
oracle.core.ojdl.logging.ODLLogger.log(ODLLogger.java:694)
oracle.sysman.util.logging.JavaSdkLogOperations.logp(JavaSdkLogOperations.java:244)
oracle.sysman.util.logging.DualModeLogOperations.logp(DualModeLogOperations.java:271)
oracle.sysman.emSDK.util.logging.Logger.logp(Logger.java:1033)
oracle.sysman.emSDK.util.logging.Logger.log(Logger.java:829)
oracle.sysman.emSDK.util.logging.Logger.fine(Logger.java:1503)
oracle.sysman.core.common.workmanager.Work.handleScheduledWork(Work.java:426)
oracle.sysman.core.common.workmanager.Work.call(Work.java:284)
oracle.sysman.core.common.workmanager.Work.call(Work.java:50)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
[deadlocked thread] GCLoader[data] - https://<HOSTNAME>:<AGENT_PORT>/emd/main/:
------------------------------------------------------------------------
Thread 'GCLoader[data] - https://<HOSTNAME>:<AGENT_PORT>/emd/main/' is waiting to acquire lock 'oracle.security.jps.service.credstore.CredentialAccessPermissionCollection@6b44d6b7' that is held by thread 'GCLoader[data] - https://<HOSTNAME>:<AGENT_PORT>/emd/main/'
Stack trace:
------------
oracle.security.jps.service.credstore.CredentialAccessPermissionCollection.elements(CredentialAccessPermission.java:472)
oracle.security.jps.az.common.info.JpsPermissionsEnumerator.getNextEnumWithMore(JpsPermissions.java:603)
oracle.security.jps.az.common.info.JpsPermissionsEnumerator.hasMoreElements(JpsPermissions.java:580)
java.security.PermissionCollection.toString(PermissionCollection.java:183)
java.text.MessageFormat.subformat(MessageFormat.java:1280)
java.text.MessageFormat.format(MessageFormat.java:865)
java.text.Format.format(Format.java:157)
java.text.MessageFormat.format(MessageFormat.java:841)
java.util.logging.Formatter.formatMessage(Formatter.java:138)
com.oracle.cie.common.util.logging.DefaultFormatter.format(DefaultFormatter.java:78)
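The BEA-000394 report in EMGC_OMS1.out uses a different wording from the JVM deadlock section, so it needs its own pattern. The sketch below extracts the waiting thread, the lock class, and the holding thread from each "is waiting to acquire lock" line. The function name `parse_weblogic_deadlock` is hypothetical, and the pattern is derived only from the excerpt above.

```python
import re

# Pattern derived from the BEA-000394 excerpt above (an assumption, not a
# documented WebLogic log format).
EDGE_RE = re.compile(
    r"Thread '(.+?)' is waiting to acquire lock '(.+?)' "
    r"that is held by thread '(.+?)'\s*$"
)

def parse_weblogic_deadlock(lines):
    """Return one dict per wait-for line: waiter, lock_class, holder."""
    edges = []
    for line in lines:
        m = EDGE_RE.search(line.strip())
        if m:
            waiter, lock, holder = m.groups()
            edges.append({
                "waiter": waiter,
                # Drop the '@<hash>' suffix so locks can be grouped by class.
                "lock_class": lock.split("@")[0],
                "holder": holder,
            })
    return edges

# Sample line adapted from the excerpt above (host and port are redacted
# placeholders in the original).
sample = [
    "Thread '#EM_SYSTEM_POOL#:oracle.sysman.core.common.ngtrace.fwk.BackgroundTraceAggregator' "
    "is waiting to acquire lock 'java.util.logging.FileHandler@687bc07b' "
    "that is held by thread 'GCLoader[data] - https://<HOSTNAME>:<AGENT_PORT>/emd/main/'",
]
for edge in parse_weblogic_deadlock(sample):
    print(edge["waiter"], "->", edge["lock_class"], "held by", edge["holder"])
```

Because the agent URLs in the excerpt are redacted to the same placeholder, a GCLoader[data] thread can appear to wait on a lock held by a thread with an identical name; in an unredacted log the two names would be expected to differ.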
Cause