Deadlock In Coherence Extend Client In Channel.closeInternal() (Doc ID 1425463.1)

Last updated on APRIL 28, 2020

Applies to:

Oracle Coherence - Version 3.6.1 to 3.7.1.2 [Release AS10g]
Information in this document applies to any platform.

Symptoms

A Coherence*Extend client hangs sporadically. When you take a Java thread dump of the hung client (refer to <Note:1268359.1> for details on how to collect one), the output reports a deadlock:

Found one Java-level deadlock:
=============================
"ExtendTcpClient-POP1:TcpInitiator":
  waiting to lock monitor 0x0xxxxxxxx (object 0x0xxxxxxx, a com.tangosol.coherence.component.net.extend.message.Request$Status),
  which is held by "DataSyncPoller-ADB031-device-cache)"
"xxx-xxx-device-cache)":
  waiting to lock monitor 0x0xxxxxxx (object 0x0xxxxxxxx, a com.tangosol.util.SparseArray),
  which is held by "ExtendTcpClient-POP1:TcpInitiator"

Java stack information for the threads listed above:
===================================================
"ExtendTcpClient-POP1:TcpInitiator":
    at com.tangosol.coherence.component.net.extend.message.Request$Status.cancel(Request.CDB:10)
    - waiting to lock <0x0xxxxxxxxxxx> (a com.tangosol.coherence.component.net.extend.message.Request$Status)
    at com.tangosol.coherence.component.net.extend.Channel.closeInternal(Channel.CDB:25)
    - locked <0x0xxxxxxxxxxxxx> (a com.tangosol.util.SparseArray)
    at com.tangosol.coherence.component.net.extend.Connection.closeInternal(Connection.CDB:24)
    - locked <0x0xxxxxxxxxxxxx> (a com.tangosol.util.SparseArray)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.peer.initiator.TcpInitiator$TcpConnection.closeInternal(TcpInitiator.CDB:3)
    at com.tangosol.coherence.component.net.extend.Connection.close(Connection.CDB:8)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Peer.checkPingTimeout(Peer.CDB:12)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.peer.Initiator.checkPingTimeouts(Initiator.CDB:6)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Peer.onNotify(Peer.CDB:108)
    at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
    at java.lang.Thread.run(Thread.java:662)
"xxx-xxx-device-cache)":
    at com.tangosol.coherence.component.net.extend.Channel.unregisterRequest(Channel.CDB:6)
    - waiting to lock <0x0xxxxxxxxxxxx> (a com.tangosol.util.SparseArray)
    at com.tangosol.coherence.component.net.extend.Channel.onRequestCompleted(Channel.CDB:1)
    at com.tangosol.coherence.component.net.extend.message.Request$Status.cancel(Request.CDB:27)
    at com.tangosol.coherence.component.net.extend.message.Request$Status.waitForResponse(Request.CDB:58)
    - locked <0x0xxxxxxxxxxxx> (a com.tangosol.coherence.component.net.extend.message.Request$Status)
    at com.tangosol.coherence.component.net.extend.Channel.request(Channel.CDB:20)
    at com.tangosol.coherence.component.net.extend.Channel.request(Channel.CDB:1)
    at com.tangosol.coherence.component.net.extend.RemoteNamedCache$BinaryCache.aggregate(RemoteNamedCache.CDB:12)
    at com.tangosol.util.ConverterCollections$ConverterInvocableMap.aggregate(ConverterCollections.java:2147)
    at com.tangosol.util.ConverterCollections$ConverterNamedCache.aggregate(ConverterCollections.java:2654)
    at com.tangosol.coherence.component.net.extend.RemoteNamedCache.aggregate(RemoteNamedCache.CDB:1)
    at com.tangosol.coherence.component.util.SafeNamedCache.aggregate(SafeNamedCache.CDB:1)
    at com.somecompany.gud.storage.coherence.remote.RemoteStorageSegmentReader.getUpdatesCount(RemoteStorageSegmentReader.java:103)
    - locked <0x0xxxxxxxxxxxx> (a com.somecompany.gud.storage.coherence.remote.RemoteStorageSegmentReader)
    at com.somecompany.gud.datasync.rsync.RemoteCacheDataLoader.split(RemoteCacheDataLoader.java:141)
    at com.somecompany.gud.datasync.rsync.RemoteCacheDataLoader.update(RemoteCacheDataLoader.java:92)
    at com.somecompany.gud.datasync.rsync.RsyncUpdatePoller.performUpdate(RsyncUpdatePoller.java:48)
    - locked <0x0xxxxxxxxxxxx> (a com.somecompany.gud.datasync.rsync.RsyncUpdatePoller)
    at com.somecompany.gud.datasync.SyncWorker$1.run(SyncWorker.java:31)
    at java.lang.Thread.run(Thread.java:662)
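
The dump above shows a lock-order inversion: the TcpInitiator thread holds the SparseArray monitor and waits for the Request$Status monitor, while the application thread holds the Request$Status monitor and waits for the SparseArray monitor. The following is a minimal sketch of that pattern, not Coherence code. The class name, thread names, and the two plain Object monitors are hypothetical stand-ins. It also shows how the same condition can be detected from inside the JVM with the standard java.lang.management.ThreadMXBean API, which may help when collecting a thread dump by hand is impractical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class DeadlockSketch {

    /** Starts two daemon threads that take the same two monitors in opposite order. */
    static void startInvertedLockers() {
        // Hypothetical stand-ins for the two monitors named in the dump.
        final Object requestArray = new Object();   // plays the SparseArray
        final Object requestStatus = new Object();  // plays the Request$Status
        final CountDownLatch bothHoldFirst = new CountDownLatch(2);

        Thread closer = new Thread(
                () -> lockInOrder(requestArray, requestStatus, bothHoldFirst), "closer");
        Thread requester = new Thread(
                () -> lockInOrder(requestStatus, requestArray, bothHoldFirst), "requester");
        // Daemon threads, so the deadlocked pair does not keep the JVM alive.
        closer.setDaemon(true);
        requester.setDaemon(true);
        closer.start();
        requester.start();
    }

    private static void lockInOrder(Object first, Object second, CountDownLatch latch) {
        synchronized (first) {
            latch.countDown();
            try {
                latch.await(); // wait until the other thread holds its first monitor
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                // Never reached: the other thread holds 'second' and waits for 'first'.
            }
        }
    }

    /**
     * Polls for monitor deadlocks until one is found or the timeout expires.
     * Returns the number of deadlocked threads, or 0 if none were found.
     */
    static int countDeadlockedThreads(long timeoutMillis) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            long[] ids = mx.findMonitorDeadlockedThreads(); // null when no deadlock exists
            if (ids != null) {
                for (ThreadInfo info : mx.getThreadInfo(ids)) {
                    if (info != null) {
                        System.out.println("\"" + info.getThreadName() + "\" waits for "
                                + info.getLockName() + " held by \""
                                + info.getLockOwnerName() + "\"");
                    }
                }
                return ids.length;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        startInvertedLockers();
        System.out.println("deadlocked threads: " + countDeadlockedThreads(5000));
    }
}
```

The sketch only reproduces and detects the pattern. It does not represent the fix for the Coherence issue described in this note.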

Cause
