
Storage Enabled Member Died While Cluster Rebalancing In a Coherence Cluster (Doc ID 2206956.1)

Last updated on DECEMBER 13, 2017

Applies to:

Oracle Coherence - Version 12.2.1.0.0 and later
Information in this document applies to any platform.

Symptoms

In an Oracle Coherence 12.2.1.x.x cluster, a storage-enabled member died while the cluster was rebalancing. When one or more members (nodes) leave the cluster because of a problem, or have to be stopped or killed, partition rebalancing (redistribution) is expected to take place among the remaining members without any adverse effect on the cluster. In this case, however, Coherence cache pruning occurred during the rebalancing and resulted in service termination on existing members of the cluster.

Here is an example of how the service thread gets stuck due to this issue:

"DistributedCache:DSLIdStoreCache" id=1757 State:RUNNABLE
at java.lang.ThreadLocal$ThreadLocalMap.getEntryAfterMiss(ThreadLocal.java:444)
at java.lang.ThreadLocal$ThreadLocalMap.getEntry(ThreadLocal.java:419)
at java.lang.ThreadLocal$ThreadLocalMap.access$000(ThreadLocal.java:298)
at java.lang.ThreadLocal.get(ThreadLocal.java:163)
at com.tangosol.net.cache.LocalCache.getKeyMask(LocalCache.java:525)
at com.tangosol.net.cache.LocalCache.removeEvicted(LocalCache.java:544)
- locked com.tangosol.net.cache.LocalCache@5c79192c
at com.tangosol.net.cache.OldCache.pruneIncremental(OldCache.java:1887)
at com.tangosol.net.cache.OldCache.prune(OldCache.java:1825)
- locked com.tangosol.net.cache.LocalCache@5c79192c
at com.tangosol.net.cache.OldCache.put(OldCache.java:265)
- locked com.tangosol.net.cache.LocalCache@5c79192c
at com.tangosol.net.cache.ReadWriteBackingMap.putToInternalMap(ReadWriteBackingMap.java:908)
at com.tangosol.net.cache.ReadWriteBackingMap.putInternal(ReadWriteBackingMap.java:1488)
at com.tangosol.net.cache.ReadWriteBackingMap.put(ReadWriteBackingMap.java:764)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$Storage.putPrimaryResource(PartitionedCache.CDB:70)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$Storage.putPrimaryResource(PartitionedCache.CDB:1)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$Storage.insertPrimaryData(PartitionedCache.CDB:62)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$Storage.moveResourcesToPrimary(PartitionedCache.CDB:46)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$Storage.movePartition(PartitionedCache.CDB:39)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache.movePartition(PartitionedCache.CDB:17)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.PartitionedService.assignPrimaryPartition(PartitionedService.CDB:39)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.PartitionedService.restoreOrphans(PartitionedService.CDB:79)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.PartitionedService.onOwnershipRequest(PartitionedService.CDB:78)
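To check whether a running member shows the same pattern, the thread stacks can be scanned for the OldCache.prune frame seen in the trace above. The following is a minimal diagnostic sketch, not an Oracle-provided tool. It uses only the standard java.lang.management API and assumes that it runs inside the storage-enabled member's JVM and that service thread names start with "DistributedCache", as in the example trace. The class name PruneStackCheck and its methods are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class PruneStackCheck {
    // Frame copied from the example trace; adjust if your trace differs.
    static final String PRUNE_CLASS = "com.tangosol.net.cache.OldCache";
    static final String PRUNE_METHOD = "prune";
    // Assumption: service threads are named like "DistributedCache:<service>".
    static final String THREAD_PREFIX = "DistributedCache";

    // Returns true if any frame in the stack is OldCache.prune.
    static boolean isPruning(StackTraceElement[] stack) {
        for (StackTraceElement frame : stack) {
            if (PRUNE_CLASS.equals(frame.getClassName())
                    && PRUNE_METHOD.equals(frame.getMethodName())) {
                return true;
            }
        }
        return false;
    }

    // Lists the names of matching threads that are currently inside prune().
    static List<String> findPruningThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> names = new ArrayList<>();
        // false, false: skip monitor and synchronizer details to keep the dump cheap.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null
                    && info.getThreadName().startsWith(THREAD_PREFIX)
                    && isPruning(info.getStackTrace())) {
                names.add(info.getThreadName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Threads currently in prune: " + findPruningThreads());
    }
}
```

A single hit only shows that a prune is in progress at that instant. A service thread that appears in prune across several samples taken a few seconds apart while partitions are being transferred is a stronger indication of the symptom described here.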

 

Cause

To view full details, sign in with your My Oracle Support account.




