Storage Nodes Restart Unexpectedly (Doc ID 1207141.1)

Last updated on FEBRUARY 24, 2017

Applies to:

Oracle Coherence - Version 3.5.2 and later
Information in this document applies to any platform.
***Checked for relevance on 19-MAR-2013***

Symptoms

Some storage nodes restart for no apparent reason. The Coherence logs show entries such as the following:

2010-02-22 12:38:46.060/176704.828 Oracle Coherence GE 3.5.3/465 <Warning> (thread=PacketPublisher, member=81): Timeout while delivering a packet; requesting the departure confirmation for Member(Id=11, Timestamp=2010-02-20 11:33:42.197, Address=192.168.1.244:8098, MachineId=1589, Location=machine:lonrs05877,process:7,member:lonrs05877-7, Role=STORAGE)
by MemberSet(Size=2, BitSetCount=6 Member(Id=124, Timestamp=2010-02-20 11:33:45.682, Address=192.168.1.232:8091, MachineId=61502, Location=machine:lonrs05919,process:17062,member:lonrs05919-17062, Role=TCP-EXTEND) Member(Id=129, Timestamp=2010-02-20 11:33:45.848, Address=192.168.1.232:8097, MachineId=61502, Location=machine:lonrs05919,process:5,member:lonrs05919-5, Role=STORAGE))

2010-02-22 12:38:46.132/176704.900 Oracle Coherence GE 3.5.3/465 <Error> (thread=Cluster, member=81): This node appears to have partially lost the connectivity: it receives responses from MemberSet(Size=2, BitSetCount=5, ids=[124, 129]) which communicate with Member(Id=11, Timestamp=2010-02-20 11:33:42.197, Address=192.168.1.244:8098, MachineId=1589, Location=machine:lonrs05877,process:7,member:lonrs05877-7, Role=STORAGE), but is not responding directly to this member; that could mean that either requests are not coming out or responses are not coming in; stopping cluster service.

2010-02-22 12:38:46.133/176704.901 Oracle Coherence GE 3.5.3/465 <D5> (thread=Cluster, member=81): Service Cluster left the cluster

2010-02-22 12:38:46.333/176705.101 Oracle Coherence GE 3.5.3/465 <D5> (thread=Invocation:Management, member=81): Service Management left the cluster

2010-02-22 12:38:46.334/176705.102 Oracle Coherence GE 3.5.3/465 <D5> (thread=Proxy:ExendTcpProxyService, member=81): Service ExendTcpProxyService left the cluster

2010-02-22 12:38:46.335/176705.103 Oracle Coherence GE 3.5.3/465 <Error> (thread=DistributedCache, member=81): validatePolls: This service timed-out due to unanswered handshake request. Manual intervention is required to stop the members that have not responded to this Poll
{
PollId=15483694, active
InitTimeMillis=1266842298052
Service=DistributedCache (2)
RespondedMemberSet=[]
LeftMemberSet=[]
RemainingMemberSet=[11]
}
Request=Message "GetRequest"
{
FromMember=Member(Id=81, Timestamp=2010-02-20 11:33:43.412, Address=192.168.1.227:8094, MachineId=50489, Location=machine:lonrs05914,process:17062,member:lonrs05914-17062, Role=TCP-EXTEND)
FromMessageId=15660736
Internal=false
MessagePartCount=1
PendingCount=1
MessageType=6
ToPollId=0
Poll=null
Packets
{
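
To check whether other nodes logged the same sequence, the messages above can be searched for mechanically. The following is a minimal sketch, not an Oracle-supplied tool: the header pattern is inferred only from the log lines shown in this note and may not match other Coherence versions or custom log formats, and the function name find_symptoms and the labels in SIGNATURES are hypothetical.

```python
import re

# Header layout inferred from the log lines in this note (an assumption);
# other Coherence versions or custom log formats may differ.
HEADER = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})/\S+ "
    r"Oracle Coherence \S+ \S+ <(?P<level>\w+)> "
    r"\(thread=(?P<thread>[^,]+), member=(?P<member>\d+)\): (?P<msg>.*)$"
)

# Message fragments copied from the excerpt above; the labels are hypothetical.
SIGNATURES = {
    "packet_timeout": "Timeout while delivering a packet",
    "partial_connectivity": "partially lost the connectivity",
    "poll_timeout": "validatePolls: This service timed-out",
    "service_left": "left the cluster",
}


def find_symptoms(lines):
    """Return (timestamp, member id, thread, label) for each matching log line."""
    hits = []
    for line in lines:
        m = HEADER.match(line.strip())
        if not m:
            continue  # continuation line or unrelated output
        for label, fragment in SIGNATURES.items():
            if fragment in m.group("msg"):
                hits.append(
                    (m.group("ts"), int(m.group("member")), m.group("thread"), label)
                )
                break  # report at most one label per log line
    return hits
```

Passing the lines of a log file to find_symptoms gives a chronological list of hits. A packet_timeout followed shortly by partial_connectivity and service_left for the same member matches the sequence shown in this note.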

Cause
