Problem With WS-RM In WebLogic Cluster With Load Balancing for proper CreateSequence And TerminateSequence Operation

(Doc ID 2298725.1)

Last updated on AUGUST 25, 2017

Applies to:

Oracle WebLogic Server - Version 10.3.6 to 10.3.6
Information in this document applies to any platform.

Symptoms

For environments that use WS-RM (Web Services Reliable Messaging), the CreateSequence and TerminateSequence requests arrive through a load balancer and, in a load-balanced environment, can land on any managed server.

For example, consider an OSB cluster with two managed servers, osb_server1 and osb_server2, and a load balancer in front that distributes messages in a simple round-robin pattern.

The web services deployed on the cluster use WS-ReliableMessaging, where the protocol usually results in the following message exchanges:

1. The client sends a CreateSequence to OSB and receives a CreateSequenceResponse back with a sequence ID.

2. The client sends the actual message under the same sequence ID and OSB invokes the Proxy Service and processes the request.

3. The client sends a TerminateSequence to OSB and OSB cleans up the resources used in this conversation.
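
To make the exchange above more concrete, here is a minimal sketch of how a TerminateSequence request carrying the sequence ID could be assembled. It is an illustration only: the WS-RM 1.1 namespace, the SOAP 1.1 envelope, the function name build_terminate_sequence, and the example endpoint URL are all assumptions, not values taken from this environment. Check the WS-RM version that the service's policy actually requires.

```python
import xml.etree.ElementTree as ET

# Assumed namespaces: SOAP 1.1, WS-Addressing 1.0, WS-RM 1.1.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
WSA_NS = "http://www.w3.org/2005/08/addressing"
WSRM_NS = "http://docs.oasis-open.org/ws-rx/wsrm/200702"

ET.register_namespace("soapenv", SOAP_NS)
ET.register_namespace("wsa", WSA_NS)
ET.register_namespace("wsrm", WSRM_NS)


def build_terminate_sequence(sequence_id, to_url):
    """Return a TerminateSequence SOAP envelope as a string (hypothetical helper)."""
    envelope = ET.Element("{%s}Envelope" % SOAP_NS)

    # WS-Addressing headers identify the operation and the target endpoint.
    header = ET.SubElement(envelope, "{%s}Header" % SOAP_NS)
    action = ET.SubElement(header, "{%s}Action" % WSA_NS)
    action.text = WSRM_NS + "/TerminateSequence"
    to = ET.SubElement(header, "{%s}To" % WSA_NS)
    to.text = to_url

    # The body names the sequence to terminate; this is the ID that was
    # returned in the CreateSequenceResponse.
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_NS)
    terminate = ET.SubElement(body, "{%s}TerminateSequence" % WSRM_NS)
    identifier = ET.SubElement(terminate, "{%s}Identifier" % WSRM_NS)
    identifier.text = sequence_id

    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    # Placeholder sequence ID and endpoint, for illustration only.
    print(build_terminate_sequence("example-sequence-id", "http://osb.example.com/proxy"))
```

The only state the request carries is the sequence identifier, so the receiving server must be able to look that identifier up in order to terminate the sequence.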

OSB correctly handles the scenario where the CreateSequence (1) is received on one server but the actual message (2) is received on another: it forwards the message to the correct server internally. However, if the TerminateSequence message (3) is received on a different server, that server returns the SOAP fault "wsrm:UnknownSequence" and the sequence is not terminated.

This issue occurs whenever the TerminateSequence is not received by the server that processed the CreateSequence. In a non-load-balanced environment it works fine because the CreateSequence and TerminateSequence calls go to the same server.

STEPS
-----------------------
The issue can be easily reproduced by sending messages manually, e.g. using soapUI:
1. CreateSequence to osb_server1 - responds with the sequence ID
2. Message with sequence ID to osb_server2 - is internally routed to osb_server1 and processed
3. TerminateSequence with ID to osb_server2 - results in a SOAP fault
4. TerminateSequence with ID to osb_server1 - is accepted
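
The four steps above can be modelled with a small simulation. This is a toy sketch of the observed behaviour, not of WebLogic or OSB internals: the ManagedServer class, its methods, and the generated sequence IDs are hypothetical. It assumes only what the symptoms describe, namely that application messages are forwarded to the server that owns the sequence while TerminateSequence is checked against local state only.

```python
class ManagedServer:
    """Toy model of one managed server in the cluster (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.sequences = set()   # sequences created on this server
        self.peers = []          # other servers in the cluster
        self._counter = 0

    def create_sequence(self):
        # The sequence state lives only on the server that created it.
        self._counter += 1
        seq_id = "%s-seq-%d" % (self.name, self._counter)
        self.sequences.add(seq_id)
        return seq_id

    def handle_message(self, seq_id):
        # Application messages are forwarded to the owning server.
        for server in [self] + self.peers:
            if seq_id in server.sequences:
                return "processed on " + server.name
        return "wsrm:UnknownSequence"

    def terminate_sequence(self, seq_id):
        # Termination is checked locally only, which matches the symptom.
        if seq_id not in self.sequences:
            return "wsrm:UnknownSequence"
        self.sequences.discard(seq_id)
        return "terminated on " + self.name


def reproduce():
    """Run the four reproduction steps and return each result in order."""
    server1 = ManagedServer("osb_server1")
    server2 = ManagedServer("osb_server2")
    server1.peers = [server2]
    server2.peers = [server1]

    seq_id = server1.create_sequence()               # step 1
    return [
        seq_id,
        server2.handle_message(seq_id),              # step 2
        server2.terminate_sequence(seq_id),          # step 3
        server1.terminate_sequence(seq_id),          # step 4
    ]


if __name__ == "__main__":
    for result in reproduce():
        print(result)
```

In this model, step 3 fails only because terminate_sequence does not consult the peers the way handle_message does, which is the asymmetry the steps above expose.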



As per http://docs.oracle.com/cd/E11035_01/wls100/webserv_adv/rm.html
we have to:
"If you are using the Web Service reliable messaging feature in a cluster, you must:
- Still create a local JMS queue, rather than a distributed queue, when creating the JMS queue in step 4 in LongRunningReliability.xml WS-Policy File.
- Explicitly target this JMS queue to each server in the cluster.
- Explicitly target the SAF agent to the cluster."


However, this step has not been verified.

The reason is that all the JMS servers and SAF agents are created by default when OSB is installed. No new JMS configuration was done for WS-ReliableMessaging.

Cause
