RP/TUX 6.4, 7.1 - 'tmboot -B xxx -l xxx' fails to restore all connections to a partitioned node - CR020965 (Doc ID 766213.1)

Last updated on NOVEMBER 04, 2016

Applies to:

Oracle Tuxedo / Tuxedo / 6.4, 7.1
Information in this document applies to any platform

Goal

DESCRIPTION:

On the Dynix Sequence 4.4.2 platform with Tuxedo v6.4 at patch#170 or above (note: the problem does not occur on
unpatched Tuxedo v6.4):
In MP mode, after an application node is partitioned, 'tmboot -B xxx -l xxx' fails to restore all connections to the
partitioned node. As a result, the services restored on the previously partitioned node are not available to other
application nodes after 'tmboot -B xxx -l xxx' is run on the master node. A tpcall to a service restored on that node
returns TPEOS.

A test case is available on the lcseq1, lcseq2, and lcseq3 machines in the directory /nfs/home2/qchan/lcseqX/153527.
The steps to reproduce the problem are as follows:
lcseq1 (simple) machine: the master.
lcseq2 (simple1) machine: an application node with a WSL. It is not the backup master (if it is the backup master,
the problem does not occur).
lcseq3 (simple2) machine: the application node hosting the TOUPPERx service.
After booting the Tuxedo system in /nfs/home2/qchan/lcseq1/153527:
1. Remove the IPC resources (using ipcrm or ipclean) on simple2.
2. Run the pcl (pclean) command of tmadmin on simple (pcl simple2).
3. Reboot the Tuxedo application of simple2 from simple (tmboot -B simple2 -l simple2).
4. Have a client on simple1 (i.e., running 'simprclient TOUPPERx abcdef') connect and request the service on simple2.
   The client receives a TPEOS error from the Tuxedo system.
   (Use step 5 below, or the run_status script on simple1, to continuously monitor the connections on the
   simple/simple1/simple2 machines.)
5. Run the pnw (print network) command of tmadmin on simple1.
   The simple2 machine may not be listed.
   If you then run the rec (reconnect) command of tmadmin on simple1, simple2 appears.
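
Steps 2 and 3, which both run on the master (simple), could be scripted roughly as follows. This is only a sketch: the run and repro_steps helpers and the DRY_RUN switch are hypothetical names introduced for illustration, and it assumes tmadmin accepts commands on standard input and that TUXCONFIG is already set in the environment. By default it only prints the commands; step 1 (on simple2) and steps 4 and 5 (on simple1) still have to be performed on their own machines.

```shell
#!/bin/sh
# Sketch of reproduction steps 2 and 3, to be run on the master (simple).
# DRY_RUN=1 (the default here) prints the commands without executing them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print a command line, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        sh -c "$*"
    fi
}

# Hypothetical helper: clean up and reboot the partitioned node given as $1.
repro_steps() {
    node=$1
    # Step 2: pclean the partitioned node (assumes tmadmin reads stdin).
    run "echo 'pcl $node' | tmadmin"
    # Step 3: boot the BBL and the servers on that node only.
    run "tmboot -B $node -l $node"
}

repro_steps simple2
```

Setting DRY_RUN=0 makes the same script execute the commands instead of printing them.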

If you uninstall Tuxedo v6.4 patch#170 (or above), i.e., use plain Tuxedo v6.4, step 4 above works and all the
connections to the partitioned node (simple2) are restored properly by 'tmboot -B xxx -l xxx'.
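
To check from simple1 whether the connection to simple2 has been restored, the pnw output can be filtered for the machine name. The following is a minimal sketch: lmid_listed is a hypothetical helper, and it assumes only that a connected machine's name appears as a whole word somewhere in the pnw output (the exact output layout is not relied upon).

```shell
#!/bin/sh
# Hypothetical helper: succeed if the machine name given as $1 appears as a
# whole word in the text read from standard input (e.g. pnw output).
lmid_listed() {
    grep -qw -- "$1"
}

# Possible usage on simple1 (assumes TUXCONFIG is set and tmadmin reads stdin):
#   echo pnw | tmadmin 2>/dev/null | lmid_listed simple2 \
#       || echo "simple2 is not listed; connection not restored"
```

Matching on whole words keeps the name simple from being mistaken for simple1 or simple2.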

Solution
