Tuxedo 9.0: With Oracle RAC: TMS Can Loop Printing LIBTUX_CAT:1380 Under Certain Conditions

(Doc ID 777021.1)

Last updated on FEBRUARY 08, 2017

Applies to:

Oracle Tuxedo - Version: 9.0 to 9.0
Information in this document applies to any platform.

Goal

With an Oracle vendor fix applied, the TMS can log the LIBTUX_CAT:1380 message in a loop under certain conditions, as described in the case notes below:

--- 10-25-05 19:34 ----------------

Generated the dynamic library with the following steps:

1) Replace xao.o in $ORACLE_HOME/lib32/libclntsh.a (ar -r libclntsh.a xao.o)
2) Make sure nobody is using the 10g client
3) cd $ORACLE_HOME/lib32; mv libclntsh.so.10 libclntsh.so.10.bak
4) Run genclntsh -32 to generate the new libclntsh.so.10
5) Rebuild the application


a) After running the test cases, no core file was found. However, error -3114 is returned after shutting down ORCLM1 and re-executing xa_recover:

[oracle@boxname tmp]$ oerr ora 3114
03114, 00000, "not connected to ORACLE"
// *Cause:
// *Action:

b) On the non-master machine, the following error is logged and the ULOG fills up with LIBTUX_CAT:1380 messages:
085334.boxname!TMS_ORA.13141.1.0: TRACE:xa: } xa_recover = 2
085404.boxname!TMS_ORA.13137.1.0: TRACE:xa: { xa_recover(0xffbfb084, 100, 0, 0x1000000)
085404.boxname!TMS_ORA.13137.1.0: TRACE:xa: } xa_recover = 2
085404.boxname!TMS_ORA.13141.1.0: TRACE:xa: { xa_recover(0xffbfb084, 100, 0, 0x1000000)
085404.boxname!TMS_ORA.13141.1.0: TRACE:xa: } xa_recover = 2
085404.boxname!TMS_ORA.13141.1.0: gtrid x0 x4347675e x6d: LIBTUX_CAT:1380: ERROR: Still in transaction
085404.boxname!TMS_ORA.13141.1.0: gtrid x0 x4347675e x6d: LIBTUX_CAT:1380: ERROR: Still in transaction
085404.boxname!TMS_ORA.13141.1.0: gtrid x0 x4347675e x6d: LIBTUX_CAT:1380: ERROR: Still in transaction
085404.boxname!TMS_ORA.13141.1.0: gtrid x0 x4347675e x6d: LIBTUX_CAT:1380: ERROR: Still in transaction
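
To confirm that a ULOG shows this loop rather than a few isolated errors, the repeated messages can be counted per TMS process and gtrid. The following Python sketch is only an illustration: the regular expression is inferred from the ULOG lines quoted above, and the function names and the threshold of 3 are assumptions, not part of Tuxedo.

```python
import re
from collections import Counter

# Pattern inferred from the ULOG lines quoted above, for example:
# 085404.boxname!TMS_ORA.13141.1.0: gtrid x0 x4347675e x6d: LIBTUX_CAT:1380: ERROR: ...
ULOG_1380 = re.compile(
    r"^(?P<time>\d{6})\.(?P<host>[^!]+)!(?P<proc>[^.]+)\.(?P<pid>\d+)\.\S+: "
    r"gtrid (?P<gtrid>x\S+ x\S+ x\S+): LIBTUX_CAT:1380:"
)


def count_1380(lines):
    """Count LIBTUX_CAT:1380 messages per (process name, pid, gtrid)."""
    counts = Counter()
    for line in lines:
        m = ULOG_1380.match(line.strip())
        if m:
            counts[(m["proc"], m["pid"], m["gtrid"])] += 1
    return counts


def looping(lines, threshold=3):
    """Return the keys whose message count reaches the (assumed) threshold."""
    return {k: n for k, n in count_1380(lines).items() if n >= threshold}
```

Passing the lines of a ULOG file to `looping()` returns a dictionary of the TMS processes and gtrids that repeat the message at least `threshold` times; an empty dictionary suggests the loop is not present in that file.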


--- 10-26-05 18:11 ----------------

When Oracle fails over, xa_end() in the application server returns XAER_PROTO rather than XAER_RMFAIL, which means that no code is executed to try to reopen the database.

Oracle error codes of ORA-1002 ("fetch out of sequence") and ORA-3114 ("not connected to Oracle") are then returned on each subsequent service request.

There was a "database disconnection" thread in the tuxedo.general newsgroup a few days ago on this topic and Wayne Chen suggested calling tpopen() again in the application server in this situation.
You may want to try this solution out and see whether or not it helps.
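
The control flow of that suggestion can be sketched as follows. This is a Python model of the decision only, not Tuxedo code: a real server would make these calls in C through the ATMI library. The function name `reopen_if_disconnected` is hypothetical, the `tpclose` and `tpopen` parameters are stand-in callables assumed to return -1 on failure like their ATMI counterparts, and closing before reopening is an assumption that the newsgroup suggestion does not spell out.

```python
# ORA codes named in this note as symptoms of the lost connection.
DISCONNECT_ORA_CODES = {1002, 3114}


def reopen_if_disconnected(ora_code, tpclose, tpopen):
    """Close and reopen the resource manager if ora_code signals a lost connection.

    tpclose and tpopen are hypothetical callables standing in for the ATMI
    functions of the same name; each is assumed to return -1 on failure.
    Returns True only if a reopen was attempted and succeeded.
    """
    if ora_code not in DISCONNECT_ORA_CODES:
        return False
    tpclose()  # result ignored: the connection is assumed to be unusable already
    return tpopen() != -1
```

A service routine would call this after a database operation fails, passing the Oracle error code it received, and would retry or fail the request depending on the result.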

The "LIBTUX_CAT:1380: ERROR: Still in transaction" message loop that you experienced is a bug in Tuxedo.
It occurs only when all of the following are true: TMS_rac_recover has been run, XARETRYINTERVAL is set, and transactions returned by xa_recover() are not resolved within XARETRYDURATIONSECONDS.
The problem is more likely to appear with a short XARETRYDURATIONSECONDS value and a long BB scan interval.


Solution
