Tuxedo JCA Adapter Failover Problem. Service Requests Timeout After Crash Of One Of The Tuxedo Remote Machines (Doc ID 1630125.1)

Last updated on DECEMBER 05, 2016

Applies to:

Tuxedo JCA Adapter - Version 11.1.1.2.1 and later
Information in this document applies to any platform.

Symptoms

Service requests sent by the Tuxedo JCA Adapter (running on WebSphere Application Server) to the backup node of a Tuxedo MP configuration (via the backup node's GWTDOMAIN) fail with timeout exceptions after the master node crashes. The configuration is Active/Active.

Master migration has not yet been started. The Tuxedo application service on the backup node executes successfully and returns its response one minute before the timeout occurs on the Tuxedo JCA client side.

TCP dump files show that the response from Tuxedo arrives at the WebSphere-side port in time but is not processed by the Tuxedo JCA Adapter.

Tuxedo backup node ULOG with TMTRACE set:

102250.263.backupMPnode!myserver.6110.2943708928.0: TRACE:ia:    } tpcommit = 1
102250.264.backupMPnode!myserver.6110.2943708928.0: TRACE:at:  } tpreturn [long jump]
102250.264.backupMPnode!myserver.6110.2943708928.0: TRACE:at:} tpservice


WebSphere Application Server (WAS) log with Tuxedo JCA debug tracing set (e.g. A02-00-trace.log):

[10/18/13 10:23:48:008 CEST] 00000093 SystemOut     O   10:23:48,007 ERROR
[InvocationContentHandler]
[errorException][exception=[exception=com.abcde.xas.TransportException:
ResourceException occured:TPCALL failed
TPException:TPETIME(13):0:0:TPED_MINVAL(0):QMNONE(0):0
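
When scanning large WAS trace logs for this symptom, it can help to extract the leading error name and code from each TPException line. The following is a minimal sketch only: the class name TpExceptionScan and the method firstTpError are hypothetical helpers, not part of the Tuxedo JCA Adapter, and the pattern assumes the "TPException:NAME(code):..." layout seen in the excerpt above. It deliberately does not interpret the remaining colon-separated fields.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical log-scanning helper; not part of the Tuxedo JCA Adapter.
public class TpExceptionScan {
    // Matches the leading "NAME(code)" pair after "TPException:", e.g. "TPETIME(13)".
    private static final Pattern TP_ERROR =
            Pattern.compile("TPException:([A-Z_]+)\\((\\d+)\\)");

    /** Returns "NAME=code" for the first TPException found in the line, if any. */
    public static Optional<String> firstTpError(String logLine) {
        Matcher m = TP_ERROR.matcher(logLine);
        return m.find()
                ? Optional.of(m.group(1) + "=" + m.group(2))
                : Optional.empty();
    }

    public static void main(String[] args) {
        String line = "TPException:TPETIME(13):0:0:TPED_MINVAL(0):QMNONE(0):0";
        // Prints "TPETIME=13" for the line taken from the excerpt above.
        System.out.println(firstTpError(line).orElse("no TPException"));
    }
}
```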


A thread may hang:

[1/31/14 14:00:33:292 timezone] 00000048 TuxedoLogger  1 TJARouteService getTDomainSession > (rt = Route(Tuxedo_ABC_DEF#0#1/*: __sess_Tuxedo_ABC_DEF#0#1-0_1), import = com.oracle.tuxedo.adapter.config.DMImport@35963596, xid = null)
[1/31/14 14:01:00:881 timezone] 00000031 ThreadMonitor W   WSVR0605W: Thread "WebContainer : 3" (00000046) has been active for 715414 milliseconds and may be hung.
There is/are 1 thread(s) in total in the server that may be hung. at com.oracle.tuxedo.adapter.config.DMSession.getTsession(DMSession.java:865)

  

Or a port may remain in the CLOSE_WAIT state while a thread is deadlocked:

[12/20/13 14:43:25:732 timezon] 00000003 ThreadMonitor W   WSVR0605W: Thread "ORB.thread.pool : 0" (0000006b) has been active for 662408 milliseconds and may be hung.
  There is/are 1 thread(s) in total in the server that may be hung.
     at sun.misc.Unsafe.park(Native Method)
     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:171)
     at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:842)
     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1178)
     at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:900)
     at com.oracle.tuxedo.adapter.TimerEventManager.unregisterKAEventHandler(TimerEventManager.java:130)
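
To check whether the JVM itself reports a lock cycle behind a hung thread like the one above, the standard java.lang.management API can be queried from inside the same JVM. This is a minimal diagnostic sketch: the class name DeadlockProbe is hypothetical, and it assumes the code can be run inside the application server process (for example from a diagnostic servlet). A thread parked on a write lock that is merely held by readers may not be reported as a deadlock, so an empty report does not rule out the hang.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical diagnostic helper; not part of the Tuxedo JCA Adapter.
public class DeadlockProbe {
    /** Returns a report of deadlocked threads, or an empty string if none are detected. */
    public static String report() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        boolean sync = mx.isSynchronizerUsageSupported();
        // findDeadlockedThreads also covers java.util.concurrent locks;
        // fall back to monitor-only detection where that is unsupported.
        long[] ids = sync ? mx.findDeadlockedThreads()
                          : mx.findMonitorDeadlockedThreads();
        if (ids == null) {
            return ""; // no deadlock cycle detected
        }
        StringBuilder sb = new StringBuilder();
        ThreadInfo[] infos =
                mx.getThreadInfo(ids, mx.isObjectMonitorUsageSupported(), sync);
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue; // thread ended between the two calls
            }
            sb.append(info.getThreadName())
              .append(" waiting on ").append(info.getLockName())
              .append(" held by ").append(info.getLockOwnerName())
              .append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("     at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String r = report();
        System.out.println(r.isEmpty() ? "No deadlocked threads detected" : r);
    }
}
```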

  


Cause
