Tuxedo 6.5: GWTDOMAIN Reporting LIBGW_CAT:5205 "maximum action table size reached" (Doc ID 767350.1)

Last updated on APRIL 12, 2012

Applies to:

Oracle Tuxedo - Version 6.5 and later
Information in this document applies to any platform.

Goal

The domain gateway process GWTDOMAIN has a bug that causes it to reach the maximum number of actions (GW_ACTMAX = 10000), as reported by message LIBGW_CAT:5205 and eventually by message LIBGW_CAT:5207. This condition causes the gateway to shut itself down. The problem has occurred at two customer sites, causing a production outage in each case.

The problem occurs when a server-side domain is shut down while transactions are in flight. While the gateway is down, the BBL attempts to send a TMS_TIMEOUT message to the gateway for each outstanding transaction once per scan unit (SCANUNIT). This is evident from the LIBTUX_CAT:743 messages that the BBL issues when it cannot find the TMS service advertised for the domain gateway group.
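To get a feel for how quickly this message volume can approach the GW_ACTMAX limit, the following back-of-the-envelope sketch may help. It assumes, as a simplification, that each TMS_TIMEOUT message ends up as one action that is never released and that the BBL sends one message per outstanding transaction per scan. The function names and the parameter scanunit_seconds are hypothetical and are not part of Tuxedo.

```python
import math

GW_ACTMAX = 10000  # action table limit quoted in this note


def scans_until_full(outstanding_txns, actions_already_used=0):
    """Number of BBL scans before the action table would fill up.

    Simplifying assumptions (not taken from Tuxedo source): one action
    per TMS_TIMEOUT message, and none of those actions is ever released.
    """
    if outstanding_txns <= 0:
        raise ValueError("outstanding_txns must be positive")
    free_slots = max(GW_ACTMAX - actions_already_used, 0)
    return math.ceil(free_slots / outstanding_txns)


def seconds_until_full(outstanding_txns, scanunit_seconds, actions_already_used=0):
    """Rough wall-clock estimate: one scan every scanunit_seconds."""
    return scans_until_full(outstanding_txns, actions_already_used) * scanunit_seconds
```

For example, with 50 outstanding transactions and an illustrative scan interval of 10 seconds, seconds_until_full(50, 10) gives 2000 seconds, or roughly half an hour. Real behavior depends on the configured scan interval and on how many actions the gateway is already holding.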

When the gateway comes back up, it attempts to process the TMS_TIMEOUT messages by creating a GWEV_TMS_TIMEOUT action for each one. The gw_tms_timeout() routine changes each GWEV_TMS_TIMEOUT action to a GWEV_NW_ROLLBACK action. The code in gw_nw_sndrollback() then adds a GWEV_NW_DONE event and calls gw_nw_tms_event() to send a GWNW_XATMI_ROLLBACK message for the transaction to the parent domain. In turn, gw_nw_tms_event() calls gw_nw_acall(), which calls gw_chg_action() to change the GWEV_NW_ROLLBACK action to a GWEV_PRE_NW_ACALL1 action.

Unfortunately, this call to gw_chg_action() processes the event stack, which makes the GWEV_NW_DONE event eligible for scheduling. The GWEV_NW_DONE event runs to completion and deletes the context stored in gwa_nwctx_idx. When the GWEV_PRE_NW_ACALL1 action finally runs, the context has already been deleted, and the code in gw_nw_snd() fails with message LIBGWT_CAT:1027. The send is never done, so fd_state is never changed to GWNW_FD_CONNECTED. As a result, the code in gw_nw_tms_event() chains all subsequent GWEV_NW_ROLLBACK actions. As the BBL continues to send TMS_TIMEOUT messages, the gateway eventually reaches its maximum number of outstanding actions.
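The ordering problem can be illustrated with a small toy model. This is a sketch only, not Tuxedo source code: the class ToyGateway, its attributes, and the drain_events_on_change flag are hypothetical names introduced for illustration. The model also assumes that the action whose send failed stays in the action table, and the contrast case with the flag set to False is there only to show that the ordering matters; it is not presented as the vendor's fix.

```python
GW_ACTMAX = 10000  # limit quoted in this note


class ToyGateway:
    """Toy model of the event-ordering problem; not Tuxedo source code."""

    def __init__(self, drain_events_on_change=True):
        # True reproduces the ordering described above. False is a
        # contrast case for illustration only, not the vendor's fix.
        self.drain_events_on_change = drain_events_on_change
        self.contexts = set()      # live contexts (stand-in for gwa_nwctx_idx)
        self.pending_events = []   # event stack of (event name, context index)
        self.actions = 0           # outstanding entries in the action table
        self.connected = False     # stand-in for fd_state == GWNW_FD_CONNECTED
        self.send_pending = False  # a rollback send has been started
        self.log = []
        self._next_ctx = 0

    def _run_events(self):
        while self.pending_events:
            event, ctx = self.pending_events.pop()
            if event == "GWEV_NW_DONE":
                self.contexts.discard(ctx)  # DONE deletes the context

    def _send(self, ctx):
        if ctx not in self.contexts:
            self.log.append("LIBGWT_CAT:1027")  # context already deleted
            return False
        self.connected = True
        return True

    def on_tms_timeout(self):
        if self.actions >= GW_ACTMAX:
            self.log.append("LIBGW_CAT:5205")  # action table is full
            return False
        self.actions += 1  # GWEV_TMS_TIMEOUT, changed to GWEV_NW_ROLLBACK
        if self.send_pending and not self.connected:
            return True  # chained behind the send that never completes
        self.send_pending = True
        ctx = self._next_ctx
        self._next_ctx += 1
        self.contexts.add(ctx)
        self.pending_events.append(("GWEV_NW_DONE", ctx))
        if self.drain_events_on_change:
            self._run_events()  # changing the action drains the event stack
        sent = self._send(ctx)  # GWEV_PRE_NW_ACALL1 runs
        self._run_events()
        if sent:
            self.actions -= 1  # rollback sent, action released
            self.send_pending = False
        return True
```

Calling on_tms_timeout() repeatedly on ToyGateway() logs LIBGWT_CAT:1027 on the first call, chains every later action, and logs LIBGW_CAT:5205 once GW_ACTMAX actions are outstanding. With drain_events_on_change=False the send happens before the context is deleted, and the action count returns to zero after each call.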

Solution
