TIMESTEN: FailThreshold, The Replication Agent and Replication State (Doc ID 1562104.1)

Last updated on MARCH 12, 2021

Applies to:

Oracle TimesTen In-Memory Database - Version 7.0.1.0.0 and later
Information in this document applies to any platform.

Symptoms

An inherent problem in TimesTen replication is that transactions can be executed and committed on a master node much more quickly than they can be replicated across a network and committed on subscriber nodes. In applications with very high levels of transaction processing, large backlogs of transactions can accumulate that are committed on the master but not yet replicated to subscriber nodes. This data is stored in transaction log files, which are retained until the data in them has been successfully replicated to all subscriber nodes.

In TimesTen replication, FAILTHRESHOLD exists to limit the degree of divergence between master and subscriber nodes, measured as the number of accumulated log files containing data waiting to be replicated from master to subscribers. If FAILTHRESHOLD is defined and the number of accumulated log files exceeds it, then the subscriber node(s) are put into a replication FAIL state and replication of transactions to that node or nodes stops. Transaction processing on the master node continues but is not replicated to nodes in FAIL state. Any node that reaches FAIL state must be destroyed and recreated (cloned from the master node via "ttRepAdmin -duplicate") before replication to that node can be resumed.
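
The threshold rule described above can be illustrated with a small Python sketch. This is a toy model only, not a TimesTen API: the `Subscriber` class, the `accumulate_log_file` function and the convention that a threshold of 0 means "no threshold defined" are assumptions made for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Subscriber:
    """Toy stand-in for a subscriber node (hypothetical, not a TimesTen API)."""
    name: str
    state: str = "STARTED"      # only STARTED and FAIL are modelled here
    pending_log_files: int = 0  # log files retained on the master for this node


def accumulate_log_file(sub: Subscriber, fail_threshold: int) -> str:
    """Record one more unreplicated log file and apply the threshold rule.

    Assumption of this sketch: fail_threshold == 0 means no threshold is defined.
    """
    if sub.state == "FAIL":
        return sub.state        # nothing further is retained for a failed node
    sub.pending_log_files += 1
    if fail_threshold > 0 and sub.pending_log_files > fail_threshold:
        # Per the text, the node must now be re-created (ttRepAdmin -duplicate).
        sub.state = "FAIL"
    return sub.state


sub = Subscriber("subscriberds")
states = [accumulate_log_file(sub, fail_threshold=3) for _ in range(5)]
print(states)  # ['STARTED', 'STARTED', 'STARTED', 'FAIL', 'FAIL']
```

With a threshold of 3, the subscriber stays in STARTED state while up to three log files are pending and moves to FAIL when the fourth accumulates.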

Replication from master to subscriber nodes can be stopped in several different ways, and the way in which it is stopped affects the subscriber's position with respect to FAILTHRESHOLD. The purpose of this Note is to identify the impact on FAILTHRESHOLD of the different ways of stopping replication. Additionally, this Note seeks to clarify the difference between starting and stopping the replication agent and changing a node's replication state.

Consider a master/subscriber configuration with FAILTHRESHOLD defined. The following rules then apply:

1) Replication Agent is stopped on the subscriber node: Transaction processing on the master continues, transactions are not replicated to the subscriber, and transaction logs accumulate. When the FAILTHRESHOLD is exceeded, the subscriber node will be put into FAIL state. If the subscriber agent is restarted before FAILTHRESHOLD has been exceeded, then the transaction backlog will be replicated to the subscriber and eventually the master and subscriber nodes should reach a point of complete synchronization.

 

2) Replication Agent is stopped on the master node: Transaction processing on the master node continues, transactions are not replicated to the subscriber, and transaction logs accumulate. When the FAILTHRESHOLD is exceeded, transaction processing continues and the subscriber remains available for client connections. However, immediately upon restart of the replication agent on the master node, the subscriber node(s) will be set to FAIL and replication to the subscriber will stop. If the replication agent is restarted before FAILTHRESHOLD has been exceeded, then the transaction backlog will be replicated to the subscriber and eventually the master and subscriber nodes should reach a point of complete synchronization.

 

3) Replication Agents started on both nodes; subscriber replication state PAUSED: Exactly as in (1) above, transaction processing on the master node continues, transactions are not replicated to the subscriber, and transaction logs accumulate. If the FAILTHRESHOLD is exceeded while the subscriber state is PAUSED, the subscriber node will be set to FAIL and replication to the subscriber will stop. If the subscriber state is reset to STARTED before FAILTHRESHOLD has been exceeded, then the transaction backlog will be replicated to the subscriber and eventually the master and subscriber nodes should reach a point of complete synchronization.

 

4) Replication Agents started on both nodes; subscriber replication state STOPPED: When subscriber replication is placed into a STOPPED state, FAILTHRESHOLD is suspended. Specifically, the replication bookmark is dropped and processing continues as if replication between the master and subscriber nodes had not been configured. Transaction logs do not accumulate waiting to be replicated to the subscriber node, and they are purged once the appropriate checkpointing has occurred. When the replication state is reset to STARTED, a new replication bookmark is established and replication of transactions resumes. However, backlogged transactions are not replicated to the subscriber. Consequently, replication continues between master and subscriber, but the transactions committed on the master node while the subscriber node was STOPPED will never be replicated to the subscriber node, which results in a permanent state of divergence between the master and subscriber nodes.
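
The four rules above can be compared side by side with a small Python simulation. This is a simplified sketch, not TimesTen code: the `ReplicationPair` class, its field and method names, and the `run` helper are hypothetical, and the model assumes that only a running master agent evaluates the threshold, which is how rule (2) reads.

```python
from dataclasses import dataclass


@dataclass
class ReplicationPair:
    """Toy model of one master/subscriber pair (hypothetical, not a TimesTen API)."""
    fail_threshold: int
    master_agent_up: bool = True
    subscriber_agent_up: bool = True
    state: str = "STARTED"  # subscriber state: STARTED, PAUSED, STOPPED or FAIL
    backlog: int = 0        # log files retained on the master for the subscriber
    skipped: int = 0        # log files of changes that will never be replicated

    def _apply_threshold(self) -> None:
        # Assumption: only a running master agent evaluates the threshold (rule 2).
        if (self.master_agent_up and self.state in ("STARTED", "PAUSED")
                and self.backlog > self.fail_threshold):
            self.state = "FAIL"

    def master_fills_log_file(self) -> None:
        if self.state in ("STOPPED", "FAIL"):
            self.skipped += 1   # no bookmark is held, so nothing is retained
        else:
            self.backlog += 1
            self._apply_threshold()

    def start_master_agent(self) -> None:
        self.master_agent_up = True
        self._apply_threshold()

    def set_state(self, new_state: str) -> None:
        if self.state == "FAIL":
            raise RuntimeError("a FAIL subscriber must be re-created first")
        if new_state == "STOPPED":
            # Assumption of this sketch: dropping the bookmark also gives up
            # anything that was still pending at that moment.
            self.skipped += self.backlog
            self.backlog = 0
        self.state = new_state

    def replicate(self) -> None:
        if (self.master_agent_up and self.subscriber_agent_up
                and self.state == "STARTED"):
            self.backlog = 0    # backlog shipped; the nodes are back in sync


def run(scenario: int, files: int = 5, threshold: int = 3) -> ReplicationPair:
    """Set up rule 1-4, then let the master fill `files` log files."""
    pair = ReplicationPair(fail_threshold=threshold)
    if scenario == 1:
        pair.subscriber_agent_up = False
    elif scenario == 2:
        pair.master_agent_up = False
    elif scenario == 3:
        pair.set_state("PAUSED")
    elif scenario == 4:
        pair.set_state("STOPPED")
    for _ in range(files):
        pair.master_fills_log_file()
    return pair


for n in (1, 2, 3, 4):
    p = run(n)
    print(n, p.state, p.backlog, p.skipped)
```

In this model, scenarios 1 and 3 end in FAIL once the fourth log file accumulates. Scenario 2 stays in STARTED with a backlog of five files until `start_master_agent()` is called, at which point it moves to FAIL. Scenario 4 never fails, but all five log files are counted as skipped and remain so after `set_state("STARTED")`.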

 

Changes

 

Cause
