TimesTen Replication: How To Recover From a FAIL THRESHOLD EXCEEDED State
(Doc ID 1462007.1)
Last updated on JANUARY 30, 2022
Applies to: Oracle TimesTen In-Memory Database - Version 6.0.0 to 18.104.22.168.2 [Release 6.0 to 11.2]
Information in this document applies to any platform.
A fundamental problem in TimesTen transaction processing is that the TimesTen DBMS can commit transactions much faster than it can replicate those transactions across a network. In high-OLTP environments where replication has been enabled, this can result in the following:
(1) Large numbers of transaction log files can accumulate, because a log file cannot be purged until the transactions in it have been replicated to the other node(s) in the replication scheme. If the accumulated log files fill the designated disk device, so that no more log data can be written, the data store can be invalidated.
(2) An application-level problem can arise: the data store at the subscriber node(s) can fall so far behind the primary node that queries against the subscriber node(s) return results that are unacceptable with respect to application requirements.
For this reason, TimesTen replication allows a failure threshold, "FAILTHRESHOLD", to be designated for each subscriber node in the replication scheme. The FAILTHRESHOLD defines the maximum permitted "commit gap" between the primary and subscriber nodes: it is the maximum number of unreplicated log files that can accumulate on the primary node before the subscriber node is considered too far behind the primary. When the number of log files exceeds the threshold, replication to that subscriber is halted and the subscriber is said to be in a FAILed state. The primary node does not stop transaction processing, and basic application operations are not blocked. Replication to other subscriber nodes that have not exceeded their FAILTHRESHOLD continues normally. Only the node(s) that have exceeded FAILTHRESHOLD stop receiving transactions from the primary node.
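The threshold rule described above can be illustrated with a small model. This is only a sketch of the decision logic, not TimesTen code: the `Subscriber` class, the `enforce_fail_threshold` function, the state strings, and the use of `None` to mean "no threshold configured" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Subscriber:
    """Hypothetical model of one subscriber as seen from the primary node."""
    name: str
    fail_threshold: Optional[int]      # max unreplicated log files; None = no limit (assumption of this model)
    unreplicated_log_files: int = 0    # log files held on the primary for this subscriber
    state: str = "start"               # illustrative state label, not read from TimesTen


def enforce_fail_threshold(subscribers: List[Subscriber]) -> List[str]:
    """Mark each subscriber whose backlog exceeds its threshold as failed.

    Returns the names of subscribers that were newly marked failed.
    Subscribers under their threshold are left untouched, mirroring the
    statement that replication to them continues normally.
    """
    newly_failed = []
    for sub in subscribers:
        if sub.state == "failed" or sub.fail_threshold is None:
            continue
        if sub.unreplicated_log_files > sub.fail_threshold:
            sub.state = "failed"
            newly_failed.append(sub.name)
    return newly_failed


# Two subscribers with the same threshold but different backlogs.
subs = [
    Subscriber("sub1", fail_threshold=10, unreplicated_log_files=12),
    Subscriber("sub2", fail_threshold=10, unreplicated_log_files=3),
]
failed = enforce_fail_threshold(subs)
print(failed)
print([(s.name, s.state) for s in subs])
```

In this model only `sub1` is marked failed. `sub2` keeps its state, and nothing in the function blocks or slows the primary, which matches the behavior described in the paragraph above.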
This Note describes how to recover replication nodes from FAILed state: it provides instructions for different types of replication schemes. Persons reading this Note are advised to read the following Notes first as these describe fail threshold dynamics and internals in detail:
833743.1 "What is the impact on the replication when FAILTHRESHOLD is reached?"
786982.1 "Understand TimesTen LSNs, Bookmarks and Replication"