OID 10g ODS.ODS_CHG_LOG Table Is Very Large Or OID Performance Is Slow And/Or Consuming High CPU, Affecting Applications Logins
(Doc ID 1076521.1)
Last updated on SEPTEMBER 01, 2023
Applies to:
Oracle Internet Directory - Version 9.0.4 to 10.1.4 [Release 10gR1 to 10gR3]Information in this document applies to any platform.
Symptoms
Two Oracle Internet Directory (OID) nodes running version 10g 10.1.4 (for example 10.1.4.0.1 or 10.1.4.2.0).
The ODS_CHG_LOG tables are very large.
Or OID performance is poor, very slow or OID consumes high cpu usage, causing application logins problems such as having to retry multiple times to actually get logged in, etc.
Tried purging it by following Document 841787.1, and selected the change log purge collector, and also tried running oidstats.sql, but that did not help.
If using replication, the database drives on the Master (MDS) server can get to 99% and the Replica (RDS) server to 100%, and then have had to shut down both servers. Also experiencing high volume of archive logs being generated.
Initially replication was not working, which might have been related as the change requests are being stored in the change log tables and are not getting purged because they have not been sent to the other server. However, even after resolving the ASR replication problem, and remtool -asrverify shows no errors/problems and replication works, the ODS_CHG_LOG on the Replica continues to grow very large and garbage collection is not clearing it.
Was once able to reduce the size of the datafile containing the ods_chg_log and to restart both servers and get replication working in both directions. Able to get the garbage collection to run manually by running it through an ldif command, but it only seems to be able to clean a certain number of records even though there are still over 600k of records marked with retry_cnt = -2. But manual purge stopped after deleting about 100,000 records and the table started growing again.
As a workaround, finally had some success by reducing the garbage collector target age to 8 hours and have kept the ods_chg_log to around 99K records where retry count is -2.
From the logs, it seems to delete 159K records when it runs every 4 hours, which indicates it may have always been working, but cannot keep up with the number of transactions in a 24 hour period. But this node is a replica in a QA environment, and there aren't that many applications or testing running against it. Also this kind of traffic is not seen even on Production servers, so should not be happening on QA.
Where is all this traffic coming from, and how to stop it?
Cause
To view full details, sign in with your My Oracle Support account. |
|
Don't have a My Oracle Support account? Click to get started! |
In this Document
Symptoms |
Cause |
Solution |
References |