After Removing from DRG and Reinstalling an OID 11g Replica Node (Due to failed ACL Change Replication, which Locked Out All Users with LDAP: error code 50 - Insufficient Access Rights), Replication Log Shows: No replication agreement to process (Doc ID 1910453.1)

Last updated on June 30, 2017

Applies to:

Oracle Internet Directory - Version 11.1.1 and later
Information in this document applies to any platform.

Symptoms

Four-node farm of Oracle Internet Directory (OID) 11g with Multimaster Replication:
Primary: OID1 (oid1.mycompany.com)
Secondary: OID2 (oid2.mycompany.com)
Secondary: OID3 (oid3.mycompany.com)
Secondary: OID4 (oid4.mycompany.com)

A change to the access control parameters on the primary node (OID1) replicated correctly to only two of the three other nodes (OID2 and OID4). On OID3 the change was applied only partially, which locked out all users, including cn=orcladmin.  The access control information on OID3 appears to have been corrupted.

On OID3, no operation can now be performed with the cn=orcladmin superuser account, whether logging in through Oracle Directory Services Manager (ODSM) or running LDAP utilities from the command line.  For example, a login attempt to ODSM returns the error:

Could not load schema from OID server. Closed the connection. Search Failed. Host='oid3.mycompany.com' Details: [LDAP: error code 50 - Insufficient Access Rights]

This error occurs for all users, but only on this one node.

All troubleshooting suggestions found in MOS notes so far require a login as cn=orcladmin, so they have not been helpful.

Because cn=orcladmin had evidently lost all privileges, including on the root of the OID server, the node was removed from the Directory Replication Group (DRG) and reinstalled from scratch.  However, attempts to re-add the node to the DRG now fail. The steps taken were:

1) Ran remtool -pverify on all four nodes and ensured that all checks were successful.
2) Ran remtool -paddnode to add the reinstalled node to the DRG (type 3, multimaster).
3) Set the orclreplicastate on oid3 to 0 using ldapmodify for replication bootstrap.
4) Started the replication service on oid3 with oidctl.

The oidrepld log shows the error:

[2014-07-21T15:54:59+02:00] [OID] [NOTIFICATION:16] [] [OIDREPLD] [host: oid3.mycompany.com] [pid: 27343] [tid: 0] Main:: Starting OIDREPLD against oid3.mycompany.com:389...
[2014-07-21T15:55:06+02:00] [OID] [ERROR:8] [23005] [OIDREPLD] [host: oid3.mycompany.com] [pid: 27343] [tid: 0] Main:: gslmrplMain: No replication agreement to process
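
When the oidrepld log is long, it can help to pull out only the error entries. The following is a minimal sketch in Python, assuming every entry follows the same bracketed layout as the two lines quoted above. The field names (timestamp, component, level, code, module, host, pid, tid) and the function names are labels chosen here for illustration, not names taken from Oracle documentation.

```python
import re

# Bracketed header layout inferred from the two sample lines in this note.
# The group names are illustrative labels, not official field names.
ODL_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]*)\] "
    r"\[(?P<component>[^\]]*)\] "
    r"\[(?P<level>[^\]]*)\] "
    r"\[(?P<code>[^\]]*)\] "
    r"\[(?P<module>[^\]]*)\] "
    r"\[host: (?P<host>[^\]]*)\] "
    r"\[pid: (?P<pid>\d+)\] "
    r"\[tid: (?P<tid>\d+)\] "
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = ODL_LINE.match(line.strip())
    return match.groupdict() if match else None


def error_entries(lines):
    """Return the parsed entries whose level field starts with ERROR."""
    entries = []
    for line in lines:
        entry = parse_log_line(line)
        if entry is not None and entry["level"].startswith("ERROR"):
            entries.append(entry)
    return entries


# The two lines quoted in this note.
SAMPLE = [
    "[2014-07-21T15:54:59+02:00] [OID] [NOTIFICATION:16] [] [OIDREPLD] "
    "[host: oid3.mycompany.com] [pid: 27343] [tid: 0] "
    "Main:: Starting OIDREPLD against oid3.mycompany.com:389...",
    "[2014-07-21T15:55:06+02:00] [OID] [ERROR:8] [23005] [OIDREPLD] "
    "[host: oid3.mycompany.com] [pid: 27343] [tid: 0] "
    "Main:: gslmrplMain: No replication agreement to process",
]

for entry in error_entries(SAMPLE):
    print(entry["timestamp"], entry["code"], entry["message"])
```

To scan a real log, the lines of the file can be passed to error_entries in place of SAMPLE. Lines that do not match the assumed layout are skipped silently.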

A review of the spooled remtool.log from the remtool -paddnode run (step 2 above) shows:

Error occurred while adding partial replica ldap://oid3.mycompany.com:636.
ldap://oid1.mycompany.com:636 : Failed to modify entry orclagreementid=000002,orclreplicaid=oid1_db1,cn=replication configuration.
Type or value exists. Attribute orcllastappliedchangenumber is single valued.

The three active nodes now all report failures when remtool -pverify is run. For example, on the master node OID1 (oid1.mycompany.com):

    Test: lastAppliedChangeNumber (oid1_db1 to oid3_db31)
      Check: Format (transport) failed
      Check: Format (apply) failed

The same test failures occur in OID2 (oid2.mycompany.com) and OID4 (oid4.mycompany.com).
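
Because the same report has to be compared across three nodes, a small script can summarize which checks failed under which test. This is only a sketch in Python, assuming the saved output uses the "Test:" and "Check: ... failed" layout shown in the excerpt above. The wording of successful checks is not shown in this note, so the sketch looks only for lines ending in "failed". The function name failed_checks is hypothetical.

```python
def failed_checks(report_text):
    """Map each 'Test:' heading to the list of its checks that end in 'failed'."""
    failures = {}
    current_test = None
    for raw_line in report_text.splitlines():
        line = raw_line.strip()
        if line.startswith("Test:"):
            current_test = line[len("Test:"):].strip()
        elif (
            current_test is not None
            and line.startswith("Check:")
            and line.endswith("failed")
        ):
            # Keep only the check name, without the prefix and the status word.
            check = line[len("Check:"):-len("failed")].strip()
            failures.setdefault(current_test, []).append(check)
    return failures


# The excerpt quoted in this note for OID1.
SAMPLE = """\
    Test: lastAppliedChangeNumber (oid1_db1 to oid3_db31)
      Check: Format (transport) failed
      Check: Format (apply) failed
"""

for test, checks in failed_checks(SAMPLE).items():
    print(test, "->", ", ".join(checks))
```

Running the same function on the saved output from OID2 and OID4 makes it easy to confirm that all three nodes fail the same checks.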

Finally, the OID3 node cannot be removed with remtool -pdelnode either:

[oracle@oid3 repl_delnode]  > $ORACLE_HOME/ldap/bin/remtool -pdelnode -v -bind oid3:636
Enter replication dn password                :
------------------------------------------------------------------------------
Directory Replication Group (DRG) details :

--- ------------------ ----------------------- ----------------------- -----
Sl  Replicaid          Directory Information   Supplier Information    Repl.
No.                                                                    Type
--- ------------------ ----------------------- ----------------------- -----
001 oid3_db3           oid3:636                --                      RW

002 oid4_db4           oid4:389                oid2_db2                RW
                                         oid1_db1

003 oid1_db1           oid1:636                oid4_db4                RW
                                         oid2_db2

004 oid2_db2           oid2:636                oid4_db4                RW
                                         oid1_db1

--- ------------------ ----------------------- ----------------------- -----
Enter replicaid of the replica to be deleted : oid3_db31

------------------------------------------------------------------------------
------------------------------------------------------------------------------
Error occurred while deleting partial replica ldap://oid3:636(oid3_db31).
ldap://oid3:636 : Replica oid3_db3 is master replica and cannot be deleted.
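
In the transcript above, the replica ID entered at the prompt (oid3_db31) does not appear in the DRG listing, which shows oid3_db3 for oid3:636. This note does not say whether that difference is related to the failure. The following Python sketch simply makes the comparison explicit, assuming the table layout shown above. The function name listed_replicas and the dictionary keys are illustrative choices.

```python
import re

# A data row: serial number, replica ID, host:port, first supplier (or --), type.
ROW = re.compile(r"^(\d{3})\s+(\S+)\s+(\S+:\d+)\s+(\S+)\s+(\S+)\s*$")
# A continuation row: indentation followed by one more supplier ID.
CONTINUATION = re.compile(r"^\s+(\S+)\s*$")


def listed_replicas(table_text):
    """Return {replica_id: {"directory", "suppliers", "type"}} from the DRG table."""
    replicas = {}
    last_id = None
    for line in table_text.splitlines():
        row = ROW.match(line)
        if row:
            _, replica_id, directory, supplier, repl_type = row.groups()
            replicas[replica_id] = {
                "directory": directory,
                "suppliers": [] if supplier == "--" else [supplier],
                "type": repl_type,
            }
            last_id = replica_id
            continue
        extra = CONTINUATION.match(line)
        if extra and last_id is not None:
            replicas[last_id]["suppliers"].append(extra.group(1))
    return replicas


# The DRG listing quoted in this note.
SAMPLE = """\
--- ------------------ ----------------------- ----------------------- -----
Sl  Replicaid          Directory Information   Supplier Information    Repl.
No.                                                                    Type
--- ------------------ ----------------------- ----------------------- -----
001 oid3_db3           oid3:636                --                      RW

002 oid4_db4           oid4:389                oid2_db2                RW
                                         oid1_db1

003 oid1_db1           oid1:636                oid4_db4                RW
                                         oid2_db2

004 oid2_db2           oid2:636                oid4_db4                RW
                                         oid1_db1

--- ------------------ ----------------------- ----------------------- -----
"""

entered_id = "oid3_db31"  # the value typed at the prompt in the transcript
replicas = listed_replicas(SAMPLE)
if entered_id not in replicas:
    print(entered_id, "is not in the listing; listed IDs:", sorted(replicas))
```

The parsed listing also shows that oid3_db3 has no suppliers, while the other three replicas each list the other two as suppliers.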

Changes

Applied ACL changes which failed to replicate to one node.

Cause
