
After Removing from DRG and Reinstalling an OID 11g Replica Node (Due to failed ACL Change Replication, which Locked Out All Users with LDAP: error code 50 - Insufficient Access Rights), Replication Log Shows: No replication agreement to process (Doc ID 1910453.1)

Last updated on MAY 05, 2021

Applies to:

Oracle Internet Directory - Version 11.1.1 and later
Information in this document applies to any platform.

Symptoms

Four-node farm of Oracle Internet Directory (OID) 11g with Multimaster Replication:
Primary:   OID1 (<OID1_HOSTNAME>)
Secondary: OID2 (<OID2_HOSTNAME>)
Secondary: OID3 (<OID3_HOSTNAME>)
Secondary: OID4 (<OID4_HOSTNAME>)

A change to the access control parameters on the primary node (OID1) replicated correctly to only two of the three other nodes (OID2 and OID4) and partially failed on OID3, locking out all users on that node, including cn=orcladmin. The access control information on OID3 appears to have been corrupted.

On OID3, no operations can now be performed with the cn=orcladmin superuser account, whether logging in through Oracle Directory Services Manager (ODSM) or running LDAP utilities from the command line. For example, a login attempt to ODSM returns the error:

Could not load the schema from the OID server. Closed the connection. Search Failed. Host='<OID3_HOSTNAME>' Details: [LDAP: error code 50 - Insufficient Access Rights]

This error is occurring for all users on this one node.

All troubleshooting suggestions found in MOS notes so far require a login as cn=orcladmin, so they have not been helpful.

Since it was evident that cn=orcladmin had lost all privileges, including on the root of the OID server, the node was removed from the Directory Replication Group (DRG) and reinstalled from scratch. However, attempts to re-add the node to the DRG now fail. The steps taken were:

1) Ran remtool -pverify on all four nodes and ensured that all checks were successful.
2) Ran remtool -paddnode to add the reinstalled node to the DRG (type 3, multimaster).
3) Set orclreplicastate on OID3 to 0 using ldapmodify, to trigger replication bootstrap.
4) Started the replication server on OID3 with oidctl.
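
Step 3 can be illustrated with a minimal sketch. The LDIF actually used is not shown in this note; the Python helper below only generates an LDIF that replaces orclreplicastate with 0 on the replica entry, assuming the entry DN follows the orclreplicaid=<replica ID>,cn=replication configuration form that appears in the remtool messages. The function name bootstrap_ldif and the replica ID oid3_replica are hypothetical.

```python
def bootstrap_ldif(replica_id: str) -> str:
    """Build LDIF that sets orclreplicastate to 0 (bootstrap) for one replica.

    Assumption: the replica entry lives at
    orclreplicaid=<replica_id>,cn=replication configuration.
    """
    # Reject values that would change the shape of the DN.
    if not replica_id or "," in replica_id or "=" in replica_id:
        raise ValueError("replica_id must be a bare replica identifier")
    lines = [
        f"dn: orclreplicaid={replica_id},cn=replication configuration",
        "changetype: modify",
        "replace: orclreplicastate",
        "orclreplicastate: 0",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "oid3_replica" is a hypothetical placeholder for the real replica ID.
    print(bootstrap_ldif("oid3_replica"), end="")
```

The generated text could be saved to a file and supplied to ldapmodify with its -f option, bound as a user permitted to modify the replication configuration on the target node.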

The oidrepld log shows the error:

[2014-07-21T15:54:59+02:00] [OID] [NOTIFICATION:16] [] [OIDREPLD] [host: <OID3_HOSTNAME>] [pid: <PID>] [tid: <TID>] Main:: Starting OIDREPLD against <OID3_HOSTNAME>:<OID_PORT>...
[2014-07-21T15:55:06+02:00] [OID] [ERROR:8] [23005] [OIDREPLD] [host: <OID3_HOSTNAME>] [pid:<PID>] [tid:<TID>] Main:: gslmrplMain: No replication agreement to process
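
When the oidrepld log is long, the error entries can be picked out programmatically. The following is a minimal sketch, assuming every entry uses the bracketed field layout of the two lines above (timestamp, component, level, error code, module, host, pid, tid, message). The names parse_line and error_entries are hypothetical, and the host name, port, pid, and tid in SAMPLE are placeholders standing in for the masked values.

```python
import re
from typing import Dict, Iterable, Iterator, Optional

# One bracketed field per group, in the order shown in the log excerpt.
ODL_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]*)\] "
    r"\[(?P<component>[^\]]*)\] "
    r"\[(?P<level>[A-Z_]+)(?::(?P<sublevel>\d+))?\] "
    r"\[(?P<code>[^\]]*)\] "
    r"\[(?P<module>[^\]]*)\] "
    r"\[host: ?(?P<host>[^\]]*)\] "
    r"\[pid: ?(?P<pid>[^\]]*)\] "
    r"\[tid: ?(?P<tid>[^\]]*)\] "
    r"(?P<message>.*)$"
)


def parse_line(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the fields of one log line, or None if the layout differs."""
    match = ODL_LINE.match(line.strip())
    return match.groupdict() if match else None


def error_entries(lines: Iterable[str]) -> Iterator[Dict[str, Optional[str]]]:
    """Yield parsed entries whose level is ERROR."""
    for line in lines:
        entry = parse_line(line)
        if entry is not None and entry["level"] == "ERROR":
            yield entry


# Placeholder values replace the masked host, port, pid, and tid.
SAMPLE = """\
[2014-07-21T15:54:59+02:00] [OID] [NOTIFICATION:16] [] [OIDREPLD] [host: oid3.example.com] [pid: 1234] [tid: 0] Main:: Starting OIDREPLD against oid3.example.com:3060...
[2014-07-21T15:55:06+02:00] [OID] [ERROR:8] [23005] [OIDREPLD] [host: oid3.example.com] [pid:1234] [tid:0] Main:: gslmrplMain: No replication agreement to process
"""

if __name__ == "__main__":
    for entry in error_entries(SAMPLE.splitlines()):
        print(entry["timestamp"], entry["code"], entry["message"])
```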

The spooled remtool.log from the remtool -paddnode run (step 2 above) shows:

Error occurred while adding partial replica ldap://<OID3_HOSTNAME>:<OID_SSL_PORT>.
ldap://<OID1_HOSTNAME>:<OID_SSL_PORT> : Failed to modify entry orclagreementid=<ID>,orclreplicaid=<REPLICA_ID1>,cn=replication configuration.
Type or value exists. Attribute orcllastappliedchangenumber is single valued.

Now all three active nodes show failures when running remtool -pverify. For example, on the primary node OID1 (<OID1_HOSTNAME>):

    Test: lastAppliedChangeNumber (<REPLICA_ID1> to <REPLICA_ID3>)
      Check: Format (transport) failed
      Check: Format (apply) failed

The same test failures occur on OID2 (<OID2_HOSTNAME>) and OID4 (<OID4_HOSTNAME>).
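
To compare the results across nodes, the failed checks in spooled remtool -pverify output can be grouped by test. This is a minimal sketch, assuming failures are reported as in the excerpt above: a "Test:" line followed by "Check: ... failed" lines. The wording of successful checks is not shown in this note, so only lines ending in "failed" are matched. The function name failed_checks and the replica IDs in SAMPLE are hypothetical.

```python
import re
from typing import Dict, List

TEST_RE = re.compile(r"^\s*Test:\s*(?P<name>.+?)\s*$")
# Only failed checks are matched; the format of passing checks is unknown here.
FAILED_CHECK_RE = re.compile(r"^\s*Check:\s*(?P<name>.+?)\s+failed\s*$")


def failed_checks(output: str) -> Dict[str, List[str]]:
    """Map each test name to the list of its failed checks."""
    failures: Dict[str, List[str]] = {}
    current_test = "(no test header)"
    for line in output.splitlines():
        test = TEST_RE.match(line)
        if test:
            current_test = test.group("name")
            continue
        check = FAILED_CHECK_RE.match(line)
        if check:
            failures.setdefault(current_test, []).append(check.group("name"))
    return failures


# Replica IDs below are placeholders for the masked values.
SAMPLE = """\
    Test: lastAppliedChangeNumber (oid1_replica to oid3_replica)
      Check: Format (transport) failed
      Check: Format (apply) failed
"""

if __name__ == "__main__":
    for test_name, checks in failed_checks(SAMPLE).items():
        print(test_name, "->", ", ".join(checks))
```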

Finally, the OID3 node cannot be removed with remtool -pdelnode either:

[oracle@oid3 repl_delnode]  > $ORACLE_HOME/ldap/bin/remtool -pdelnode -v -bind <OID3_HOSTNAME>:<OID_SSL_PORT>
Enter replication dn password                :
------------------------------------------------------------------------------
Directory Replication Group (DRG) details :

--- ------------------ ----------------------- ----------------------- -----
Sl  Replicaid          Directory Information   Supplier Information    Repl.
No.                                                                    Type
--- ------------------ ----------------------- ----------------------- -----
001 <REPLICA_ID3>           <OID3_HOSTNAME>:<OID_SSL_PORT>                --                                 RW

002 <REPLICA_ID4>          <OID4_HOSTNAME>:<OID_NONSSL_PORT>      <REPLICA_ID2>                RW
                                         <OID1_HOSTNAME>

003 <REPLICA_ID1>           <OID1_HOSTNAME>:<OID_SSL_PORT>            <REPLICA_ID4>                RW
                                         <OID2_HOSTNAME>

004 <REPLICA_ID2>           <OID2_HOSTNAME>:<OID_SSL_PORT>            <REPLICA_ID4>                RW
                                         <OID1_HOSTNAME>

--- ------------------ ----------------------- ----------------------- -----
Enter replicaid of the replica to be deleted : <REPLICA_ID3>

------------------------------------------------------------------------------
------------------------------------------------------------------------------
Error occurred while deleting partial replica ldap://<OID3_HOSTNAME>:<OID_SSL_PORT>(<REPLICA_ID3>).
ldap://<OID3_HOSTNAME>:<OID_SSL_PORT> : Replica <REPLICA_ID3> is master replica and cannot be deleted.

 

Changes

Applied ACL changes, which failed to replicate to one node (OID3).

Cause


