OID 10.1.4 Crashes Due to SASL Binds / Is Unstable / Ldapbinds Fail Intermittently With sgslufread: Hard error on read, OS error = 131 / ldap_bind: Can't contact LDAP server (Doc ID 790897.1)

Last updated on FEBRUARY 06, 2017

Applies to:

Oracle Internet Directory - Version 10.1.4.0.1 to 10.1.4.3 [Release 10gR3]
Information in this document applies to any platform.
***Checked for relevance on 08-Mar-2013***


Symptoms

Oracle Internet Directory (OID) 10.1.4.0.1 to 10.1.4.3.0 is running with the standard/default OID Database version 10.1.0.5.0 on a 64-bit platform, i.e., Sun Solaris SPARC (64-bit).

Binds to the OID non-SSL port intermittently fail with errors such as:

$ ldapbind -p <oid_nonssl_port>
ldap_bind: Can't contact LDAP server

$ ldapbind -p <oid_nonssl_port>
sgslufread: Hard error on read, OS error = 131
ldap_bind: Can't contact LDAP server
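
Because the failures are intermittent, a single ldapbind may succeed. A minimal sketch for measuring how often a command fails is shown below. The function name count_failures is hypothetical, and the sketch assumes that ldapbind returns a non-zero exit status when the bind fails.

```shell
# count_failures N cmd [args...]
# Hypothetical helper: run the given command N times and print how many
# runs ended with a non-zero exit status. Command output is discarded.
count_failures() {
    attempts="$1"
    shift
    failures=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # "$@" is the command under test, e.g. ldapbind with its options
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}
```

For example, calling count_failures 50 ldapbind -p followed by the actual non-SSL port number would print the number of failed binds out of 50 attempts.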


The OID server crashes, becomes unstable, and/or fails to start, with the following errors in the OID log:

ProcessDispatcher:1 * Starting OIDLDAP Server,PID=3333
ProcessDispatcher:1 * sgsluscSendPort: sendmsg failed, OS ERROR = 32
ProcessDispatcher:1 * send port for server PID : 3143 failed
ProcessDispatcher:1 * sgsluscSendPort: sendmsg failed, OS ERROR = 32
ProcessDispatcher:1 * sgslupsCheckProcess: sigsend to pid 3143 failed, os error = 3
ProcessDispatcher:1 * Server with PID = 3143 is not running
ProcessDispatcher:1 * Starting OIDLDAP Server,PID=3347


Or:

ProcessDispatcher:1 * sgslufread: Hard error on read, OS error = 131
ProcessDispatcher:1 * sgsluscSendPort: sendmsg failed, OS ERROR = 32
ProcessDispatcher:1 * sgslupsCheckProcess: sigsend to pid 26741 failed, os error = 3
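
On Solaris, these OS error numbers correspond to ECONNRESET (131, connection reset by peer), EPIPE (32, broken pipe), and ESRCH (3, no such process), which is consistent with the dispatcher trying to talk to a server process that has already died. A rough sketch for counting these signatures in a log file follows. The function name count_oid_errors is hypothetical, and the log file location is not given in this note, so it is passed in as an argument.

```shell
# count_oid_errors LOGFILE
# Hypothetical helper: for each error signature quoted in this note,
# print "<count><TAB><pattern>" for the given OID log file.
count_oid_errors() {
    log="$1"
    for pattern in \
        'sgslufread: Hard error on read, OS error = 131([^0-9]|$)' \
        'sgsluscSendPort: sendmsg failed, OS ERROR = 32([^0-9]|$)' \
        'sgslupsCheckProcess: sigsend to pid [0-9]+ failed, os error = 3([^0-9]|$)' \
        'Starting OIDLDAP Server'
    do
        # grep -c prints the number of matching lines (0 when none match)
        count=$(grep -E -c "$pattern" "$log")
        printf '%s\t%s\n' "$count" "$pattern"
    done
}
```

A high count of "Starting OIDLDAP Server" lines alongside the sendmsg and sigsend failures suggests that the dispatcher is repeatedly restarting a crashing server process.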

 

Further investigation shows that the problem is triggered by a running application that attempts simple SASL binds. The same bind from the command line also reproduces the problem, for example:

ldapbind -h <oid_hostname> -p <oid_nonssl_port> -D cn=orcladmin -w <password> -O auth -Y DIGEST-MD5

An OID stack trace file is generated, and its contents match existing <Bug 5709427> / base <Bug 5743928>, for example:

----- Call Stack Trace -----

calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
------------------   -------- -------------------- ----------------------------
, PC: [0xffffffff7c47716c, naemd5d()+76] 

sgslusz()+220        CALL     gslupsDumpProcStack  FFFFFFFF78DF1F78 ?
                              ()+0                 000000017 ? 10042A200 ?
                                                   FFFFFFFF78DF30A4 ?
                                                   FFFFFFFF78DF1568 ?
                                                   FFFFFFFF78DF1578 ?
__sighndlr()+12      PTR_CALL 0000000000000000     00000FC00 ? 10123E2A0 ?
                                                   1003EBCC0 ? 000000000 ?
                                                   10042A200 ? 00000000B ?
naemd5d()+76         PTR_CALL 0000000000000000     00000000B ? 000000000 ?
                                                   FFFFFFFF78DF3460 ?
                                                   10011D5C0 ? 000000000 ?
                                                   00000000A ?
naemd5m()+32         CALL     naemd5d()+0          000000040 ?
                                                   FFFFFFFF78E00000 ?
                                                   000000040 ?
                                                   FFFFFFFF78DF38A0 ?
                                                   000000000 ? 000000004 ?
...<snip>...


Database Merge <Patch 6494386>, which contains the bug fixes above, was then applied. With the patch in place, the ldapbinds no longer crash OID, but they dump core on the client side.

Cause
