OID 11g and 12c Connection Timeouts, Crash or Hang - gsluslldConnect: Failed to connect <HOSTNAME>:<PORT> / OID Continuously Pings a Deprecated Host Leading to Crashes
(Doc ID 1607479.1)
Last updated on AUGUST 30, 2023
Applies to:
Oracle Internet Directory - Version 11.1.1.7.0 and later
Information in this document applies to any platform.
Symptoms
Oracle Internet Directory (OID) 11g or 12c periodically becomes unresponsive, will not accept modifications, hangs, or occasionally creates stack dumps and run-away log files. Symptoms include client applications reporting "Can't contact LDAP server".
The following errors occur repeatedly in the oidmon-0000.log:
[2013-10-22T11:19:43-05:00] [OID] [NOTIFICATION:16] [] [OIDMON] [host: <OID_HOSTNAME>] [pid:<PID>] [tid: xx] Guardian: Warning: Node=<BADHOSTNAME> is not responding .., last update Time=20130511020119Z
[2013-10-22T11:20:59-05:00] [OID] [NOTIFICATION:16] [] [OIDMON] [host: <OID_HOSTNAME>] [pid:<PID>] [tid: xx] Guardian: Warning: Node=<BADHOSTNAME> is not responding .., last update Time=20130511020119Z
..
[2013-10-22T12:17:50-05:00] [OID] [NOTIFICATION:16] [] [OIDMON] [host: <OID_HOSTNAME>] [pid:<PID>] [tid: xx] Guardian: gslsgldConnect: Failed to connect <BADHOSTNAME>:<PORT>
[2013-10-22T12:17:50-05:00] [OID] [NOTIFICATION:16] [] [OIDMON] [host: <OID_HOSTNAME>] [pid:<PID>] [tid: xx] Guardian: gslsgrnPing: Failed to connect to OID host:<BADHOSTNAME> port:<PORT> ssl:0
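To gauge how often each node appears in these messages, and how long ago it last checked in, the oidmon log can be summarized with a short script. The following is a minimal Python sketch, not an Oracle-supplied tool: the function name `summarize_guardian_errors` and the regular expressions are assumptions modeled only on the message formats shown above.

```python
import re
from collections import defaultdict
from datetime import datetime, timezone

# Patterns are assumptions derived from the sample oidmon lines in this note.
WARN = re.compile(
    r"^\[(?P<ts>[^\]]+)\].*Guardian: Warning: Node=(?P<node>\S+) is not responding"
    r".*last update Time=(?P<last>\d{14})Z"
)
FAIL = re.compile(r"Failed to connect (?:to OID host:)?(?P<node>[^\s:]+)")


def summarize_guardian_errors(lines):
    """Count Guardian warnings and connect failures per node.

    stale_days is the whole-day gap between the log timestamp and the
    node's 'last update Time' (taken from the most recent warning seen).
    """
    summary = defaultdict(
        lambda: {"warnings": 0, "connect_failures": 0, "stale_days": None}
    )
    for line in lines:
        m = WARN.search(line)
        if m:
            entry = summary[m["node"]]
            entry["warnings"] += 1
            logged = datetime.fromisoformat(m["ts"])  # offset-aware timestamp
            last = datetime.strptime(m["last"], "%Y%m%d%H%M%S").replace(
                tzinfo=timezone.utc  # trailing 'Z' means UTC
            )
            entry["stale_days"] = (logged - last).days
            continue
        m = FAIL.search(line)
        if m:
            summary[m["node"]]["connect_failures"] += 1
    return dict(summary)
```

Applied to the warning lines above, the gap between the log timestamp (2013-10-22) and `last update Time` (2013-05-11) is 164 days, which suggests a node that was retired long ago rather than one that is briefly unavailable.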
Occasionally OID may crash with a run-away log condition and produce stack dumps. For example:
oidldapd01s<PID>-0000.log:
----------------------------------
gslusrnChkForNewNodes:Adding new node: host=<OIDHOSTNAME2>, ipaddr=, port=<PORT>
gslusrnChkForNewNodes:Adding new node: host=<BADHOSTNAME>, ipaddr=, port=<PORT>
gsluslldConnect: Failed to connect <BADHOSTNAME>:<PORT>
gsluslldConnect: Failed to connect <BADHOSTNAME>:<PORT>
gsluslldConnect: Failed to connect <BADHOSTNAME>:<PORT>
gsluslldConnect: Failed to connect <BADHOSTNAME>:<PORT>
gslusrnChkConn2RemNode:Connect to OID on host:<BADHOSTNAME> failed
oidldapd_stack00_<PID>.dmp:
-----------------------------------------
calling caller argument values in hex
location (? means dubious value)
--------------------------------------------------------------------
gslusdsDumpStack() _ZN2os4Hpux15chained_handlerEiP9__siginfoPv()+560
_ZN2os4Hpux15chained_handlerEiP9__siginfoPv() _ZN2os4Hpux22JVM_handle_hpux_signalEiP9__siginfoPvi()+960
_ZN2os4Hpux22JVM_handle_hpux_signalEiP9__siginfoPvi() _ZN2os4Hpux13signalHandlerEiP9__siginfoPv()+128
_ZN2os4Hpux13signalHandlerEiP9__siginfoPv() <kernel>
<kernel> _poll_sys()+48
_poll_sys() _poll()+224
_poll() sgslufWriteEnabled()+192
sgslufWriteEnabled() sgslunvWriteEnabled()+384
sgslunvWriteEnabled() sgslunnNewCtn()+2416
sgslunnNewCtn() gslcoic_ConnectToHost()+1168
gslcoic_ConnectToHost() gslcopc_OpenLdapConnection()+1456
gslcopc_OpenLdapConnection() gslconn_NewConnection()+528
gslconn_NewConnection() gslcopd_LdapOpenDefConn()+304
gslcopd_LdapOpenDefConn() gslcopo_LdapOpen()+208
gslcopo_LdapOpen() ldap_open()+176
ldap_open() gsluslldConnect()+256 <<-----Crashing function from OID Server log
gsluslldConnect() gslusrnConnect2RemNode()+1120
gslusrnConnect2RemNode() gslusrnChkConn2RemNode()+144
gslusrnChkConn2RemNode() gslarscNotifyController()+176
gslarscNotifyController() __pthread_bound_body()+400
Or with heavy trace debugging enabled, messages like:
oidldapd01s<PID>-0000.log:
----------------------------------
[2013-10-22T12:18:06-05:00] [OID] [NOTIFICATION:16] [] [OIDLDAPD] [host: <OID_HOSTNAME>] [pid: <PID>] [tid: xx] [ecid:<ECID>] ServerWorker (REG):[[
BEGIN
ConnID:4512 mesgID:475092 OpID:1 OpName:bind ConnIP:<IP_ADDRESS> ConnDN:cn=<USERNAME>,cn=application service accounts,cn=users,dc=<COMPANY NAME>,dc=com
gslusldSendExtOp: LDAP result failed rc:81 host:<BADHOSTNAME> port:<PORT>
END
]]
[2013-10-22T12:18:06-05:00] [OID] [NOTIFICATION:16] [] [OIDLDAPD] [host: <OID_HOSTNAME>] [pid: <PID>] [tid: xx] [ecid:<ECID>] ServerWorker (REG):[[
BEGIN
ConnID:4512 mesgID:475092 OpID:1 OpName:bind ConnIP:<IP_ADDRESS> ConnDN:cn=<USERNAME>,cn=application service accounts,cn=users,dc=<COMPANY NAME>,dc=com
gslusrnWriteToRemNodes: Notfy to host:<BADHOSTNAME> failed
END
]]
[2013-10-22T12:18:06-05:00] [OID] [NOTIFICATION:16] [] [OIDLDAPD] [host: <OID_HOSTNAME>][pid: <PID>] [tid: xx] [ecid:<ECID>] ServerWorker (REG):[[
BEGIN
ConnID:4512 mesgID:475092 OpID:1 OpName:bind ConnIP:<IP_ADDRESS> ConnDN:cn=<USERNAME>,cn=application service accounts,cn=users,dc=<COMPANY NAME>,dc=com
sgslufread: Hard error on read, OS error = 32
END
]]
This also causes client-side errors, such as connection errors and response read timeouts, in applications like OAM. For example:
LDAP Error 1 : LDAP response read timed out, timeout used:30000ms.
Core files may be generated in addition to, or instead of, OID stack logs (and the oidldapd logs may show no errors). A stack trace from a core file can show:
...
Core was generated by `oidldapd '.
Program terminated with signal 11, Segmentation fault.
#0 0x00007ffd0ba522a7 in ?? ()
"<path>/core.<ldap_server_pid>" is a core file.
Another potential stack reported (from internal/unpublished/duplicate Bug 31710789 - dr oid core dumping for ldapmodify commands):
...
#4 0x000000000041a334 in gslusltOldLogFunction ()
#5 0x000000000042d908 in gslutcTrace ()
#6 0x0000000000546fe3 in gslusrnWriteToRemNodes ()
#7 0x0000000000551dc1 in gslecDelAttrValFromRS ()
#8 0x00000000005519a6 in gslecDeleteRS ()
#9 0x00000000005bdd84 in gslsbmModify ()
#10 0x00000000005a9158 in gslfmeADoModify ()
#11 0x00000000004b8945 in gslarswWorker ()
#12 0x0000003b6800683d in start_thread () from /lib64/libpthread.so.0
#13 0x0000003b670d518d in clone () from /lib64/libc.so.6
Or similar:
gslusdsDumpStack()
__restore_rt()
gslusrnWriteToRemNodes()+461
gslecDelAttrValFromRS()+353
gslecDeleteRS()+550
gslsbmModify()+7538
gslfmeADoModify()+3249
gslarswWorker()+6456
start_thread()+209
clone()+109
Changes
OID has been started with different hostnames at various times. Some of these hostnames are no longer resolvable or are not responding.
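To tell these two conditions apart for a given hostname, a quick check can first attempt name resolution and then a TCP connection with a short timeout. The following is a minimal Python sketch using only the standard `socket` module: the function name `classify_host`, the 3-second timeout, and the injectable `resolve` and `connect` parameters (included so the logic can be exercised without a network) are assumptions for illustration, and the host and port must be supplied from your own environment.

```python
import socket


def classify_host(host, port, timeout=3.0,
                  resolve=socket.getaddrinfo,
                  connect=socket.create_connection):
    """Return 'unresolvable', 'not responding', or 'ok' for host:port."""
    try:
        resolve(host, port)  # name lookup only
    except socket.gaierror:
        return "unresolvable"
    try:
        conn = connect((host, port), timeout=timeout)  # plain TCP connect
    except OSError:  # refused, timed out, unreachable, ...
        return "not responding"
    conn.close()
    return "ok"
```

A call such as `classify_host("<BADHOSTNAME>", <PORT>)`, with the placeholders replaced by a node name and port taken from the log messages, returns one of the three strings. A successful TCP connection shows only that something is listening on that port, not that it is a healthy OID instance.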
Cause