Connections to Dispatcher Hang and May Be in CLOSE_WAIT (Doc ID 1581632.1)

Last updated on JUNE 29, 2017

Applies to:

Oracle Communications Messaging Server - Version 5.2.0 and later
All Platforms

Symptoms

On Solaris 11 update 1, during SMTP load testing, the dispatcher accumulates TCP socket file descriptors in the CLOSE_WAIT state until the count reaches MAX_HANDOFFS, at which point it stops accepting connections. Restarting the dispatcher clears the situation only briefly.
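
One way to watch for this buildup is to tally socket states from captured netstat output. The following is a minimal sketch, not a supported tool: the function name count_states is hypothetical, and it assumes Solaris-style "netstat -an -P tcp" lines in which the first column is the local address written as host.port and the last column is the TCP state.

```python
from collections import Counter

# TCP state names as they are assumed to appear in the last netstat column.
TCP_STATES = {
    "LISTEN", "SYN_SENT", "SYN_RCVD", "ESTABLISHED", "FIN_WAIT_1",
    "FIN_WAIT_2", "CLOSE_WAIT", "CLOSING", "LAST_ACK", "TIME_WAIT",
    "IDLE", "BOUND", "CLOSED",
}


def count_states(netstat_text, local_port=None):
    """Tally TCP states in netstat output (hypothetical helper).

    Assumes the first field is the local address in host.port form and
    the last field is the state. Header and separator lines are skipped
    because their last field is not a known state name.
    """
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[-1] not in TCP_STATES:
            continue
        # The port is whatever follows the final dot of the local address.
        port = fields[0].rsplit(".", 1)[-1]
        if local_port is not None and port != str(local_port):
            continue
        counts[fields[-1]] += 1
    return counts
```

Calling count_states(text, local_port=25) on successive captures gives a CLOSE_WAIT count for the SMTP port that can be compared against the configured MAX_HANDOFFS value.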

The client connects and never receives the banner. A normal client would time out and retry. End users may get a "connection failed" type error. Remote MTAs may see this as poor performance.

A load test program that does not time out will see this as a hung connection or a hung test thread.
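
To keep a test client from hanging when the banner never arrives, the connection can be given a timeout. The sketch below is an illustration only: read_banner is a hypothetical helper, the 5-second default is an arbitrary assumption, and it reads a single chunk rather than parsing a full multi-line SMTP greeting.

```python
import socket


def read_banner(host, port, timeout=5.0):
    """Connect and return the first data received, or None on failure.

    The timeout passed to create_connection() also applies to the later
    recv() call, so a server that accepts the connection but never sends
    its banner yields None instead of blocking the caller forever.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            data = sock.recv(512)
    except OSError:
        # Covers timeouts, refused connections, and resets.
        return None
    text = data.decode("ascii", errors="replace").rstrip()
    return text or None
```

A load test thread that gets None back can record the attempt as a failed connection and retry, which is how a normal client would behave.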

Watching the dispatcher with truss, or enabling debugging, slows it down enough to prevent the problem from happening.

Watching the dispatcher with truss -t sendmsg may show sendmsg() failing with errno=11 (EAGAIN), provided the system is fast enough and truss does not slow the dispatcher down too much.

Watching the dispatcher with dtrace -n 'syscall::sendmsg:return/arg0/{printf("pid=%d errno=%d\n",pid,errno);}' -p <dispatcher-pid> shows sendmsg() failing with errno=11 (EAGAIN).
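
For readers unfamiliar with this error, the sketch below shows the general condition under which sendmsg() on a non-blocking socket fails with EAGAIN: the socket's send buffer is full because the peer is not reading. It is a generic illustration on a local socket pair, with a hypothetical function name, and makes no claim about what the dispatcher itself does internally.

```python
import errno
import socket


def fill_until_eagain(chunk_size=65536, max_chunks=10000):
    """Send on a non-blocking socket until the kernel refuses more data.

    Returns (bytes_sent, errno_value). The peer end is never read, so the
    send buffer eventually fills and sendmsg() raises BlockingIOError.
    errno_value is None if the buffer never filled within max_chunks.
    """
    sender, receiver = socket.socketpair()
    sender.setblocking(False)
    sent = 0
    try:
        for _ in range(max_chunks):
            try:
                sent += sender.sendmsg([b"x" * chunk_size])
            except BlockingIOError as exc:
                return sent, exc.errno
        return sent, None
    finally:
        sender.close()
        receiver.close()
```

A caller that receives EAGAIN is expected to retry once the socket becomes writable again, rather than treat the failure as permanent.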

Changes

Running on Solaris 11 update 1.

Solaris 11 became a "supported platform" as of Communications Suite 7.0.5 (which contains Messaging Server 7.0.5.29).

Solaris 11 will never be a supported platform for Messaging Server 5.2. However, it was possible to run Messaging Server 5.2 on Solaris 10, and this problem could happen in that scenario. Note: Messaging Server 5.2 is long past End of Support Life. This problem was originally observed with Messaging Server 7.0.5.28 running in a Solaris 10 Branded Zone on Solaris 11.

Cause
