High Number of LWPs in AService or Imapd Processes
(Doc ID 1346043.1)
Last updated on JANUARY 03, 2023
Applies to:
Oracle Communications Messaging Server - Version 5.2.0 and later
Information in this document applies to any platform.
Symptoms
The AService process ran at maxthreads (250) all day; normally it runs at only around 40.
Response time for all traffic through the MMP may begin to increase.
The amount of CPU used by the AService process (as shown by prstat or top) increases, and vmstat shows that it is mostly system (kernel) mode CPU rather than user mode.
prstat -L shows that LWP 1 of the AService process is consuming a share of CPU equal to 100% of one virtual processor.
For example, on a T5220 with 1 CPU, 8 cores, and 8 threads per core (64 virtual processors, i.e. psrinfo | wc -l == 64), with each zone running an MMP handling about 10,000 simultaneous connections, on one of those zones:
prstat output:
Note that other system calls may occur more often, but it is poll in the main/dispatch thread that is using all the CPU, and top shows that it is all in system mode.
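Where the original command output is not reproduced, a minimal diagnostic sequence along these lines can confirm the pattern. This is only a sketch: the process name passed to pgrep, the sampling intervals, and the use of truss are assumptions, not steps taken from this note.

# Find the AService process id (assumes the process name is AService)
PID=$(pgrep AService)

# Per-LWP CPU usage, sampled every 5 seconds; on a 64-vCPU system one fully
# busy LWP appears as roughly 100/64 = ~1.6% in the CPU column
prstat -L -p $PID 5

# Overall CPU split; a high "sy" column relative to "us" confirms that most
# of the time is being spent in kernel (system) mode
vmstat 5

# Count system calls made by the process (interrupt with Ctrl-C to get the
# summary); poll dominating the time column matches the symptom above
truss -c -p $PID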
Difference between imapd and AService
This problem can occur in the imapd process going back to version 5.0 (and probably Netscape Messaging Server 4.x).
It did not occur in the AService process (or did not look exactly like the above) until Messaging Server 7 Update 2, when the MMP was changed to use synchronous processing for some functions, such as DNS lookups and initiating connections to the backend store systems. That change did not cause the problem; the same scalability issue has existed in both imapd and AService all along, but the symptoms in the MMP did not look the same as in imapd until after 7u2.
Changes
The problem may appear following an outage. For example, if all of the backend message store systems are restarted at the same time, all of the clients try to reconnect at once. The burst of roughly simultaneous connection attempts creates much more load through the MMP, which may cause it to exhibit these symptoms. The same scenario can occur after a network outage is corrected. (See also Doc ID 1277288.1.)
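A rough way to watch for such a reconnect storm on the MMP host is to count established client connections on the proxied ports. This is a sketch only; ports 143 (IMAP) and 110 (POP) are assumed defaults, so substitute whatever ports the MMP actually listens on.

# Count established IMAP/POP connections to this host
netstat -an | grep ESTABLISHED | egrep -c '\.143 |\.110 '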
Cause