Imapd Core Caused Ims_master Race Condition and Queue Buildup
Last updated on MAY 30, 2018
Applies to: Oracle Communications Messaging Server - Version 8.0.2 to 8.0.2 [Release 8.0.0]
Information in this document applies to any platform.
An imapd core dump left the ims-ms channel completely blocked because watcher did not also restart the ims_master processes. A store-related core should trigger the necessary action for ims_master (and tcp_lmtp_server) as well. As it stands, a bad problem (the imapd core) turns into a worse one: even after automatic recovery from the imapd core, delivery stays blocked until someone intervenes manually.
A few minutes after the core dump, an alert was received about the queue. The ims-ms channel had not performed any deliveries since the core and there were 21,000 messages in the queue.
All the ims_master processes had been started before the core dump. A pstack on one of them showed it was blocked in mboxlist_close. A pkill -9 on ims_master, as expected, caused another restart, which seemed to be needed. Unfortunately, job_controller was able to start more ims_master processes before watcher stopped stored, so the result was exactly the same situation again.
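The pstack check described above can be repeated across every ims_master process rather than one at a time. The following is a minimal sketch, not a supported tool: the function name and the STACK_CMD override are hypothetical, and it assumes pstack and pgrep are available on the host.

```shell
# Print each PID whose stack trace mentions mboxlist_close.
# STACK_CMD is a hypothetical override (useful for testing); it defaults to pstack.
blocked_in_mboxlist_close() {
    stack_cmd=${STACK_CMD:-pstack}
    for pid in "$@"; do
        # Dump the stack for this PID and look for the blocking frame.
        if "$stack_cmd" "$pid" 2>/dev/null | grep -q 'mboxlist_close'; then
            echo "$pid"
        fi
    done
}

# Example usage (assumes pgrep is installed):
#   blocked_in_mboxlist_close $(pgrep -x ims_master)
```

If every ims_master PID is printed, all delivery workers are stuck at the same frame, which matches the situation described here.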
It was finally cleared by doing:
1. imsimta stop job_controller
2. stop-msg store
3. pkill -9 ims_master
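The order of the steps above matters: job_controller is stopped first so that it cannot start new ims_master processes while the store is going down. A minimal sketch of the sequence as a script follows. The run helper, the DRY_RUN variable, and the function name are hypothetical additions for illustration; only the three commands come from this note, and any later restart steps are not shown here and are left out.

```shell
# DRY_RUN=1 (the default in this sketch, chosen for safety) prints each
# command instead of executing it. Set DRY_RUN=0 to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

clear_blocked_delivery() {
    # 1. Stop job_controller so no new ims_master processes are started.
    run imsimta stop job_controller
    # 2. Stop the message store.
    run stop-msg store
    # 3. Kill the ims_master processes that are still blocked.
    run pkill -9 ims_master
}

# Example usage:
#   clear_blocked_delivery            # dry run, prints the three commands
#   DRY_RUN=0 clear_blocked_delivery  # executes them
```

Running it in dry-run mode first gives a chance to confirm the sequence before touching a production system.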