Sun Fire[TM] 15K/12K/E20K/E25K Servers: RCM Daemon hanging, causing DR operations to hang
Last updated on MAY 01, 2017
Applies to:
Sun Fire 15K Server - Version All Versions and later
Sun Fire E25K Server - Version All Versions and later
A remote Dynamic Reconfiguration (DR) operation (run from the System Controller) or a local DR operation (run from the domain) works normally until one DR operation stops responding. Messages like the following are then logged in the $SMSLOGGER/domain_Id/messages file:
DCA/DCS communication error
dca[...]-S(): [... ERR DCSInterface.cc 378] message receive failed: DCSInterface :: receiveResponse errCode:502
In some cases, it may not be possible to kill the associated process (cfgadm, rcfgadm, deleteboard, or showdevices).
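To check a copy of the messages file for the signature quoted above, a small script along the following lines may help. This is only a minimal sketch: the function name find_dcs_errors is hypothetical, and the pattern matches just the stable part of the message ("DCSInterface :: receiveResponse errCode:NNN"), because the process ID and other fields are elided in the excerpt.

```python
import re

# Stable part of the error message quoted in the log excerpt above.
# The elided fields ("...") are deliberately not matched.
DCS_ERROR = re.compile(r"DCSInterface\s*::\s*receiveResponse\s+errCode:\s*(\d+)")


def find_dcs_errors(lines):
    """Return (line_number, err_code) for every line carrying the signature.

    `lines` is any iterable of strings, for example an open file object.
    Line numbers start at 1.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        match = DCS_ERROR.search(line)
        if match:
            hits.append((number, int(match.group(1))))
    return hits
```

The function can be called with an open file handle for the domain's messages file (the path is site-specific and is left to the caller). A hit with errCode 502 corresponds to the symptom described here; it does not by itself confirm that rcm_daemon is the cause.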
Note: for background information on the DR mechanism, please refer to Document 1003582.1 (What Happens in a Sun Fire[TM] 15K/12K DR Slot0 Detach Operation).
Note: if none of the symptoms described below applies, and the rcm_daemon stack signature does not match the one described in this article, you are more likely facing a different issue. See the References section below for more troubleshooting steps.