Solaris 11 Format Command on Guest LDom May Hang When LUNs Are Under MPxIO in Control Domain
(Doc ID 2233596.1)
Last updated on JULY 12, 2020
Applies to: Solaris Operating System - Version 11.2 and later
Information in this document applies to any platform.
1) A Solaris guest LDom may hang while issuing a SCSI Persistent Reserve Out (PROUT) command to a storage target.
LDom software is known to use PROUT, and the hang appears to be caused by LDom commands that use SCSI-3 reservations.
When the issue occurs, I/O is found to be blocked in the SCSI driver layer.
2) To troubleshoot this problem, generate a crash dump from one of the hanging guest domains.
On the control domain, run:
# ldm panic [Guest Domain Name]
Then gather a crash dump from the control domain with:
# reboot -d
Once the control domain and the guest are back up and running, please send in the two crash dump files.
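The order of the two commands in step 2 matters: the guest is panicked first, then the control domain. As a minimal dry-run sketch, the helper below only prints the two commands in that order and executes nothing. The function name plan_crashdumps and the guest name ldg1 are hypothetical and used only for illustration.

```shell
#!/bin/sh
# Dry-run sketch: print the crash dump collection commands in order.
# Nothing is executed here. plan_crashdumps is a hypothetical helper name.
plan_crashdumps() {
    guest="$1"
    if [ -z "$guest" ]; then
        echo "usage: plan_crashdumps <guest-domain-name>" >&2
        return 1
    fi
    # Step 1: panic the hung guest domain (run on the control domain).
    echo "ldm panic $guest"
    # Step 2: force a crash dump of the control domain and reboot it.
    echo "reboot -d"
}
```

For example, `plan_crashdumps ldg1` prints the command sequence for a guest named ldg1, which can be reviewed before running each command by hand on the control domain.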
3) Here is an example of the problem: a T5-2 SPARC control domain server running Solaris 11.3 SRU 5.6.0,
with Oracle Emulex FC HBAs connected to a SAN, accessing an IBM disk storage array and an EMC Symmetrix array,
with all LUNs under MPxIO control.
The guest LDoms access the IBM and EMC LUNs as vdisks served by the control domain.
For no apparent reason, and with no other errors reported in the messages file or by fmdump, some of the guest LDoms began hanging.
In addition, on the control domain, format hangs, and other storage-related commands such as mpathadm hang or run slowly.
The customer panicked the control domain (reboot -d) because of slow applications and the hung guest LDoms.
A Solaris crash dump was collected, and it shows many busy devices and threads waiting in biowait.
The act output of the crash dump shows several threads stuck in scsi_vhci PROUT (PGR reservation) processing.
This is enough to match the crash to bug 21799369.