Oracle VM Server High Load Average (> 2000) After Removing a LUN From Storage

(Doc ID 2408906.1)

Last updated on JUNE 13, 2018

Applies to:

Oracle VM - Version 3.3.4 and later
Information in this document applies to any platform.

Symptoms

The load average on an Oracle VM Server stays very high, for example above 2400:
top - 12:58:47 up 394 days, 22:21, 2 users, load average: 2452.28, 2451.72, 2451.42

More than 2000 processes are running the command: /sbin/kpartx -a -p p /dev/dm-19

All of them are in state "D" (uninterruptible sleep), which Linux counts toward the load average.

# ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 19408 1260 ? Ss 2017 0:01 /sbin/init
...
root 321 0.0 0.0 15064 840 ? D< May13 0:00 /sbin/kpartx -a -p p /dev/dm-19
root 340 0.0 0.0 15064 844 ? D< May12 0:00 /sbin/kpartx -a -p p /dev/dm-19
root 350 0.0 0.0 15064 848 ? D< May13 0:00 /sbin/kpartx -a -p p /dev/dm-19
root 351 0.0 0.0 15064 844 ? D< May11 0:00 /sbin/kpartx -a -p p /dev/dm-19
root 353 0.0 0.0 15064 848 ? D< May13 0:00 /sbin/kpartx -a -p p /dev/dm-19
...
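
To confirm this symptom without scrolling through the full process list, the stuck kpartx processes can be counted. The following is a minimal sketch and not part of the original note: the function name count_d_kpartx is hypothetical, and it assumes input lines of the form "STAT COMMAND ARGS...".

```shell
# Hypothetical helper (not from the original note).
# Reads lines of the form "STAT COMMAND ARGS..." on stdin and prints how many
# of them are kpartx processes whose state starts with "D" (uninterruptible sleep).
count_d_kpartx() {
    awk '$1 ~ /^D/ && $2 ~ /(^|\/)kpartx$/ { n++ } END { print n + 0 }'
}
```

On a live server the function could be fed with: ps -eo stat= -o args= | count_d_kpartx
A result close to the load average value would suggest that the load comes from these blocked processes rather than from CPU usage.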

We see repeating entries like the following in the /var/log/messages file:

May 13 05:48:36 MYSYSTEM multipathd: 65:32: mark as failed
May 13 05:48:36 MYSYSTEM multipathd: 360060e8012505f005040505f00000092: Entering recovery mode: max_retries=10
May 13 05:48:36 MYSYSTEM multipathd: 360060e8012505f005040505f00000092: remaining active paths: 0
May 13 05:48:36 MYSYSTEM multipathd: 360060e8012505f005040505f00000092: Entering recovery mode: max_retries=10
May 13 05:48:38 MYSYSTEM multipathd: 360060e8012505f005040505f00000092: sdbh - tur checker reports path is up
May 13 05:48:38 MYSYSTEM multipathd: 67:176: reinstated
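
The log entries above can also be used to list which multipath devices have lost all of their paths. The following is a minimal sketch under stated assumptions: the function name wwids_without_paths is hypothetical, and it assumes the syslog layout shown above, where the WWID is the sixth whitespace-separated field.

```shell
# Hypothetical helper (not from the original note).
# Reads syslog lines on stdin and prints each WWID for which multipathd
# reported "remaining active paths: 0", once per WWID.
# Assumes the layout "Mon DD HH:MM:SS HOST multipathd: WWID: message".
wwids_without_paths() {
    awk '/multipathd.*remaining active paths: 0[[:space:]]*$/ {
        w = $6
        sub(/:$/, "", w)   # drop the trailing colon after the WWID
        print w
    }' | sort -u
}
```

On a live server the function could be fed with: wwids_without_paths < /var/log/messages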

Changes

LUN 360060e8012505f005040505f00000092 was removed from the storage array without first being removed in Oracle VM Manager.

Cause
