Load Imbalance Between LMS's Due to Higher CPU in ib_ipoib & mlx4_ib on UEK1 and OFED 1.5 (Doc ID 2205196.1)

Last updated on FEBRUARY 01, 2019

Applies to:

Linux OS - Version Oracle Linux 5.5 with Unbreakable Enterprise Kernel [2.6.32] to Oracle Linux 5.11 with Unbreakable Enterprise Kernel [2.6.32] [Release OL5U5 to OL5U11]
Oracle Cloud Infrastructure - Version N/A and later
Linux x86-64

Symptoms

On an Exadata X3-2 platform running kernel 2.6.32-400.1.1.el5uek and OFED 1.5.5, RAC performance issues occur when the LTE benchmark is run on 2 nodes with 8 LMS processes on each node.
Although client requests are distributed equally among all 8 LMS processes, one LMS process becomes much busier than the others and turns into the performance bottleneck.
The busy LMS is over 90% busy, while the others are around 30% busy.
The busy LMS is not always the same process; it changes as the workload progresses.
The utime and stime columns in /proc/<pid>/stat show that the busy LMS (ospid 46154 in the following case) spends far more time in the kernel than the other LMS processes.

PID      Usr     Sys
46150   8794    7253
46154   6411   33716  <<<<
46158   8458    7450
46162   8848    7748
46166   8932    7470
46170   7754    7147
46174   8509    7158
46178   8831    8437
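
A comparison like the one above can be collected with a short script. The following is a minimal Python sketch, not part of the original note. It assumes the standard Linux /proc/<pid>/stat layout, in which utime and stime are fields 14 and 15 and are reported in clock ticks. The function names and the "2x the median" threshold are illustrative assumptions only.

```python
import statistics


def parse_stat_line(line):
    """Return (utime, stime) in clock ticks from one /proc/<pid>/stat line."""
    # Field 2 (comm) is parenthesised and may itself contain spaces,
    # so split on the last ')' before tokenising the remaining fields.
    rest = line[line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field 14 (utime) is rest[11]
    # and field 15 (stime) is rest[12].
    return int(rest[11]), int(rest[12])


def read_cpu_ticks(pid):
    """Read (utime, stime) for a live process from /proc."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_stat_line(f.read())


def find_sys_outliers(samples, factor=2.0):
    """Return the PIDs whose stime exceeds factor x the median stime.

    samples maps pid -> (utime, stime). The factor of 2.0 is an
    arbitrary illustrative threshold, not an Oracle recommendation.
    """
    if not samples:
        return []
    median_sys = statistics.median(s for _, s in samples.values())
    return sorted(pid for pid, (_, s) in samples.items()
                  if s > factor * median_sys)
```

To use it, build the samples dictionary by calling read_cpu_ticks() for each LMS ospid, then pass it to find_sys_outliers(). Because the busy LMS changes over time, sampling repeatedly and comparing the deltas between samples is likely to be more informative than a single reading.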

 

Cause


