Load Imbalance Between LMS's Due to Higher CPU in ib_ipoib & mlx4_ib on UEK1 and OFED 1.5 (Doc ID 2205196.1)

Last updated on NOVEMBER 30, 2016

Applies to:

Linux OS - Version Oracle Linux 5.5 with Unbreakable Enterprise Kernel [2.6.32] to Oracle Linux 5.11 with Unbreakable Enterprise Kernel [2.6.32] [Release OL5U5 to OL5U11]
Linux x86-64

Symptoms

On an Exadata X3-2 platform running kernel 2.6.32-400.1.1.el5uek and OFED 1.5.5, RAC performance issues occur when running the LTE benchmark on 2 nodes with 8 LMS processes on each.
Although client requests are distributed equally among all 8 LMS processes, one LMS process becomes busier than the others and turns into the performance bottleneck.
One LMS becomes over 90% busy, while the others are around 30% busy.
The busy LMS is not always the same process; it changes as the workload progresses.
The utime and stime columns in /proc/<pid>/stat reveal that the busy LMS (ospid 46154 in the following case) is spending a lot more time in the kernel than the other LMSs.
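
As a rough illustration of how these two columns can be read, the following Python sketch extracts utime (field 14) and stime (field 15) from a /proc/<pid>/stat line and converts them from clock ticks to seconds. It is not part of the original note; the function names parse_cpu_ticks and read_cpu_seconds are hypothetical, and the field positions are taken from the proc(5) manual page.

```python
import os


def parse_cpu_ticks(stat_line):
    """Return (utime, stime) in clock ticks from one /proc/<pid>/stat line."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain spaces
    # or parentheses, so only the text after the last ')' is split.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 14 (utime) is rest[11]
    # and field 15 (stime) is rest[12].
    return int(rest[11]), int(rest[12])


def read_cpu_seconds(pid):
    """Return (user_seconds, kernel_seconds) consumed so far by a process."""
    with open(f"/proc/{pid}/stat") as f:
        utime, stime = parse_cpu_ticks(f.read())
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second on this system
    return utime / hz, stime / hz
```

Calling read_cpu_seconds once per LMS ospid at two points in time and comparing the stime deltas would show which process is accumulating the most kernel time over that interval.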

 

Cause
