Load Imbalance Between LMS's Due to Higher CPU in ib_ipoib & mlx4_ib on UEK1 and OFED 1.5
(Doc ID 2205196.1)
Last updated on AUGUST 04, 2018
Applies to: Linux OS - Version Oracle Linux 5.5 with Unbreakable Enterprise Kernel [2.6.32] to Oracle Linux 5.11 with Unbreakable Enterprise Kernel [2.6.32] [Release OL5U5 to OL5U11]
On an Exadata X3-2 platform running kernel 2.6.32-400.1.1.el5uek and OFED 1.5.5, RAC performance issues are experienced when running the LTE benchmark on 2 nodes with 8 LMS processes on each node.
Although client requests are distributed equally among all 8 LMS processes, one LMS process becomes busier than the others and turns into the performance bottleneck.
One LMS becomes over 90% busy, while the others are around 30% busy.
The busy LMS is not always the same process; it changes as the workload progresses.
The utime and stime fields in /proc/<pid>/stat reveal that the busy LMS (ospid 46154 in the following case) is spending far more time in the kernel than the other LMSs.
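The comparison described above can be sketched as a small Python script. This is only an illustration, not the procedure used in this note: the function names (parse_cpu_ticks, read_cpu_ticks, kernel_share) are hypothetical, and the sketch assumes the standard Linux /proc/<pid>/stat layout, in which utime is field 14 and stime is field 15, both in clock ticks.

```python
import sys


def parse_cpu_ticks(stat_line):
    """Return (utime, stime) in clock ticks from one /proc/<pid>/stat line."""
    # The comm field (field 2) is wrapped in parentheses and may itself
    # contain spaces, so split only the text after the last ')'.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field 14 is rest[11], field 15 is rest[12].
    return int(rest[11]), int(rest[12])


def read_cpu_ticks(pid):
    """Read utime and stime for a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_cpu_ticks(f.read())


def kernel_share(utime, stime):
    """Fraction of the process's CPU time that was spent in the kernel."""
    total = utime + stime
    return stime / total if total else 0.0


if __name__ == "__main__":
    # Usage (hypothetical): python lms_stat.py <lms_pid> [<lms_pid> ...]
    for arg in sys.argv[1:]:
        utime, stime = read_cpu_ticks(int(arg))
        share = kernel_share(utime, stime)
        print(f"pid={arg} utime={utime} stime={stime} kernel_share={share:.2f}")
```

Passing the OS process IDs of all LMS processes on a node prints one line per process; an LMS whose kernel_share is much higher than its peers matches the symptom described here. The counters are cumulative since process start, so sampling twice and comparing the differences gives a better view of the current behaviour.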