Oracle Linux: BDA Nodes Experiencing Performance Slowdowns, Network Timeouts and Command Hangs
(Doc ID 2582629.1)
Last updated on SEPTEMBER 21, 2021
Applies to:
Linux OS - Version Oracle Linux 7.5 with Unbreakable Enterprise Kernel [4.1.12] and later
Information in this document applies to any platform.
Symptoms
After a BDA patch update that upgrades Oracle Linux 6.9 (64-bit) with UEK3 to Oracle Linux 7.5 with UEK4, all cluster nodes experience performance slowdowns and network timeouts or other network-related issues, particularly under a high workload from any application.
Some of the symptoms are:
- Commands like the ones below hang or take too long to respond:
# mount
# df -h
# id user
- The system logs NFS connection failure errors in the /var/log/messages file, like the one below:
Jul 20 05:01:46 hostname kernel: nfs: server 000.000.00.00 not responding, still trying
- Application jobs or tasks fail (in particular, the BDA nodes run Cloudera's YARN jobs).
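The symptoms listed above could be checked with a small shell sketch like the one below. It is not part of the original note: the function names count_nfs_timeouts and check_hang are hypothetical, and the sketch assumes that GNU coreutils timeout is installed and that kernel messages are written to /var/log/messages in the format of the sample log line.

```shell
#!/bin/sh
# Illustrative sketch only; helper names are hypothetical, not Oracle tooling.

# Count kernel "nfs: server ... not responding" lines in the given log file.
count_nfs_timeouts() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # there are no matches, so "|| true" keeps the function from failing.
    grep -c 'kernel: nfs: server .* not responding' "$1" 2>/dev/null || true
}

# Run a command with a time limit given as a plain number of seconds and
# report whether it returned in time. GNU timeout exits with status 124
# when the limit is reached.
check_hang() {
    limit="$1"
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "HANG: $* exceeded ${limit}s"
    else
        echo "OK: $* returned (exit $status)"
    fi
}
```

For example, running count_nfs_timeouts /var/log/messages followed by check_hang 10 df -h on an affected node would show how many NFS timeouts were logged and whether df returns within 10 seconds. A command blocked on an unresponsive NFS mount may not always stop when the limit expires, so treat the output as a hint rather than a diagnosis.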
The cluster nodes are connected to each other through InfiniBand network interfaces, and the cluster also communicates with ZFS over the InfiniBand network.
Changes
Nodes upgraded from Oracle Linux 6.9 with UEK R3 to Oracle Linux 7.5 with UEK R4 (4.1.12-124.14.1.el7uek.x86_64).
Cause
In this Document
Symptoms
Changes
Cause
Solution
References