On BDA V4.x and Higher YARN Shuffle Tasks May Contribute to Cluster Instability Related to Journal Nodes/Zookeeper Servers
(Doc ID 2444461.1)
Last updated on JULY 20, 2024
Applies to:
Big Data Appliance Integrated Software - Version 4.5.0 and later
Linux x86-64
Goal
If you are seeing cluster instability related to Journal Nodes/Zookeeper servers, the competing IO workload may be (at least partly) due to YARN Shuffle tasks, which use "yarn.nodemanager.local-dirs"/"yarn.nodemanager.log-dirs" for storing intermediate data. In BDA/BDCS clusters, these parameters include /u01 and /u02 on the Journal Node and Zookeeper hosts. In the case of performance or instability issues related to Zookeeper servers or Journal Nodes, try applying the workaround for this issue, which is to:
- Remove /u01 and /u02 from the list of "yarn.nodemanager.local-dirs" on the Zookeeper/JournalNode hosts (typically the first 3 hosts) by adding overrides for these hosts in Cloudera Manager configuration.
- Remove /u01 and /u02 from the list of "yarn.nodemanager.log-dirs" on the Zookeeper/JournalNode hosts (typically the first 3 hosts) by adding overrides for these hosts in Cloudera Manager configuration.
This is tracked in an internal bug and there are plans to fix it in a future release.
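As a rough illustration of the workaround above, the following Python sketch derives an override value by dropping every entry under /u01 and /u02 from a comma-separated directory list. The helper name `strip_mounts` and the sample directory paths are hypothetical and used only for illustration; check the actual values of "yarn.nodemanager.local-dirs" and "yarn.nodemanager.log-dirs" in Cloudera Manager before applying any override.

```python
def strip_mounts(dir_list, excluded_mounts=("/u01", "/u02")):
    """Return a comma-separated directory list without entries on the excluded mounts.

    Hypothetical helper for illustration; it only manipulates strings and
    does not read or change any cluster configuration.
    """
    kept = []
    for entry in dir_list.split(","):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty items such as a trailing comma
        # Drop the mount point itself or anything beneath it,
        # while leaving similarly named mounts (e.g. /u010) alone.
        if any(entry == m or entry.startswith(m + "/") for m in excluded_mounts):
            continue
        kept.append(entry)
    return ",".join(kept)


# Assumed sample value; the real list on a given cluster may differ.
current_local_dirs = (
    "/u01/hadoop/yarn/nm,/u02/hadoop/yarn/nm,"
    "/u03/hadoop/yarn/nm,/u04/hadoop/yarn/nm"
)

override_local_dirs = strip_mounts(current_local_dirs)
print(override_local_dirs)
```

The resulting string is the kind of value that would be entered as the override for the Zookeeper/JournalNode hosts only; the same approach applies to "yarn.nodemanager.log-dirs". The other hosts keep the full directory list.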
For additional background and details see both:
- Understanding the Oracle Big Data Appliance V3.* and Higher Service Role Layout for NameNodes, DataNodes, Yarn services , Journal Nodes and Zookeeper (Doc ID 2045179.1).
- Oracle Big Data Appliance Frequently Asked Questions v2.x and Higher (FAQ) (Doc ID 1518525.1), the Q/A on: "Similar to the previous question why is the Zookeeper data directory configured to be on the root ("/") mount point?"
Solution
In this Document
Goal
Solution
References