
On BDA V4.x and Higher YARN Shuffle Tasks May Contribute to Cluster Instability Related to Journal Nodes/Zookeeper Servers (Doc ID 2444461.1)

Last updated on APRIL 17, 2023

Applies to:

Big Data Appliance Integrated Software - Version 4.5.0 and later
Linux x86-64

Goal

If you are seeing cluster instability related to Journal Nodes/Zookeeper servers, the competing I/O workload may be due, at least in part, to YARN Shuffle tasks, which store intermediate data in the directories listed in "yarn.nodemanager.local-dirs" and "yarn.nodemanager.log-dirs". In BDA/BDCS clusters, these parameters include /u01 and /u02 on the Journal Node and Zookeeper hosts. If you encounter performance or instability issues related to Zookeeper servers or Journal Nodes, try applying the following workaround:

  1. Remove /u01 and /u02 from the list of "yarn.nodemanager.local-dirs" on the Zookeeper/JournalNode hosts (typically the first 3 hosts) by adding overrides for these hosts in Cloudera Manager configuration.
  2. Remove /u01 and /u02 from the list of "yarn.nodemanager.log-dirs" on the Zookeeper/JournalNode hosts (typically the first 3 hosts) by adding overrides for these hosts in Cloudera Manager configuration.
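As an illustration of the two steps above, the following minimal Python sketch derives an override value by dropping every entry on /u01 or /u02 from a comma-separated directory list. The helper name strip_excluded_dirs and the sample paths are hypothetical and not taken from this note. Check the actual values of "yarn.nodemanager.local-dirs" and "yarn.nodemanager.log-dirs" in Cloudera Manager before applying any override.

```python
from pathlib import PurePosixPath

# Mount points to exclude on the Zookeeper/JournalNode hosts (per the steps above).
EXCLUDED_MOUNTS = ("/u01", "/u02")


def strip_excluded_dirs(value, excluded=EXCLUDED_MOUNTS):
    """Return a comma-separated directory list without entries on excluded mounts.

    Hypothetical helper for illustration; not part of any Oracle or Cloudera tool.
    """
    kept = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        path = PurePosixPath(entry)
        # Match the mount point itself or anything beneath it,
        # without also matching look-alikes such as /u010.
        on_excluded = any(
            path == PurePosixPath(mount) or PurePosixPath(mount) in path.parents
            for mount in excluded
        )
        if not on_excluded:
            kept.append(entry)
    return ",".join(kept)


# Hypothetical current value; the real paths on a given cluster may differ.
current_local_dirs = (
    "/u01/hadoop/yarn/nm,/u02/hadoop/yarn/nm,"
    "/u03/hadoop/yarn/nm,/u10/hadoop/yarn/nm"
)
override_local_dirs = strip_excluded_dirs(current_local_dirs)
print(override_local_dirs)
```

The same function can be applied to the "yarn.nodemanager.log-dirs" value. The resulting string is what would be entered as the host-level override for the Zookeeper/JournalNode hosts only. The other hosts keep the full list.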

This is tracked in an internal bug and there are plans to fix it in a future release.

For additional background and details see both:

Solution


