Running a Spark Job From a Non-BDA Edge Node Fails With: "ERROR shuffle.RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks" (Doc ID 2498643.1)

Last updated on JULY 20, 2024

Applies to:

Big Data Appliance Integrated Software - Version 4.7.0 and later
Linux x86-64

Symptoms

When a spark2 job is run from an edge node that is not part of the BDA cluster, it fails with an error like the following:

ERROR shuffle.RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks
java.io.IOException: Failed to connect to bdahostname.<DOMAIN>/<IP Address of BDA node>:<port number>

If the same job is run entirely on the BDA cluster (without the edge node), it succeeds. The job also succeeds when it is submitted from the edge node with the "--deploy-mode cluster" option, for example:

spark2-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster /opt/cloudera/parcels/CDH/lib/spark//lib/<your sparkapplication>
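The error line names a BDA host, its IP address, and a port. As a rough diagnostic, the sketch below extracts that endpoint from a log line and tries a plain TCP connection to it from the edge node. It is only an illustration and not part of the documented solution: the function names and the log-line pattern are assumptions based on the error text shown above, and the pattern assumes an IPv4 address. Executor ports are normally assigned at random for each job, so a refused connection after the job has ended may only mean that the executor is gone.

```python
import re
import socket

# Matches the "Failed to connect to <host>/<ip>:<port>" part of the error line.
# Assumption: the address is IPv4, as in the error text shown above.
FAILED_CONNECT = re.compile(
    r"Failed to connect to (?P<host>[^/\s]+)/(?P<ip>[^:\s]+):(?P<port>\d+)"
)


def parse_failed_endpoint(log_line):
    """Return (host, ip, port) from a 'Failed to connect to' line, or None."""
    match = FAILED_CONNECT.search(log_line)
    if match is None:
        return None
    return match.group("host"), match.group("ip"), int(match.group("port"))


def can_connect(ip, port, timeout=3.0):
    """Return True if a TCP connection to ip:port succeeds within the timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False
```

To use it, pass the failing log line to parse_failed_endpoint, then call can_connect with the returned IP address and port while the job is still running on the cluster.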

Cause


In this Document

Symptoms

Cause

Solution

Steps to Setup the iptables Rules

Steps to Revert the iptables Rules
