Pyspark Run on BDA Edge Nodes Fails in Distributed Mode but Works in Local Mode (Doc ID 2068501.1)

Last updated on NOVEMBER 19, 2015

Applies to:

Big Data Appliance Integrated Software - Version 4.2.0 and later
Linux x86-64

Symptoms

Launching a Pyspark job from a BDA edge node in distributed mode fails with the following error, while the same job succeeds in local mode:

Error from python worker:
python: module pyspark.daemon not found
PYTHONPATH was:
/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/jars/spark-assembly-1.3.0-cdh5.4.0-hadoop2.6.0-cdh5.4.0.jar
java.io.EOFException
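
The error above reports that the only PYTHONPATH entry seen by the Python worker was the Spark assembly jar. As a rough diagnostic sketch (not taken from this note), the following Python checks whether each entry of a PYTHONPATH string appears to contain pyspark/daemon.py, either as a directory or as a zip/jar archive. The function names provides_pyspark_daemon and check_pythonpath are hypothetical, and the check only tests for the file's presence; it does not prove that the Python importer on a worker node can actually load the module from that entry.

```python
import os
import zipfile


def provides_pyspark_daemon(entry):
    """Return True if a PYTHONPATH entry appears to contain pyspark/daemon.py.

    Hypothetical helper: handles plain directories and zip-format archives
    (a .jar is a zip archive). Presence of the file is all that is checked.
    """
    if os.path.isdir(entry):
        return os.path.isfile(os.path.join(entry, "pyspark", "daemon.py"))
    if os.path.isfile(entry) and zipfile.is_zipfile(entry):
        with zipfile.ZipFile(entry) as archive:
            return "pyspark/daemon.py" in archive.namelist()
    # Missing path or unsupported file type.
    return False


def check_pythonpath(pythonpath):
    """Map each non-empty PYTHONPATH entry to the result of the check above."""
    return {
        entry: provides_pyspark_daemon(entry)
        for entry in pythonpath.split(os.pathsep)
        if entry
    }


if __name__ == "__main__":
    # Inspect the PYTHONPATH of the current process, if one is set.
    for entry, found in check_pythonpath(os.environ.get("PYTHONPATH", "")).items():
        print("%s: %s" % (entry, "pyspark/daemon.py found" if found else "not found"))
```

Run on a worker node, the script could be given the PYTHONPATH value copied from the error message to see whether the listed jar exists there and contains the module.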

Cause
