Executing a Spark Job on BDA V4.5 (Spark-on-Yarn) Fails with "org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow"

(Doc ID 2143437.1)

Last updated on MAY 31, 2016

Applies to:

Big Data Appliance Integrated Software - Version 4.5.0 and later
Linux x86-64

Symptoms

Executing a Spark job on BDA V4.5/CDH 5.7.0 (Spark-on-Yarn) fails with "org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow". The output looks like this:

6 WARN scheduler.TaskSetManager: Lost task 0.3 in stage 2.0 (TID *, bdanode0x.example.com): org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 0, required: 3. To avoid this, increase spark.kryoserializer.buffer.max value.
  at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:299)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:240)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
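
The warning itself names the setting to raise: spark.kryoserializer.buffer.max. As a minimal sketch of acting on that hint, the Python helper below assembles a spark-submit command line that passes a larger value with --conf. The helper name spark_submit_command, the application file my_job.py, and the value 256m are assumptions chosen for illustration and are not taken from this note. Spark's default for this property is 64m, and it must stay below 2048m; the right value depends on the largest object the job serializes.

```python
import shlex


def spark_submit_command(app, buffer_max="256m", extra_conf=None):
    """Build a spark-submit argument list that raises the Kryo buffer ceiling.

    app        -- path to the application (hypothetical, e.g. "my_job.py")
    buffer_max -- value for spark.kryoserializer.buffer.max (illustrative)
    extra_conf -- optional dict of further Spark properties
    """
    conf = {"spark.kryoserializer.buffer.max": buffer_max}
    conf.update(extra_conf or {})

    # Spark-on-Yarn, as in the environment described above.
    args = ["spark-submit", "--master", "yarn"]
    # Emit properties in sorted order so the command is reproducible.
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


cmd = spark_submit_command("my_job.py")
print(" ".join(shlex.quote(a) for a in cmd))
# prints: spark-submit --master yarn --conf spark.kryoserializer.buffer.max=256m my_job.py
```

The same property can also be set cluster-wide in spark-defaults.conf or on a SparkConf object in the application; passing it per job with --conf keeps the change scoped to the job that fails.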

 

Cause
