
After Upgrade to BDA V4.5 from BDA V4.4 with Big Data SQL Installed Spark Jobs Fail Due to Reduced Container Memory and Reduced Cgroup Memory Hard Limit (Doc ID 2172153.1)

Last updated on JANUARY 02, 2020

Applies to:

Big Data Appliance Integrated Software - Version 4.5.0 to 4.5.0 [Release 4.5]
Linux x86-64


After upgrading to BDA V4.5 from BDA V4.4 with Big Data SQL installed, Spark jobs fail with an error like:

An error occurred while calling <STRING>.collectToPython.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task <TASK_NUM> in stage 130.0 failed 4 times, most recent failure: Lost task 117.3 in stage 130.0 (TID <TID>, <HOSTNAME0x>.<DOMAIN>): ExecutorLostFailure (executor 124 exited caused by one of the running tasks) Reason: Container killed by YARN for exceeding memory limits. 1.5 GB of 1.5 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
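For context on why the container limit is breached: Spark on YARN sizes each executor container as the executor memory plus an off-heap overhead, which in the Spark 1.x releases of this era defaults to roughly max(384 MiB, 10% of executor memory). A minimal sketch of that arithmetic (the formula and values are illustrative assumptions, not taken from this note):

```python
# Hedged sketch of YARN container sizing for a Spark executor.
# Assumes the Spark 1.x default: overhead = max(384 MiB, 10% of executor memory).
# All values are in MiB.

def executor_container_request(executor_mem_mib, overhead_mib=None):
    """Total physical memory YARN enforces for one executor container."""
    if overhead_mib is None:
        overhead_mib = max(384, int(executor_mem_mib * 0.10))
    return executor_mem_mib + overhead_mib

# With a 1 GiB executor and default overhead the container is 1408 MiB, so
# usage approaching the "1.5 GB of 1.5 GB" seen in the error can kill the task.
print(executor_container_request(1024))        # default overhead -> 1408
# Boosting spark.yarn.executor.memoryOverhead, as the error message suggests,
# raises the enforced limit:
print(executor_container_request(1024, 1024))  # explicit 1 GiB overhead -> 2048
```

This is why the error message recommends boosting spark.yarn.executor.memoryOverhead: a larger overhead raises the container limit YARN enforces for the same executor heap size.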

Additional symptoms:

1. Container memory on the nodes was reduced during the upgrade, as shown below:

Property                    Value Before Upgrade    Value After Upgrade
------------------------    --------------------    -------------------
Container Memory            35328 MiB               17664 MiB
Cgroup Memory Hard Limit    35328 MiB               17664 MiB

The changes can be seen in Cloudera Manager (CM) by navigating to: yarn > Configuration > History and Rollback.
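While the full solution in this note requires sign-in, the error message itself points at one common interim mitigation: increasing the executor memory overhead when submitting the job. A hedged sketch (the overhead value, executor size, and job file are placeholder assumptions, not values from this note):

```shell
# Illustrative only: raise the off-heap overhead YARN accounts for per executor.
# 1024 MiB and your_job.py are placeholders; tune the value for your workload.
spark-submit \
  --master yarn \
  --executor-memory 1g \
  --conf spark.yarn.executor.memoryOverhead=1024 \
  your_job.py
```

Note that this works around the symptom only; restoring the reduced Container Memory and Cgroup Memory Hard Limit values in CM addresses the underlying configuration change.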



In this Document
 Proactive Solution
 Reactive Solution
