
How To Merge Small Parquet Files into Larger Files (Doc ID 2263910.1)

Last updated on JANUARY 14, 2020

Applies to:

Big Data Appliance Integrated Software - Version 4.5.0 and later
Linux x86-64


This note addresses how to merge the small Parquet files that Spark creates into larger ones.

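By default, Spark SQL uses 200 shuffle partitions (the spark.sql.shuffle.partitions setting), which gives 200 reducers, and each reducer writes its own output file. A job that shuffles therefore tends to produce 200 small files.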

Hive can combine small ORC files into larger ones, but that approach does not work for Parquet files.

For example, for ORC you can use:

ALTER TABLE table_name [PARTITION (partition_key = 'partition_value')] CONCATENATE
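
For Parquet, one commonly used workaround is to read the small files back with Spark and rewrite them with fewer partitions. The following is only a minimal sketch of that idea, not necessarily the solution this note goes on to describe. It assumes an existing SparkSession is passed in as `spark`; the helper names `target_file_count` and `compact_parquet`, the paths, and the 128 MB target size are all hypothetical choices made for illustration.

```python
import math


def target_file_count(total_bytes, target_file_bytes=128 * 1024 * 1024):
    """Return how many output files keep each file near target_file_bytes.

    The 128 MB default is an assumption for illustration, not a value
    taken from this note.
    """
    if total_bytes < 0 or target_file_bytes <= 0:
        raise ValueError("sizes must be non-negative, target must be positive")
    # Always write at least one file, even for an empty or tiny input.
    return max(1, math.ceil(total_bytes / target_file_bytes))


def compact_parquet(spark, src_path, dst_path, num_files):
    """Rewrite the Parquet data under src_path into about num_files files.

    `spark` is assumed to be an existing pyspark.sql.SparkSession.
    dst_path should differ from src_path, because Spark reads lazily and
    should not overwrite the directory it is still reading from.
    """
    df = spark.read.parquet(src_path)
    # coalesce() reduces the partition count without a full shuffle;
    # each remaining partition is typically written as one file.
    df.coalesce(num_files).write.mode("overwrite").parquet(dst_path)
```

As a usage example, 200 MB of small files with the default target gives `target_file_count(200 * 1024 * 1024) == 2`, so `compact_parquet(spark, src, dst, 2)` would rewrite the data as roughly two files. After verifying the output, the new directory can replace the original one.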


