How to Merge Small Parquet Files into Larger Files
(Doc ID 2263910.1)
Last updated on MAY 09, 2017
Applies to: Big Data Appliance Integrated Software - Version 4.5.0 and later
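The question raised here is how to merge the small Parquet files created by Spark into larger ones.
By default, Spark SQL uses 200 shuffle partitions (the spark.sql.shuffle.partitions setting), so a job with a shuffle stage runs 200 reduce tasks and can write up to 200 small files.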
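One common way to get fewer, larger Parquet files is to reduce the number of partitions before writing. The following is a minimal PySpark sketch of that general approach, not necessarily the solution this note describes. The function names, the paths, and the 128 MB target file size are assumptions chosen for illustration; the caller is assumed to pass in an existing SparkSession.

```python
import math

# Assumed target size per output file (128 MB); tune for your cluster.
DEFAULT_TARGET_BYTES = 128 * 1024 * 1024


def target_file_count(total_bytes, target_file_bytes=DEFAULT_TARGET_BYTES):
    """Return how many output files give roughly target_file_bytes each."""
    if total_bytes < 0 or target_file_bytes <= 0:
        raise ValueError("sizes must be non-negative / positive")
    # Always write at least one file, even for an empty or tiny dataset.
    return max(1, math.ceil(total_bytes / target_file_bytes))


def compact_parquet(spark, src_path, dst_path, num_files):
    """Rewrite the Parquet data at src_path into num_files files at dst_path.

    `spark` is an existing SparkSession supplied by the caller.
    dst_path should differ from src_path: overwriting a directory
    while it is still being read from is not safe.
    """
    df = spark.read.parquet(src_path)
    # repartition() shuffles the data into exactly num_files partitions;
    # coalesce() avoids the shuffle but can leave unevenly sized files.
    df.repartition(num_files).write.mode("overwrite").parquet(dst_path)
```

For example, about 1 GB of small files at the assumed 128 MB target gives target_file_count(1024 ** 3) == 8, and compact_parquet(spark, "/data/in", "/data/out", 8) would then write eight files (both paths are hypothetical).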
There is a solution available for combining small ORC files into larger ones, but it does not work for Parquet files.
For example, for ORC you can use: