How To Merge Small Parquet Files into Larger Files
(Doc ID 2263910.1)
Last updated on JANUARY 14, 2020
Applies to: Big Data Appliance Integrated Software - Version 4.5.0 and later
The question raised here is how to merge the small Parquet files created by Spark into larger ones.
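By default, Spark uses 200 shuffle partitions (the default value of spark.sql.shuffle.partitions), so a job that shuffles its data runs 200 reducers and in turn writes 200 small files.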
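As an illustration of the one-file-per-partition behaviour described above, the following is a minimal PySpark sketch of one common way to reduce the file count: read the small files and rewrite them with fewer partitions. It is not necessarily the solution this note goes on to give. The helper names target_file_count and compact_parquet, the 128 MiB target size, and the paths in the usage note are assumptions made for the example; only spark.read.parquet, coalesce, and the DataFrame writer calls are standard Spark API.

```python
import math


def target_file_count(total_bytes: int,
                      target_file_bytes: int = 128 * 1024 * 1024) -> int:
    """Return how many output files give roughly target_file_bytes each.

    The 128 MiB default is an assumption (a common HDFS block size),
    not a value taken from this note.
    """
    if total_bytes < 0 or target_file_bytes <= 0:
        raise ValueError("sizes must be non-negative and target must be positive")
    # Always write at least one file, even for an empty or tiny dataset.
    return max(1, math.ceil(total_bytes / target_file_bytes))


def compact_parquet(spark, src_path: str, dst_path: str, num_files: int) -> None:
    """Rewrite the Parquet data under src_path as at most num_files files.

    `spark` is an existing SparkSession. dst_path should differ from
    src_path, because overwriting a directory while reading from it is unsafe.
    """
    df = spark.read.parquet(src_path)
    # coalesce() only merges existing partitions and avoids a full shuffle;
    # df.repartition(num_files) would give more evenly sized files at the
    # cost of a shuffle.
    df.coalesce(num_files).write.mode("overwrite").parquet(dst_path)
```

With a running SparkSession, a call might look like compact_parquet(spark, "/data/events", "/data/events_compacted", target_file_count(total_size_in_bytes)), where both paths are hypothetical and total_size_in_bytes is the measured size of the source directory.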
A solution is available for combining small ORC files into larger ones, but it does not work for Parquet files.
For example for ORC you can use: