Makes Spark compact small files the way Hive does. Supports Parquet, ORC, and RC files. To enable it, set spark.sql.sources.commitProtocolClass=com.github.yantzu.compaction.CompactFilesCommitProtocol.
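
As a minimal sketch of how that setting might be applied from application code: the configuration key and class name below come from the description above, while the application name, the output path, the sample data, and the assumption that the plugin jar is already on the driver and executor classpath are illustrative only.

```scala
import org.apache.spark.sql.SparkSession

object CompactionExample {
  def main(args: Array[String]): Unit = {
    // Assumption: the plugin jar is already on the driver and executor classpath.
    val spark = SparkSession.builder()
      .appName("compaction-example") // hypothetical application name
      .config(
        "spark.sql.sources.commitProtocolClass",
        "com.github.yantzu.compaction.CompactFilesCommitProtocol")
      .getOrCreate()

    // Hypothetical job: 200 partitions would normally produce 200 small files.
    val df = spark.range(0, 1000000).repartition(200)

    // Hypothetical output path; Parquet is one of the supported formats.
    df.write.mode("overwrite").parquet("/tmp/compaction-example")

    spark.stop()
  }
}
```

The same key can presumably also be passed at launch time with `spark-submit --conf`, which keeps the setting out of the application code.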
yantzu/spark-compaction-plugin