This is my first project in a Cloudera QuickStart container. It takes a low-level approach to ingesting multiple large (100 GB) files and combining them in HDFS. It uses multiprocessing to run quickly, and peak memory usage is 7-8 GB.
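
The description above does not show how the combining is done, so the following is only a minimal sketch of one way to merge large files in parallel while keeping memory bounded. It is not the project's actual code. It assumes the files are first combined on local disk, and every name in it (`combine_files`, `copy_into_slot`, `CHUNK_SIZE`) is hypothetical. Each worker process streams one source file in fixed-size chunks into its own byte range of a pre-sized output file, so memory use is roughly the number of workers times the chunk size, regardless of file size.

```python
import multiprocessing as mp
import os

# Hypothetical default: 64 MiB per read. Peak memory is roughly
# workers * chunk_size, independent of how large the files are.
CHUNK_SIZE = 64 * 1024 * 1024


def copy_into_slot(job):
    """Stream one source file into its reserved byte range of the output."""
    src, dst, offset, chunk_size = job
    copied = 0
    with open(src, "rb") as reader, open(dst, "r+b") as writer:
        writer.seek(offset)
        while True:
            chunk = reader.read(chunk_size)
            if not chunk:
                break
            writer.write(chunk)
            copied += len(chunk)
    return copied


def combine_files(sources, dst, workers=4, chunk_size=CHUNK_SIZE):
    """Concatenate `sources` into `dst` in order, one process per file."""
    sizes = [os.path.getsize(src) for src in sources]
    # Each file starts where the previous ones end.
    offsets = [sum(sizes[:i]) for i in range(len(sizes))]

    # Pre-size the output so every worker can seek to its own offset.
    with open(dst, "wb") as out:
        out.truncate(sum(sizes))

    jobs = [(src, dst, off, chunk_size) for src, off in zip(sources, offsets)]
    with mp.Pool(processes=workers) as pool:
        copied = pool.map(copy_into_slot, jobs)
    return sum(copied)
```

Under these assumptions, the combined local file could then be loaded with `hdfs dfs -put <local file> <hdfs path>`. HDFS files do not support writes at arbitrary offsets, which is why this sketch stages the merge on local disk rather than writing ranges into HDFS directly.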