The goal here should be:

* [x] List all the data sets that are needed
* [x] Give instructions on how to access the data
* [ ] Give instructions on how to determine the file sizes for all the data sets