This repo contains the files for the Graduate Project for DATA 200 at Berkeley.
- `/analysis` contains all Jupyter notebooks used for data processing, modeling, and analysis.
  - files named `initial_*` are uncleaned initial data exploration
  - a subfolder `/analysis_figures` contains any photos included in the analysis notebooks
- `/data` contains all initial and processed data
- `/figures` contains all figures created in the analysis notebooks that were saved for the report (`plt.savefig` was used to save them)
- `/narrative` contains the `.tex` files used to compile the report, along with `report.pdf`, which is the final write-up
Since some notebooks process data and then save the output to be used in other notebooks, the data processing pipeline is included below to enable replication:
1. `covid_data_processing.ipynb`, `vaccine_data_processing.ipynb`, and `weather_data_processing.ipynb` read and process the initial data
2. `data_merge_and_eda.ipynb` combines the data output by the notebooks above and performs some exploratory data analysis
3. `baseline_modeling.ipynb`, `weather_modeling.ipynb`, and `death_rate_models.ipynb` perform the data analysis described in the report using the aggregated data from the previous step
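
The order above could be scripted rather than run by hand. The following is a minimal sketch, not part of this repo: it assumes the notebooks live in `analysis/`, that `jupyter` with `nbconvert` is installed and on the `PATH`, and that each notebook can be executed from its own directory. The names `PIPELINE_STAGES`, `build_commands`, and `run_pipeline` are hypothetical helpers introduced for illustration.

```python
import subprocess
from pathlib import Path

# Notebook names are taken from the pipeline description in this README.
# Each inner list is one stage; stages must run in this order.
PIPELINE_STAGES = [
    [
        "covid_data_processing.ipynb",
        "vaccine_data_processing.ipynb",
        "weather_data_processing.ipynb",
    ],
    ["data_merge_and_eda.ipynb"],
    [
        "baseline_modeling.ipynb",
        "weather_modeling.ipynb",
        "death_rate_models.ipynb",
    ],
]


def build_commands(analysis_dir="analysis"):
    """Return one `jupyter nbconvert` command per notebook, in pipeline order."""
    commands = []
    for stage in PIPELINE_STAGES:
        for name in stage:
            notebook = str(Path(analysis_dir) / name)
            commands.append(
                # --execute runs the notebook; --inplace writes outputs back to it
                ["jupyter", "nbconvert", "--to", "notebook",
                 "--execute", "--inplace", notebook]
            )
    return commands


def run_pipeline(analysis_dir="analysis"):
    """Execute every notebook in order, stopping at the first failure."""
    for command in build_commands(analysis_dir):
        subprocess.run(command, check=True)
```

Calling `run_pipeline()` from the repo root would then execute all seven notebooks in sequence. Because `check=True` raises on the first failed notebook, later stages never run against missing or stale intermediate data.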