We want our cleaning script to:
1) Take in a chunk of news files (one or more).
2) Treat all news articles as equals.
3) Create an initial (unfiltered) vocabulary dictionary.
This dictionary should be like:
{"a_word": number_of_occurrences, ... }
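The requirements above could be sketched roughly as follows. This is only a minimal illustration, not the actual cleaning script: the function names `count_words` and `build_vocabulary` are hypothetical, and it assumes each news file is plain UTF-8 text and that a "word" is a lowercased run of letters (optionally with one internal apostrophe). Counts from every file are pooled into one dictionary, so no article is weighted differently from another, and nothing is filtered out yet.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: a word is a run of ASCII letters, optionally with one
# internal apostrophe (e.g. "don't"). Real news data may need more care.
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def count_words(text):
    """Return a Counter of lowercased word occurrences in one text."""
    return Counter(TOKEN_RE.findall(text.lower()))


def build_vocabulary(paths):
    """Build an unfiltered {word: number_of_occurrences} dict.

    `paths` is an iterable of one or more news file paths. All files
    contribute to the same pooled counts, with no per-file weighting.
    """
    vocabulary = Counter()
    for path in paths:
        # Assumption: files are plain UTF-8 text.
        text = Path(path).read_text(encoding="utf-8")
        vocabulary.update(count_words(text))
    return dict(vocabulary)
```

Calling `build_vocabulary(["news_1.txt", "news_2.txt"])` (hypothetical file names) would return a plain dictionary in the shape described above, ready for any later filtering step.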