Code for running robust and repeatable LDA experiments. Use the -h flag to view CLI parameters for any script.
- lda: Files related to the training and analysis of LDA topic models
- dlda: Files related to the training and analysis of dynamic topic models (using gensim's ldaseq implementation)
- list_common_words.py: Takes an experiment config file as a command line argument, runs all specified preprocessing, and lists the top 50 words in the dataset that will be used in that experiment
- plot_data_quants.py: Driver function that uses a TextParser to plot the quantity of data in each time frame (especially useful for deciding time intervals for a dynamic topic model)
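
As a rough illustration of what the two helper scripts compute, the sketch below counts the most frequent words in a small corpus and tallies documents per year. It does not use the ogm package or its TextParser; the function names `top_words` and `docs_per_year`, the whitespace tokenization, and the yearly time frame are all assumptions made for this example.

```python
from collections import Counter
from datetime import date


def top_words(documents, n=50, stopwords=frozenset()):
    """Return the n most frequent tokens across all documents.

    Hypothetical helper: tokenization here is a plain lowercase
    whitespace split, standing in for whatever preprocessing an
    experiment config would specify.
    """
    counts = Counter()
    for doc in documents:
        tokens = doc.lower().split()
        counts.update(tok for tok in tokens if tok not in stopwords)
    return counts.most_common(n)


def docs_per_year(dates):
    """Return (year, document count) pairs sorted by year.

    Hypothetical helper: a yearly bucket is assumed; other time
    frames would need a different grouping key.
    """
    counts = Counter(d.year for d in dates)
    return sorted(counts.items())


if __name__ == "__main__":
    docs = ["the cat sat", "the cat ran", "a cat sat"]
    print(top_words(docs, n=2, stopwords={"the", "a"}))

    dates = [date(2019, 1, 5), date(2019, 6, 1), date(2020, 3, 2)]
    print(docs_per_year(dates))
```

Per-period counts like these are the kind of quantity one would inspect before choosing the time slices for a dynamic topic model, since very sparse periods may need to be merged.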
Install our ogm package and its dependencies.