Skip to content

Latest commit

 

History

History
33 lines (20 loc) · 860 Bytes

File metadata and controls

33 lines (20 loc) · 860 Bytes

GPTAugmentation

GPTAugmentation is the github repository for the paper "Generative approach to NLP data augmentation using GPT-Neo"

Data Folder -clean_data: contains data from all augmentation methods after cleaning

-eda: raw eda data before cleaning
-gen_data: raw gpt data before cleaning 
-subsets: subsets used for generating data augmentation

Imgs Folder for result graphs

Results Csv files with results

How to clean data: -use functions in clean_data.py

How to generate augmented data: -run eda.py in modules folder for eda data -run bert.py in modules folder for bert data -run gpt_samples.py in modules to generate gpt samples

How to get results: -run run_model.py for BERT and GPT results which will be saved in results and imgs folders -run run_model_eda.py for EDA results which will be saved in results and imgs folders