This project tackles 4-way news topic classification on the AGNews dataset, with the classes World, Sports, Business, and Sci/Tech. It contains the three deliverables required for Natural Language Processing (WBAI059-05).
- Teun Boersma (s5195179)
- Julian Sprietsma (s5096219)
- Marcus Harald Olof Persson (s5343798)
This project uses uv for dependency and environment management.
- Clone the project.
- Create a copy of `example.config.yaml` and rename it to `config.yaml`. `hf_token` is not required.
- Synchronise the project: `uv sync`
- Run the project: `uv run main.py [--assignment] [--functionality]`
  - `--assignment`: The assignment to run. Dependent on `functionality`.
  - `--functionality`: The functionality to run. Dependent on `assignment`.
This project uses a CLI built with rich to give a clean interface and to make the project usable out of the box immediately after cloning. Through the CLI, you can run every required deliverable of each assignment.
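The two flags described above can be handled with a small dispatch table. The following is a minimal sketch only, assuming the flags take integer values as in the commands listed in this README; `DELIVERABLES`, `build_parser`, and `resolve` are hypothetical names for illustration and are not taken from the project's `main.py`, which may be organised differently.

```python
import argparse

# Hypothetical registry mapping (assignment, functionality) to a deliverable.
# The pairs and labels mirror the commands listed in this README.
DELIVERABLES = {
    (1, 1): "Train and Evaluate Base Models",
    (1, 2): "Perform SVM Grid Search",
    (1, 3): "Analyze Errors on Models",
    (2, 1): "Examine Word Similarity",
    (2, 2): "Train and Evaluate CNN Model",
    (2, 3): "Train and Evaluate LSTM Model",
    (2, 4): "Analyze Errors",
    (2, 5): "Ablation Study on Sequence Length",
    (3, 1): "Finetune and Evaluate DistilBERT",
    (3, 2): "Robustness Evaluation",
    (3, 3): "Analyze Errors",
}


def build_parser():
    """Build a parser with the two optional flags from the README."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--assignment", type=int, default=None,
                        help="The assignment to run.")
    parser.add_argument("--functionality", type=int, default=None,
                        help="The functionality to run.")
    return parser


def resolve(argv):
    """Return the selected deliverable label, or None if no flags were given."""
    args = build_parser().parse_args(argv)
    if args.assignment is None and args.functionality is None:
        # Assumption: without flags, the caller falls back to an interactive menu.
        return None
    key = (args.assignment, args.functionality)
    if key not in DELIVERABLES:
        # Covers unknown pairs and the case where only one flag was supplied.
        raise SystemExit(f"Unknown combination: {key}")
    return DELIVERABLES[key]


if __name__ == "__main__":
    print(resolve(["--assignment", "1", "--functionality", "2"]))
```

Keying the table on the pair reflects that the meaning of each flag depends on the other: functionality 1 refers to a different deliverable under each assignment.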
You can run the project as is by following the above instructions. The commands below run specific deliverables and are listed for quick reference.
- Train and Evaluate Base Models: `uv run main.py --assignment 1 --functionality 1`
- Perform SVM Grid Search: `uv run main.py --assignment 1 --functionality 2`
- Analyze Errors on Models: `uv run main.py --assignment 1 --functionality 3`
- Examine Word Similarity: `uv run main.py --assignment 2 --functionality 1`
- Train and Evaluate CNN Model: `uv run main.py --assignment 2 --functionality 2`
- Train and Evaluate LSTM Model: `uv run main.py --assignment 2 --functionality 3`
- Analyze Errors: `uv run main.py --assignment 2 --functionality 4`
- Ablation Study on Sequence Length: `uv run main.py --assignment 2 --functionality 5`
- Finetune and Evaluate DistilBERT: `uv run main.py --assignment 3 --functionality 1`
- Robustness Evaluation: `uv run main.py --assignment 3 --functionality 2`
- Analyze Errors: `uv run main.py --assignment 3 --functionality 3`
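
To run every deliverable in sequence, the commands in the list above can be generated rather than typed out. This is a hedged sketch, not part of the project: `FUNCTIONALITIES`, `build_commands`, and `run_all` are hypothetical helpers, and the counts per assignment are read off the list above. By default it only prints the commands; passing `dry_run=False` would execute them, which may take a long time for the training deliverables.

```python
import subprocess

# Number of functionalities per assignment, taken from the list above.
FUNCTIONALITIES = {1: 3, 2: 5, 3: 3}


def build_commands():
    """Build one `uv run main.py` command per (assignment, functionality) pair."""
    return [
        ["uv", "run", "main.py",
         "--assignment", str(assignment),
         "--functionality", str(functionality)]
        for assignment, count in FUNCTIONALITIES.items()
        for functionality in range(1, count + 1)
    ]


def run_all(dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the batch at the first failing deliverable.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```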