
Option B: Focused DistilBERT Exploration

  • Student: Martynas Prascevicius
  • Student ID: 001263199
  • Course: COMP1818 Artificial Intelligence Applications


Summary

  • Total Experiments: 11
  • Training Time: ~16 hours total (Mac M4)
  • Focus: Core hyperparameter exploration for 4-page report


Baseline Results ✓

Experiment: baseline_default

  • Accuracy: 90.77%
  • Precision: 91.76%
  • Recall: 89.58%
  • F1: 90.66%
  • Training Time: 84 minutes
  • Best Epoch: 2 (Val Acc: 91.02%)

Remaining Experiments

Phase 2: Learning Rate Exploration (4 experiments)

Why Important: Learning rate is the MOST critical hyperparameter for transformer fine-tuning.

| Experiment | Learning Rate | Expected Accuracy | Time |
|---|---|---|---|
| lr_1e5 | 1e-5 | ~90.0% (too conservative) | 84 min |
| lr_2e5 | 2e-5 | 90.77% (baseline) | 84 min |
| lr_3e5 | 3e-5 | ~91.5% (likely best) | 84 min |
| lr_5e5 | 5e-5 | ~89.5% (too aggressive) | 84 min |

Total Phase 2 Time: ~5.6 hours

Run Command:

```bash
cd /Users/m2000uk/Desktop/coding/AI
source venv/bin/activate
cd CW2
python3 src/experiment_runner.py --phase 2
```
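
If the sweep ever needs to be generated outside `experiment_runner.py`, it could be sketched like this. The config keys (`name`, `learning_rate`, `batch_size`, `epochs`) are illustrative assumptions, not the runner's actual schema:

```python
# Illustrative Phase 2 sweep: one run per learning rate, with every other
# hyperparameter held at the baseline values. Config keys are assumptions.
LEARNING_RATES = [1e-5, 2e-5, 3e-5, 5e-5]

def phase2_configs(base):
    """Return one experiment config per learning rate."""
    return [dict(base, name=f"lr_{lr:.0e}", learning_rate=lr)
            for lr in LEARNING_RATES]

configs = phase2_configs({"batch_size": 16, "epochs": 3})
print([c["name"] for c in configs])
```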

Phase 3: Batch Size Exploration (3 experiments)

Why Important: Batch size affects training speed, memory usage, and generalization.

| Experiment | Batch Size | Expected Accuracy | Time |
|---|---|---|---|
| batch_8 | 8 | ~90.5% (slower, noisier) | 140 min |
| batch_16 | 16 | 90.77% (baseline) | 84 min |
| batch_32 | 32 | ~91.0% (faster, smoother) | 50 min |

Total Phase 3 Time: ~4.6 hours (274 min)

Run Command:

```bash
python3 src/experiment_runner.py --phase 3
```

Phase 4: Training Duration (3 experiments)

Why Important: Determines optimal training length vs. overfitting.

| Experiment | Epochs | Expected Accuracy | Time |
|---|---|---|---|
| epochs_3 | 3 | 90.77% (baseline) | 84 min |
| epochs_4 | 4 | ~91.2% (optimal) | 112 min |
| epochs_5 | 5 | ~90.8% (overfitting) | 140 min |

Total Phase 4 Time: ~5.6 hours (336 min)

Run Command:

```bash
python3 src/experiment_runner.py --phase 4
```

Total Time Estimate

  • ✓ Phase 1: 84 min (DONE)
  • Phase 2: 336 min (~5.6 hours)
  • Phase 3: 274 min (~4.6 hours)
  • Phase 4: 336 min (~5.6 hours)

Grand Total (remaining phases): ~16 hours (~17 hours including the completed baseline)
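
The grand total is just the sum of the per-experiment runtimes listed in the tables above; a quick sanity check:

```python
# Per-phase runtimes in minutes, taken from the experiment tables above.
PHASE_MINUTES = {1: 84, 2: 4 * 84, 3: 140 + 84 + 50, 4: 84 + 112 + 140}

remaining = sum(m for p, m in PHASE_MINUTES.items() if p != 1)
print(f"Remaining phases: {remaining} min (~{remaining / 60:.1f} h)")
print(f"All phases: {sum(PHASE_MINUTES.values()) / 60:.1f} h")
```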


What Happens After All Experiments Complete

1. You Tell Me "Experiments Done"

I will:

Analyze Results

  • Read all 11 results/*.json files
  • Compare all experiments
  • Find best configuration
  • Calculate improvements over baseline
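
The comparison step above could be sketched as a small loader over the results files. The `{"name": ..., "accuracy": ...}` schema is an assumption; the actual JSON written by `experiment_runner.py` may differ:

```python
import json
from pathlib import Path

def best_experiment(results_dir="results"):
    """Load every results/*.json and return the run with the highest accuracy.

    Assumes each file holds at least {"name": ..., "accuracy": ...}.
    Returns None when no results files exist yet.
    """
    runs = [json.loads(p.read_text()) for p in Path(results_dir).glob("*.json")]
    return max(runs, key=lambda r: r["accuracy"]) if runs else None
```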

Generate Visualizations (programmatically)

  • Figure 1: Learning rate impact on accuracy
  • Figure 2: Batch size comparison
  • Figure 3: Training duration analysis
  • Figure 4: Training curves for best model
  • Figure 5: Confusion matrix for best model
  • Table 1: All 11 experiments comparison
  • Table 2: Best configuration details
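
Table 1 could be assembled from the same JSON results with a minimal formatter like this (the field names are assumptions about the results schema):

```python
def markdown_table(runs, fields=("name", "accuracy", "f1")):
    """Format a list of result dicts as a markdown comparison table.

    Missing fields render as "-". Field names are schema assumptions.
    """
    header = "| " + " | ".join(fields) + " |"
    sep = "|" + "---|" * len(fields)
    rows = ["| " + " | ".join(str(r.get(f, "-")) for f in fields) + " |"
            for r in runs]
    return "\n".join([header, sep] + rows)
```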

Write 4-Page LaTeX Report

  • Abstract
  • Introduction (DistilBERT architecture)
  • Background (Transformer attention mechanism)
  • Methodology (4 phases, 11 experiments)
  • Results (analysis of all phases)
  • Discussion (insights with literature citations)
  • Conclusion (best config, future work)
  • References (10 papers)

Create Overleaf Package

  • Prascevicius_Martynas_DistilBERT.tex
  • COMPXXXX.cls and COMPXXXX.bst
  • references_distilbert.bib
  • All 5 figures as PDFs
  • Ready to upload and compile

2. Demo & Presentation

I will create:

Demo Script (demo_inference.py)

  • Load best trained model
  • Run inference on sample reviews
  • Show predictions and explanations
  • Quick (~2-3 minutes)
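
The demo loop could look roughly like this. `predict` stands in for the real model: any callable mapping text to `(label, score)`, which in `demo_inference.py` would wrap the fine-tuned DistilBERT checkpoint (e.g. via a transformers text-classification pipeline):

```python
def run_demo(predict, reviews):
    """Print a sentiment label and confidence for each sample review.

    `predict` is an assumed interface: text -> (label, score). The real
    demo would plug in the best trained model here.
    """
    results = []
    for text in reviews:
        label, score = predict(text)
        results.append((text, label, score))
        print(f"{label:>8} ({score:.2%})  {text[:60]}")
    return results
```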

Presentation Guide (PRESENTATION_GUIDE.md)

  • 5-minute video recording script
  • Slide-by-slide talking points
  • Key results to highlight
  • Demo execution steps
  • Q&A preparation

Expected Best Configuration

Based on literature and your baseline:

Predicted Optimal Settings:

  • Learning Rate: 3e-5 (Devlin et al. 2018 recommendation)
  • Batch Size: 32 (smoother gradients, faster)
  • Epochs: 4 (balance training vs overfitting)

Expected Accuracy: 91.5-92.0% (~1 percentage point over the 90.77% baseline)


Academic References (10 Papers)

All cited in literature/references_distilbert.bib:

  1. Vaswani et al. (2017) - Attention is All You Need
  2. Devlin et al. (2018) - BERT
  3. Sanh et al. (2019) - DistilBERT
  4. Maas et al. (2011) - IMDB dataset
  5. Howard & Ruder (2018) - ULMFiT
  6. Liu et al. (2019) - RoBERTa
  7. Loshchilov & Hutter (2017) - AdamW
  8. Sun et al. (2019) - BERT fine-tuning
  9. Reimers & Gurevych (2019) - Sentence-BERT
  10. Smith et al. (2017) - Batch size effects

Running All Remaining Experiments at Once

If you want to run all phases overnight:

```bash
cd /Users/m2000uk/Desktop/coding/AI
source venv/bin/activate
cd CW2

# Run phases 2-4 sequentially
python3 src/experiment_runner.py --phase 2
python3 src/experiment_runner.py --phase 3
python3 src/experiment_runner.py --phase 4
```

Or create a script that runs all phases and stops at the first failure:

```bash
cat > run_all.sh <<'EOF'
#!/bin/bash
set -e
python3 src/experiment_runner.py --phase 2
python3 src/experiment_runner.py --phase 3
python3 src/experiment_runner.py --phase 4
EOF
chmod +x run_all.sh
./run_all.sh
```

WARNING: This takes ~16 hours. Run overnight!


Student Information

  • Name: Martynas Prascevicius
  • Student ID: 001263199
  • University: University of Greenwich (2025-26)
  • Deadline: Nov 19, 2025, 5pm (Grace: Nov 21, 2025, 5pm)