
Option B: Focused DistilBERT Exploration

  • Student: Martynas Prascevicius
  • Student ID: 001263199
  • Course: COMP1818 Artificial Intelligence Applications


Summary

  • Total Experiments: 11
  • Training Time: ~16 hours total (Mac M4)
  • Focus: Core hyperparameter exploration for 4-page report


Baseline Results ✓

Experiment: baseline_default

  • Accuracy: 90.77%
  • Precision: 91.76%
  • Recall: 89.58%
  • F1: 90.66%
  • Training Time: 84 minutes
  • Best Epoch: 2 (Val Acc: 91.02%)

Remaining Experiments

Phase 2: Learning Rate Exploration (4 experiments)

Why Important: Learning rate is the MOST critical hyperparameter for transformer fine-tuning.

| Experiment | Learning Rate | Expected Accuracy | Time |
|---|---|---|---|
| lr_1e5 | 1e-5 | ~90.0% (too conservative) | 84 min |
| lr_2e5 | 2e-5 | 90.77% (baseline) | 84 min |
| lr_3e5 | 3e-5 | ~91.5% (likely best) | 84 min |
| lr_5e5 | 5e-5 | ~89.5% (too aggressive) | 84 min |

Total Phase 2 Time: ~5.6 hours

Run Command:

```bash
cd /Users/m2000uk/Desktop/coding/AI
source venv/bin/activate
cd CW2
python3 src/experiment_runner.py --phase 2
```
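
If the sweep ever needs to be generated outside `experiment_runner.py`, it could be sketched like this. The config keys (`name`, `learning_rate`, `batch_size`, `epochs`) are illustrative assumptions, not the runner's actual schema:

```python
# Illustrative Phase 2 sweep: one run per learning rate, with every other
# hyperparameter held at the baseline values. Config keys are assumptions.
LEARNING_RATES = [1e-5, 2e-5, 3e-5, 5e-5]

def phase2_configs(base):
    """Return one experiment config per learning rate."""
    return [dict(base, name=f"lr_{lr:.0e}", learning_rate=lr)
            for lr in LEARNING_RATES]

configs = phase2_configs({"batch_size": 16, "epochs": 3})
print([c["name"] for c in configs])
```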

Phase 3: Batch Size Exploration (3 experiments)

Why Important: Batch size affects training speed, memory usage, and generalization.

| Experiment | Batch Size | Expected Accuracy | Time |
|---|---|---|---|
| batch_8 | 8 | ~90.5% (slower, noisier) | 140 min |
| batch_16 | 16 | 90.77% (baseline) | 84 min |
| batch_32 | 32 | ~91.0% (faster, smoother) | 50 min |

Total Phase 3 Time: ~4.6 hours (274 min)

Run Command:

```bash
python3 src/experiment_runner.py --phase 3
```

Phase 4: Training Duration (3 experiments)

Why Important: Determines optimal training length vs. overfitting.

| Experiment | Epochs | Expected Accuracy | Time |
|---|---|---|---|
| epochs_3 | 3 | 90.77% (baseline) | 84 min |
| epochs_4 | 4 | ~91.2% (optimal) | 112 min |
| epochs_5 | 5 | ~90.8% (overfitting) | 140 min |

Total Phase 4 Time: ~5.6 hours (336 min)

Run Command:

```bash
python3 src/experiment_runner.py --phase 4
```

Total Time Estimate

  • ✓ Phase 1: 84 min (DONE)
  • Phase 2: 336 min (~5.6 hours)
  • Phase 3: 274 min (~4.6 hours)
  • Phase 4: 336 min (~5.6 hours)

Grand Total (remaining phases): ~16 hours (~17 hours including the completed baseline)
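
The grand total is just the sum of the per-experiment runtimes listed in the tables above; a quick sanity check:

```python
# Per-phase runtimes in minutes, taken from the experiment tables above.
PHASE_MINUTES = {1: 84, 2: 4 * 84, 3: 140 + 84 + 50, 4: 84 + 112 + 140}

remaining = sum(m for p, m in PHASE_MINUTES.items() if p != 1)
print(f"Remaining phases: {remaining} min (~{remaining / 60:.1f} h)")
print(f"All phases: {sum(PHASE_MINUTES.values()) / 60:.1f} h")
```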


What Happens After All Experiments Complete

1. You Tell Me "Experiments Done"

I will:

Analyze Results

  • Read all 11 results/*.json files
  • Compare all experiments
  • Find best configuration
  • Calculate improvements over baseline
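
The comparison step above could be sketched as a small loader over the results files. The `{"name": ..., "accuracy": ...}` schema is an assumption; the actual JSON written by `experiment_runner.py` may differ:

```python
import json
from pathlib import Path

def best_experiment(results_dir="results"):
    """Load every results/*.json and return the run with the highest accuracy.

    Assumes each file holds at least {"name": ..., "accuracy": ...}.
    Returns None when no results files exist yet.
    """
    runs = [json.loads(p.read_text()) for p in Path(results_dir).glob("*.json")]
    return max(runs, key=lambda r: r["accuracy"]) if runs else None
```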

Generate Visualizations (programmatically)

  • Figure 1: Learning rate impact on accuracy
  • Figure 2: Batch size comparison
  • Figure 3: Training duration analysis
  • Figure 4: Training curves for best model
  • Figure 5: Confusion matrix for best model
  • Table 1: All 11 experiments comparison
  • Table 2: Best configuration details
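
Table 1 could be assembled from the same JSON results with a minimal formatter like this (the field names are assumptions about the results schema):

```python
def markdown_table(runs, fields=("name", "accuracy", "f1")):
    """Format a list of result dicts as a markdown comparison table.

    Missing fields render as "-". Field names are schema assumptions.
    """
    header = "| " + " | ".join(fields) + " |"
    sep = "|" + "---|" * len(fields)
    rows = ["| " + " | ".join(str(r.get(f, "-")) for f in fields) + " |"
            for r in runs]
    return "\n".join([header, sep] + rows)
```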

Write 4-Page LaTeX Report

  • Abstract
  • Introduction (DistilBERT architecture)
  • Background (Transformer attention mechanism)
  • Methodology (4 phases, 11 experiments)
  • Results (analysis of all phases)
  • Discussion (insights with literature citations)
  • Conclusion (best config, future work)
  • References (10 papers)

Create Overleaf Package

  • Prascevicius_Martynas_DistilBERT.tex
  • COMPXXXX.cls and COMPXXXX.bst
  • references_distilbert.bib
  • All 5 figures as PDFs
  • Ready to upload and compile

2. Demo & Presentation

I will create:

Demo Script (demo_inference.py)

  • Load best trained model
  • Run inference on sample reviews
  • Show predictions and explanations
  • Quick (~2-3 minutes)
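
The demo loop could look roughly like this. `predict` stands in for the real model: any callable mapping text to `(label, score)`, which in `demo_inference.py` would wrap the fine-tuned DistilBERT checkpoint (e.g. via a transformers text-classification pipeline):

```python
def run_demo(predict, reviews):
    """Print a sentiment label and confidence for each sample review.

    `predict` is an assumed interface: text -> (label, score). The real
    demo would plug in the best trained model here.
    """
    results = []
    for text in reviews:
        label, score = predict(text)
        results.append((text, label, score))
        print(f"{label:>8} ({score:.2%})  {text[:60]}")
    return results
```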

Presentation Guide (PRESENTATION_GUIDE.md)

  • 5-minute video recording script
  • Slide-by-slide talking points
  • Key results to highlight
  • Demo execution steps
  • Q&A preparation

Expected Best Configuration

Based on literature and your baseline:

Predicted Optimal Settings:

  • Learning Rate: 3e-5 (Devlin et al. 2018 recommendation)
  • Batch Size: 32 (smoother gradients, faster)
  • Epochs: 4 (balance training vs overfitting)

Expected Accuracy: 91.5-92.0% (~1 percentage point over the 90.77% baseline)


Academic References (10 Papers)

All cited in literature/references_distilbert.bib:

  1. Vaswani et al. (2017) - Attention is All You Need
  2. Devlin et al. (2018) - BERT
  3. Sanh et al. (2019) - DistilBERT
  4. Maas et al. (2011) - IMDB dataset
  5. Howard & Ruder (2018) - ULMFiT
  6. Liu et al. (2019) - RoBERTa
  7. Loshchilov & Hutter (2017) - AdamW
  8. Sun et al. (2019) - BERT fine-tuning
  9. Reimers & Gurevych (2019) - Sentence-BERT
  10. Smith et al. (2017) - Batch size effects

Running All Remaining Experiments at Once

If you want to run all phases overnight:

```bash
cd /Users/m2000uk/Desktop/coding/AI
source venv/bin/activate
cd CW2

# Run phases 2-4 sequentially
python3 src/experiment_runner.py --phase 2
python3 src/experiment_runner.py --phase 3
python3 src/experiment_runner.py --phase 4
```

Or create a script that runs all phases and stops at the first failure:

```bash
cat > run_all.sh <<'EOF'
#!/bin/bash
set -e
python3 src/experiment_runner.py --phase 2
python3 src/experiment_runner.py --phase 3
python3 src/experiment_runner.py --phase 4
EOF
chmod +x run_all.sh
./run_all.sh
```

WARNING: This takes ~16 hours. Run overnight!


Student Information

  • Name: Martynas Prascevicius
  • Student ID: 001263199
  • University: University of Greenwich (2025-26)
  • Deadline: Nov 19, 2025, 5pm (Grace: Nov 21, 2025, 5pm)