- LoRA (Low-Rank Adaptation): Injects trainable low-rank matrices into attention layers.
- BitFit: Only trains the bias terms in transformer layers.
- Prompt Tuning: Uses virtual token embeddings prepended to input sequences.
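To make the LoRA idea concrete, here is a minimal NumPy sketch of a low-rank adapter on a single frozen weight matrix. The shapes mirror a BERT-base attention projection, but the rank `r` and scaling `alpha` are illustrative choices, not our exact training configuration:

```python
import numpy as np

# Minimal LoRA sketch: a frozen weight W is adapted by a low-rank
# update (alpha / r) * B @ A, where only A and B are trainable.
d_out, d_in, r, alpha = 768, 768, 8, 16  # shapes mirror BERT-base attention

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))        # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01     # trainable down-projection
B = np.zeros((d_out, r))                      # trainable up-projection (init 0)

def lora_forward(x):
    # Output is the frozen layer plus the scaled low-rank correction.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B initialized to zero, the adapted layer matches the frozen one.
assert np.allclose(lora_forward(x), W @ x)

full_params = W.size
lora_params = A.size + B.size
print(f"trainable fraction: {lora_params / full_params:.3%}")  # ~2% of W
```

The zero-initialized `B` means training starts exactly at the pretrained model, and the trainable parameter count scales with `r` rather than with the full weight size.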
- `bert-base-uncased`
- `roberta-base`
- `distilbert-base-uncased`
For Task 1, we measure accuracy per epoch, final test accuracy, and final test loss (Binary Cross-Entropy).
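For reference, binary cross-entropy over a batch can be computed as below; this is a plain-Python sketch of the metric, not our training code:

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean BCE over a batch; eps guards against log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clamp predicted probability
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Confident correct predictions give low loss.
print(binary_cross_entropy([1, 0, 1], [0.9, 0.1, 0.8]))  # ≈ 0.145
```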
For Task 2, we measure validation accuracy over epochs and compare our results with existing models.
| Model Variant | Sharma et al. (2019) | Ours |
|---|---|---|
| LR with Trigrams | 80.8 | — |
| SVM with Trigrams | 80.9 | — |
| Random Forest | 75.7 | — |
| Gradient Boosting | 75.0 | — |
| CBOW | 83.4 | — |
| LSTM + Attention | 81.8 | — |
| BiLSTM + Attention | 82.3 | — |
| BERT + LoRA | — | 84.52 |
| BERT + BitFit | — | 83.63 |
| BERT + Prompt | — | 72.68 |
| RoBERTa + LoRA | — | 86.80 |
| RoBERTa + BitFit | — | 85.40 |
| RoBERTa + Prompt | — | 72.26 |
| DistilBERT + LoRA | — | 84.14 |
| DistilBERT + BitFit | — | 82.90 |
| DistilBERT + Prompt | — | 77.89 |
For Task 3, we report only F1 scores on the validation set, to illustrate how well each model predicts the answer span.
| Model | Fine-Tuning | Best Result (F1-score) |
|---|---|---|
| BERT | LoRA | 0.7209 |
| BERT | BitFit | 0.6417 |
| BERT | Prompt Tuning | 0.0388 |
| RoBERTa | LoRA | 0.8346 |
| RoBERTa | BitFit | 0.7970 |
| RoBERTa | Prompt Tuning | 0.0180 |
| DistilBERT | LoRA | 0.7128 |
| DistilBERT | BitFit | 0.5452 |
| DistilBERT | Prompt Tuning | 0.0186 |
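The F1 score for span prediction is conventionally the token-overlap F1 between the predicted and gold answer texts, as in the SQuAD evaluation. The sketch below shows the core computation; note the official SQuAD script also normalizes punctuation and articles, which this simplified version omits:

```python
from collections import Counter

def span_f1(prediction, ground_truth):
    """Token-overlap F1 between predicted and gold answer spans (SQuAD-style)."""
    pred_tokens = prediction.lower().split()
    gold_tokens = ground_truth.lower().split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(span_f1("the eiffel tower", "eiffel tower"))  # 0.8
```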
- Datasets used:
- Experiments were conducted on Kaggle. To test our notebooks, please import them into Kaggle.





