Team: Engineers — Đào Nguyên Dương, Nguyễn Xuân Thành (https://github.com/xt2201)
Competition: https://vlsp.org.vn/vlsp2025/eval/tempqa
Paper: Retrieval-Guided Fine-tuning for Vietnamese Event Duration Question Answering (Duong Nguyen Dao, Thanh Xuan Nguyen)
Link paper: https://aclanthology.org/2025.vlsp-1.37/
Presentation: https://www.youtube.com/watch?v=-klcvd7Ak8Q
Please refer to our paper for more details.
Our approach combines retrieval-based fine-tuning of large language models (Qwen2.5 and Qwen3) to enhance temporal reasoning and duration understanding in question answering tasks.
Training is conducted using these scripts, each corresponding to a specific configuration:
train_qwen2.5.py— Fine-tune Qwen2.5, output LoRA weights saved insaved_dir/.train_qwen3_prompt2.py— Fine-tune Qwen3 using an List output format prompt, weights saved insaved_dir_qwen3_prompt2/.train_qwen3_prompt2_aug.py— Fine-tune Qwen3 with cross-lingual data augmentation from UDST-DurationQA, weights saved insaved_dir_qwen3_prompt2_aug/.
Each model is trained independently.
Run retrieval scripts to compute similarity scores:
retrieval.py→ for public testretrieval_private.py→ for private test
These scripts produce:
similarities.npy,similarities_private.npy— Similarity matrices between test and training samples.results_retrieval.json,results_retrieval_private.json— Retrieved few-shot examples.
Perform inference using five configurations:
| Script | Description |
|---|---|
infer_qwen2.5.py |
Inference with the fine-tuned Qwen2.5 model. |
infer_qwen3_prompt2.py |
Inference with the fine-tuned Qwen3 model. |
infer_qwen3_retrieval_prompt2.py |
Inference with fine-tuned Qwen3 and retrieved few-shot samples. |
infer_qwen3_prompt2_aug.py |
Inference with fine-tuned Qwen3 using UDST-DurationQA augmentation. |
infer_qwen3_retrieval_prompt2_aug.py |
Inference with fine-tuned Qwen3 using augmentation + retrieval-based few-shot samples. |
Output files are saved under submit/inference_*.csv.
- Run
submit_*.pyscripts to clean inference results and produce final submission files (results_*.txt). The team only stored results from the public test runs and did not save results from the private test. - To submit on AIHub, zip the final result:
zip -r results.zip results.txt
If you find our work helpful, feel free to give us a cite.
@inproceedings{dao-nguyen-2025-retrieval,
title = "Retrieval-Guided Fine-tuning for {V}ietnamese Event Duration Question Answering",
author = "Dao, Duong Nguyen and
Nguyen, Thanh Xuan",
editor = "Mai, Luong Chi and
Huyen, Nguyen Thi Minh and
Trang, Nguyen Thi Thu",
booktitle = "Proceedings of the 11th International Workshop on Vietnamese Language and Speech Processing",
month = oct,
year = "2025",
address = "Hanoi, Vietnam",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.vlsp-1.37/",
pages = "306--315"
}