Visual Analytics and Transformers for Autonomous Generalized Evaluation
VANTAGE-X is a vehicle damage analysis project that combines fine-tuned YOLOv11-seg detection with optional Qwen2.5-VL reasoning on cropped damage regions. The goal is to provide a practical workflow for detecting visible vehicle damage, segmenting affected regions, and generating concise structured assessments.
VANTAGE-X provides:
- YOLOv11-seg inference for damage detection and instance masks
- Optional Qwen2.5-VL crop-level reasoning for severity, location, and short descriptions
- A CLI for single-image runs, batch processing, evaluation, and the Gradio app
- Data conversion utilities for VIA and COCO workflows
- Training and evaluation scripts for the VehiDE dataset
The current runtime pipeline is:
- Load an image.
- Run YOLOv11-seg to detect damage regions and generate masks.
- Optionally run Qwen2.5-VL on each cropped region.
- Save annotated images plus text, JSON, and Markdown reports.
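The control flow above can be sketched as a small orchestrator. The `Detection` dataclass and the `detect`/`describe` callables below are illustrative stand-ins for the real YOLOv11-seg and Qwen2.5-VL wrappers, not the project's actual API:

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str
    box: tuple            # (x1, y1, x2, y2) crop region
    description: str = ""

def run_pipeline(image, detect, describe=None):
    """Detect damage regions, then optionally describe each cropped region.

    `detect` and `describe` are stand-ins for the YOLO and VLM wrappers."""
    detections = detect(image)
    if describe is not None:
        for det in detections:
            det.description = describe(image, det.box)
    return detections

# Stub detector and VLM just to show the control flow:
fake_detect = lambda img: [Detection("scratch", (10, 10, 50, 50))]
fake_describe = lambda img, box: "minor scratch on panel"

results = run_pipeline("car.jpg", fake_detect, fake_describe)
print(results[0].description)  # minor scratch on panel
```

Skipping the `describe` argument mirrors the `--no-vlm` path: detections come back with empty descriptions.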
```
.
├── app/                # Gradio UI
├── configs/            # Runtime configuration
├── data/
│   ├── data/           # Dataset loader and conversion scripts
│   └── yolo/           # YOLO dataset config template
├── evaluate/           # Evaluation utilities
├── models/             # Detector, VLM, and shared datatypes
├── pipeline/           # End-to-end damage pipeline
├── train/              # Training and COCO->YOLO conversion scripts
├── utils/              # Reporting and visualization helpers
├── main.py             # Main CLI entry point
├── requirements.txt
└── README.md
```
Large local assets are intentionally ignored so the repository stays publishable:
- virtual environments
- raw dataset image folders
- COCO annotation JSON exports
- converted YOLO training data
- model weights and checkpoints
- generated training and evaluation outputs
If you need to reproduce training or inference, prepare the dataset locally, generate the YOLO dataset files, and place model weights on your machine.
```
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

Recommended:
- Python 3.10+
- CUDA-capable GPU for faster inference and training
- extra VRAM if you enable Qwen2.5-VL together with YOLO
Main settings live in configs/config.yaml.
Important values:
- yolo.weights: path to the trained YOLO checkpoint
- yolo.conf_threshold: default inference confidence threshold
- yolo.iou_threshold: detector NMS IoU threshold
- yolo.imgsz: inference image size
- qwen_vlm.run_vlm: enable or disable crop-level VLM analysis
- pipeline.min_mask_fraction: discard tiny masks
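For reference, a config fragment with these keys might look like the following; the threshold and fraction values here are illustrative, not the repository's actual defaults:

```yaml
# Illustrative configs/config.yaml fragment; exact layout and values may differ.
yolo:
  weights: results/training/.../best.pt
  conf_threshold: 0.25      # example value
  iou_threshold: 0.45       # example value
  imgsz: 640
qwen_vlm:
  run_vlm: true
pipeline:
  min_mask_fraction: 0.001  # example value
```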
The default config points to a trained checkpoint under results/training/.../best.pt, but that artifact is ignored by Git. On a fresh clone, you will need to train a model or supply weights locally.
```
python main.py run --image path/to/car.jpg --output results/
python main.py batch --folder path/to/images --output results/batch/
python main.py run --image path/to/car.jpg --no-vlm
python main.py app --port 7860
python main.py evaluate --data-root data/data --split test
```

```python
from pipeline.damage_pipeline import DamagePipeline

pipeline = DamagePipeline(run_vlm=True)
result = pipeline.run("path/to/car.jpg")
print(result.to_report())
result.save_visualisation("output/car_result.jpg")
```

A successful run can generate:
- *_annotated.jpg
- *_report.txt
- *_report.json
- *_report.md
- batch_summary.json for folder runs
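To show the general shape of report generation, here is a minimal sketch of assembling a per-image Markdown report; the field names (label, severity, location) are hypothetical and not necessarily the project's actual schema:

```python
def to_markdown_report(image_name, detections):
    """Assemble a simple Markdown damage report.

    `detections` is a list of dicts with hypothetical keys
    label/severity/location; the real report fields may differ."""
    lines = [f"# Damage report: {image_name}", ""]
    for i, det in enumerate(detections, 1):
        lines.append(f"{i}. **{det['label']}** ({det['severity']}) at {det['location']}")
    return "\n".join(lines)

report = to_markdown_report("car.jpg", [
    {"label": "dent", "severity": "moderate", "location": "front-left door"},
])
print(report.splitlines()[2])  # 1. **dent** (moderate) at front-left door
```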
```
python data/data/convert_via_to_coco.py \
    --via-json data/data/Train_annotations.json \
    --images-dir data/data/train \
    --output data/data/Train_annotations_coco.json
```

If you run the script without arguments, it converts both the default train and test VIA files in data/data/.
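The core of a VIA-to-COCO conversion is mapping each polygon region to a COCO annotation dict. A simplified sketch, assuming VIA's standard polygon shape attributes (all_points_x/all_points_y) and approximating area by the bounding box for brevity:

```python
def via_region_to_coco(region, ann_id, image_id, category_id):
    """Convert one VIA polygon region into a COCO-style annotation dict.

    Simplified: area is approximated by the bounding-box area, whereas a
    full converter would compute the true polygon area."""
    xs = region["shape_attributes"]["all_points_x"]
    ys = region["shape_attributes"]["all_points_y"]
    seg = [c for xy in zip(xs, ys) for c in xy]  # interleave to x1,y1,x2,y2,...
    x, y = min(xs), min(ys)
    w, h = max(xs) - x, max(ys) - y
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "segmentation": [seg],
        "bbox": [x, y, w, h],
        "area": w * h,  # bounding-box area as a stand-in
        "iscrowd": 0,
    }

region = {"shape_attributes": {"name": "polygon",
                               "all_points_x": [10, 60, 60, 10],
                               "all_points_y": [20, 20, 50, 50]}}
ann = via_region_to_coco(region, ann_id=1, image_id=7, category_id=2)
print(ann["bbox"])  # [10, 20, 50, 30]
```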
```
python train/convert_coco_to_yolo.py --data-root data/data --out-dir data/yolo
```

This generates YOLO labels and updates data/yolo/dataset.yaml.
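The per-annotation step of such a conversion is normalizing a COCO polygon into a YOLO-seg label line (class id followed by coordinates scaled to [0, 1]). A minimal sketch, assuming a flat [x1, y1, x2, y2, ...] polygon:

```python
def coco_poly_to_yolo_seg(poly, img_w, img_h, class_id):
    """Turn a flat COCO polygon [x1, y1, x2, y2, ...] into a YOLO-seg
    label line: class id followed by coordinates normalized to [0, 1]."""
    coords = []
    for i in range(0, len(poly), 2):
        coords.append(poly[i] / img_w)       # x normalized by image width
        coords.append(poly[i + 1] / img_h)   # y normalized by image height
    return " ".join([str(class_id)] + [f"{c:.6f}" for c in coords])

line = coco_poly_to_yolo_seg([100, 50, 300, 50, 300, 200],
                             img_w=400, img_h=400, class_id=0)
print(line)  # 0 0.250000 0.125000 0.750000 0.125000 0.750000 0.500000
```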
```
python train/train_yolo.py --epochs 100 --batch 16 --imgsz 640
```

Useful options:
- --resume
- --device cuda:0
- --project results/training
- --name yolo11seg_damage
The evaluation script reports:
- mAP
- per_class_ap
- mean_mask_iou
- detection_accuracy
- num_evaluated
- num_masks_evaluated
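mean_mask_iou averages the per-mask intersection-over-union between predicted and ground-truth masks. A dependency-free sketch of the per-mask computation, with masks simplified to nested lists of 0/1 (the real implementation would operate on array masks):

```python
def mask_iou(pred, gt):
    """IoU between two same-shape binary masks given as nested lists of 0/1."""
    inter = union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    return inter / union if union else 0.0

pred = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]   # 4 predicted pixels
gt   = [[0, 0, 0, 0],
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0]]   # 4 ground-truth pixels, 2 overlapping
print(round(mask_iou(pred, gt), 3))  # 0.333 (2 shared / 6 total pixels)
```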
The detector covers seven damage classes:
- dent
- scratch
- broken glass
- lost parts
- punctured
- torn
- broken lights