AFDiT: flow-guided transformer diffusion for structure-aware virtual try-on

Abstract

Virtual try-on aims to generate photorealistic images of a person wearing a target garment, requiring precise garment–body alignment, fine-grained texture preservation, and robustness to pose variations. We propose AFDiT, a structure-aware virtual try-on framework that integrates appearance flow estimation with transformer-based diffusion generation. Specifically, we design an appearance flow encoder–decoder to predict dense garment deformation and alignment, and a parsing-guided mask fusion strategy to refine inpainting regions while preserving uncovered body parts. For high-quality synthesis, we introduce a warped garment-guided stable diffusion pipeline, injecting garment-specific CLIP embeddings and flow-guided warped garment features into a transformer-based diffusion model. Extensive experiments on VITON-HD demonstrate that AFDiT outperforms state-of-the-art methods in both paired and unpaired settings, achieving superior alignment, realism, and generalization to diverse poses and garments.
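To make two of the ideas in the abstract concrete, here is a minimal sketch of dense flow warping and parsing-guided mask fusion. It is an illustration under stated assumptions, not the AFDiT implementation: the function names `warp_with_flow` and `fuse_inpaint_mask`, the backward-warping convention, the single-channel nested-list image format, and the parsing label ids are all hypothetical. A real pipeline would operate on batched tensors.

```python
def warp_with_flow(image, flow):
    """Backward-warp a single-channel image with a dense flow field.

    image: H x W nested list of floats.
    flow:  H x W nested list of (dx, dy) pixel offsets; output pixel (x, y)
           samples the source at (x + dx, y + dy) with bilinear interpolation.
    This convention is an assumption made for the sketch.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            # Clamp the sampling position to the image border.
            sx = min(max(x + dx, 0.0), w - 1.0)
            sy = min(max(y + dy, 0.0), h - 1.0)
            x0, y0 = int(sx), int(sy)
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            ax, ay = sx - x0, sy - y0
            # Bilinear blend of the four neighbouring source pixels.
            top = image[y0][x0] * (1 - ax) + image[y0][x1] * ax
            bottom = image[y1][x0] * (1 - ax) + image[y1][x1] * ax
            out[y][x] = top * (1 - ay) + bottom * ay
    return out


def fuse_inpaint_mask(coarse_mask, parsing, keep_labels):
    """Remove preserved body-part pixels from a coarse inpainting mask.

    coarse_mask: H x W nested list of 0/1 (1 = region to repaint).
    parsing:     H x W nested list of integer human-parsing labels.
    keep_labels: set of labels to leave untouched (hypothetical ids).
    """
    return [
        [1 if m and p not in keep_labels else 0 for m, p in zip(mask_row, parse_row)]
        for mask_row, parse_row in zip(coarse_mask, parsing)
    ]
```

In this sketch the flow moves garment pixels toward the body, while the fused mask keeps the repainted region away from pixels whose parsing label marks an uncovered body part, so those pixels are copied from the input image rather than regenerated.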

Environment

conda create -n afdit python=3.12 -y
conda activate afdit
pip install -r requirements.txt  # for CUDA 12.4
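
After installation, a quick sanity check can confirm that the active interpreter matches the version used above and that key packages are importable. The snippet below is a hedged sketch: `check_environment` is a hypothetical helper, and the assumption that `torch` is among the packages listed in requirements.txt is ours, so adjust the package names to match the actual file.

```python
import importlib.util
import sys


def check_environment(expected_python=(3, 12), packages=("torch",)):
    """Report whether the interpreter version and required packages look right.

    expected_python: (major, minor) matching the conda command above.
    packages:        top-level import names to look for (assumed, not taken
                     from requirements.txt).
    """
    # find_spec returns None when a top-level package cannot be located.
    missing = [name for name in packages if importlib.util.find_spec(name) is None]
    return {
        "python_ok": sys.version_info[:2] == tuple(expected_python),
        "python_version": "{}.{}".format(*sys.version_info[:2]),
        "missing": missing,
    }
```

Calling `check_environment()` inside the `afdit` environment returns a small dictionary; an empty `missing` list and `python_ok` set to `True` suggest the setup succeeded.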

Training

Before training, edit the shell scripts (*.sh), e.g., train_aflow_enc_dec.sh and train_viton.sh, to set hyperparameters and dataset paths.

bash train_aflow_enc_dec.sh
bash train_viton.sh
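
If you prefer to launch both stages from Python, the following sketch runs the two scripts in the order listed above and stops at the first failure. `run_stages` is a hypothetical helper, and running the appearance flow stage before the try-on stage simply mirrors the order of the commands; whether the second stage strictly depends on the first is an assumption.

```python
import subprocess

# Script names as listed in this README, in the order shown above.
STAGES = ["train_aflow_enc_dec.sh", "train_viton.sh"]


def run_stages(stages=STAGES, repo_root=".", runner=subprocess.run):
    """Run each training script with bash, aborting on the first error.

    repo_root: directory containing the scripts (used as working directory).
    runner:    callable with the subprocess.run interface; injectable so the
               function can be exercised without starting real training.
    """
    completed = []
    for script in stages:
        # check=True makes subprocess.run raise CalledProcessError on failure.
        runner(["bash", script], cwd=repo_root, check=True)
        completed.append(script)
    return completed
```

Calling `run_stages(repo_root="/path/to/AFDiT")` with the default runner starts real training, so double-check the parameters in the shell scripts first.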

Acknowledgement

Our code is based on the implementation of FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on [Paper].

Citation

Our paper is now available online. If you find it helpful for your research, please consider citing it.

@article{Ding2025afdit,
  author    = {Huiming Ding and Beining Wu and Mengtian Li and Zhifeng Xie},
  title     = {AFDiT: flow-guided transformer diffusion for structure-aware virtual try-on},
  journal   = {The Visual Computer},
  year      = {2025},
  month     = {jun},
  day       = {29},
  eissn     = {1432-2315},
  issn      = {0178-2789},
  doi       = {10.1007/s00371-025-04075-5},
  url       = {https://doi.org/10.1007/s00371-025-04075-5},
}

