Virtual Vogue is an AI-powered virtual try-on platform: given a person photo and a garment image, it synthesizes a realistic preview of the outfit on the user. The stack pairs a Next.js storefront and fitting-room UI with a diffusion-based try-on backend built around Stable Diffusion XL, IP-Adapter, and inpainting, with pose and body cues from OpenPose, human parsing, and DensePose (as used in the bundled Gradio demo).
| Resource | Link |
|---|---|
| Demo video | Google Drive – Virtual Vogue demo |
| Training code (archive) | Google Drive – training materials |
The screenshots below are from the live deployment at thevirtualvogue.vercel.app. All assets live in docs/screenshots/.
| # | File | Description |
|---|---|---|
| 1 | `01-welcome.png` | Landing page: hero with campaign-style imagery, nav (Home, About, Guide), profile avatar, and four entry tiles (Mens Tops, Mens Bottoms, Women Tops, Women Bottoms). Edit Catalog is available for admins. |
| 2 | `02-catalog-female-bottoms.png` | Category catalog, for example Female Apparel – Bottoms, with Upload Garment (dashed drop zone) and a carousel of preset items (e.g. straight-fit jeans, wide-leg cargo, slouchy pants). |
| 3 | `03-try-on-review-womens.png` | Try-on review (women's flow): side-by-side garment (wide-leg trousers) and person image before generation; Go Back and the circular generate control. |
| 4 | `04-generated-outfit-womens.png` | Result (women's): Generated Outfit view with zoom hint, short explanation, Download Image, and Back to Home. |
| 5 | `05-try-on-review-mens.png` | Try-on review (men's flow): garment (patterned tee) and person (model in neutral basics) before generation. |
| 6 | `06-generated-outfit-mens.png` | Result (men's): same Generated Outfit layout with the male try-on output, download, and home actions. |

[Screenshot captions: 2, Catalog & upload · 3, Review inputs (women) · 4, Generated outfit (women) · 5, Review inputs (men)]

- Web app (`Frontend/`): Next.js 15, React 18, Tailwind CSS, Radix UI; Auth0 sign-in; MongoDB for admin-added products; Cloudinary for image uploads; Zustand for try-on flow state (garment/person images, category, denoise steps, seed, etc.).
- Virtual try-on (`Backend/`): Custom SDXL inpainting pipeline (`src/tryon_pipeline.py`) with hacked UNet blocks (`src/unet_hacked_tryon.py`, `src/unet_hacked_garmnet.py`, attention/transformer modules) and IP-Adapter code under `ip_adapter/`.
- Offline / research: Batch inference on VITON-HD (`inference.py`) and DressCode (`inference_dc.py`); fine-tuning via `train_xl.py` + `train_xl.sh`; Jupyter workflow in `Notebooks/Inference.ipynb`.
- Local UI for the model: Gradio app in `Backend/gradio_demo/app.py` (loads weights from Hugging Face `yisol/IDM-VTON`).

| Path | Purpose |
|---|---|
| `Frontend/` | Next.js app: pages, API routes (`app/api/`), Auth0, MongoDB, Cloudinary |
| `Backend/src/` | Try-on pipeline, UNet / attention / transformer implementations |
| `Backend/ip_adapter/` | IP-Adapter integration used by training and inference |
| `Backend/inference.py` | VITON-HD test inference (paired / unpaired); see `inference.sh` |
| `Backend/inference_dc.py` | DressCode inference by category (`upper_body`, `lower_body`, `dresses`) |
| `Backend/train_xl.py` | SDXL-style training script; `train_xl.sh` example launcher |
| `Backend/gradio_demo/` | Gradio demo (`app.py`), DensePose / parsing / Detectron2 helpers |
| `Backend/preprocess/` | Human parsing, OpenPose, and related preprocessing |
| `Backend/util/` | Shared image / pipeline utilities |
| `Backend/vitonhd_*_tagged.json` | Annotation JSON used by the VITON-HD dataset loader in training |
| `Notebooks/Inference.ipynb` | Notebook for experimentation / inference |
| `Backend/environment.yaml` | Conda env (Python 3.10, PyTorch 2.0.1, CUDA 11.8) |
| `Backend/requirements.txt` | Pip dependencies for alternative installs |

- Try-on generation (`Frontend/app/final-image/page.tsx`) sends a `multipart/form-data` `POST` to `NEXT_PUBLIC_API_URL` with fields: `garment_image`, `human_image`, `garment_description`, `category`, `denoise_steps`, `seed`, `number_of_images`. It expects JSON shaped like `{ "images": ["<base64 png>"] }`.
- Admin product API (e.g. `Frontend/admin/components/product-form.tsx`) uses `NEXT_PUBLIC_BACKENDURL` for `/api/upload` and `/api/products` (separate backend deployment).

This repository does not ship the HTTP server that implements NEXT_PUBLIC_API_URL; you can wrap the Gradio pipeline logic or inference.py in your own API, or run the Gradio demo locally for interactive use.
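
If you do write your own server for `NEXT_PUBLIC_API_URL`, the request and response shapes described above can be sketched in Python as follows. This is a minimal sketch, not the project's implementation: the function names are hypothetical, and it assumes the `images` entries are plain base64 strings without a `data:` URI prefix. Only the field names and the `{ "images": [...] }` shape come from the frontend contract.

```python
import base64
import json

# Field names taken from the frontend request described above.
FORM_FIELDS = (
    "garment_image", "human_image", "garment_description",
    "category", "denoise_steps", "seed", "number_of_images",
)


def missing_fields(form: dict) -> list[str]:
    """Return the expected form fields that are absent from a parsed request."""
    return [name for name in FORM_FIELDS if name not in form]


def encode_response(png_images: list[bytes]) -> str:
    """Serialize generated PNG bytes into the JSON shape the frontend expects.

    Assumption: each entry is plain base64 with no "data:image/png" prefix.
    """
    encoded = [base64.b64encode(png).decode("ascii") for png in png_images]
    return json.dumps({"images": encoded})


def decode_response(body: str) -> list[bytes]:
    """Inverse of encode_response, useful in a test client."""
    return [base64.b64decode(item) for item in json.loads(body)["images"]]
```

A server handler would call `missing_fields` on the parsed multipart form, run the try-on pipeline, and return `encode_response(...)` as the response body.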
- Frontend: Node.js 18+ (recommended for Next.js 15).
- Backend: Kaggle GPU strongly recommended; CUDA-compatible PyTorch. Conda is recommended, using `Backend/environment.yaml`.

```bash
cd Frontend
npm install
```

Create a `.env.local` in `Frontend/` with at least:

```env
# Auth0 (Next.js Auth0 SDK)
AUTH0_SECRET=...
AUTH0_BASE_URL=http://localhost:3000
AUTH0_ISSUER_BASE_URL=https://YOUR_DOMAIN.auth0.com
AUTH0_CLIENT_ID=...
AUTH0_CLIENT_SECRET=...
# MongoDB (product DB)
MONGODB_URI=mongodb+srv://...
# Cloudinary (admin image uploads)
CLOUDINARY_CLOUD_NAME=...
CLOUDINARY_API_KEY=...
CLOUDINARY_API_SECRET=...
# Try-on inference API (your deployed endpoint)
NEXT_PUBLIC_API_URL=https://your-inference-api.example.com/tryon
# Optional: separate backend for admin CRUD
NEXT_PUBLIC_BACKENDURL=https://your-backend.example.com
```

```bash
npm run dev
```

Open http://localhost:3000. Production: `npm run build` then `npm run start`.

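
Before starting the dev server, it can help to confirm that every variable listed for `.env.local` has a value. The helper below is a hypothetical sketch and not part of the repository. It assumes simple `KEY=VALUE` lines (no quoting, no `export` prefix) and treats `NEXT_PUBLIC_BACKENDURL` as optional, as the comment in the env file suggests.

```python
# Keys listed in the .env.local example; NEXT_PUBLIC_BACKENDURL is optional.
REQUIRED_KEYS = (
    "AUTH0_SECRET", "AUTH0_BASE_URL", "AUTH0_ISSUER_BASE_URL",
    "AUTH0_CLIENT_ID", "AUTH0_CLIENT_SECRET",
    "MONGODB_URI",
    "CLOUDINARY_CLOUD_NAME", "CLOUDINARY_API_KEY", "CLOUDINARY_API_SECRET",
    "NEXT_PUBLIC_API_URL",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")  # split on the first "=" only
        values[key.strip()] = value.strip()
    return values


def missing_keys(text: str) -> list[str]:
    """Return required keys that are absent or set to an empty value."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

For example, `missing_keys(open("Frontend/.env.local").read())` returns an empty list when all required keys are set.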
From the repo root:
```bash
cd Backend
conda env create -f environment.yaml
conda activate virtual-vogue
pip install -r requirements.txt
```

Additional setup (see also `Backend/README.md`):

- IP-Adapter weights: clone h94/IP-Adapter and place assets under `Backend/ckpt/` as expected by your training config.
- DensePose: the Gradio app expects a DensePose checkpoint at `Backend/gradio_demo/ckpt/densepose/model_final_162be9.pkl` (paths referenced in `gradio_demo/app.py`). Download the matching Detectron2 DensePose model and config under `gradio_demo/configs/` as in the original IDM-VTON setup.

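
A quick pre-flight check for the assets mentioned above might look like the sketch below. The helper name is hypothetical. It only verifies the two locations named in this section (the `ckpt/` directory and the DensePose checkpoint), because the exact IP-Adapter file names depend on your training config.

```python
from pathlib import Path

# Relative to Backend/; this is the checkpoint path named in the setup notes.
DENSEPOSE_CKPT = Path("gradio_demo/ckpt/densepose/model_final_162be9.pkl")


def missing_assets(backend_root: Path) -> list[Path]:
    """Return the required files or directories that are not present."""
    required = [
        backend_root / "ckpt",          # IP-Adapter assets directory
        backend_root / DENSEPOSE_CKPT,  # DensePose checkpoint for the Gradio app
    ]
    return [path for path in required if not path.exists()]


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for path in missing_assets(Path("Backend")):
        print(f"missing: {path}")
```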
Run from Backend/gradio_demo so imports and configs resolve:
```bash
cd Backend/gradio_demo
python app.py
```

The first run downloads `yisol/IDM-VTON` from Hugging Face. Use a CUDA device for practical performance (`app.py` uses GPU when available).

Example commands are in Backend/inference.sh. Adjust --data_dir and batch settings to your machine:
```bash
cd Backend
# VITON-HD (paired) – example
accelerate launch inference.py \
    --pretrained_model_name_or_path "yisol/IDM-VTON" \
    --width 768 --height 1024 --num_inference_steps 30 \
    --output_dir "result" --data_dir "/path/to/zalando" \
    --seed 42 --test_batch_size 2 --guidance_scale 2.0
```

DressCode variants use `inference_dc.py` with `--category upper_body|lower_body|dresses`.

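
To run all three DressCode categories in turn, the command line can be assembled programmatically. The sketch below is illustrative only: `dresscode_command` is a hypothetical helper, and it assumes `inference_dc.py` accepts the same `--pretrained_model_name_or_path`, `--output_dir`, and `--data_dir` flags as `inference.py`. Check `Backend/inference.sh` for the exact arguments before relying on it.

```python
import shlex

# Category values listed for inference_dc.py.
CATEGORIES = ("upper_body", "lower_body", "dresses")


def dresscode_command(category: str, data_dir: str, output_dir: str = "result") -> list[str]:
    """Build an argv list for one DressCode category (flags are assumptions)."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    return [
        "accelerate", "launch", "inference_dc.py",
        "--pretrained_model_name_or_path", "yisol/IDM-VTON",
        "--output_dir", output_dir,
        "--data_dir", data_dir,
        "--category", category,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for name in CATEGORIES:
        print(shlex.join(dresscode_command(name, "/path/to/DressCode")))
```

Passing the resulting list to `subprocess.run(..., check=True)` from the `Backend/` directory would execute each run.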
Example (multi-GPU) from train_xl.sh:
```bash
cd Backend
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch train_xl.py \
    --gradient_checkpointing --use_8bit_adam \
    --output_dir=result --train_batch_size=6 \
    --data_dir=/path/to/VITON-HD/zalando
```

Point `--data_dir` at your prepared VITON-HD layout (including train / test splits and `vitonhd_*_tagged.json` style annotations as used in `train_xl.py`).

Open Notebooks/Inference.ipynb in Jupyter or VS Code for an interactive inference workflow.
Frontend: Next.js, React, TypeScript, Tailwind CSS, Auth0, MongoDB, Cloudinary, Zustand, Axios.
Backend / ML: PyTorch, Diffusers, Transformers, Accelerate, PEFT, OpenCV, ONNX Runtime, Gradio; Detectron2 / DensePose / parsing pipelines under preprocess/ and gradio_demo/.
Root LICENSE is MIT (see file for copyright holder). Third-party code under Backend/detectron2, Backend/preprocess, and upstream model licenses (e.g. IDM-VTON, IP-Adapter) apply to those components.
This project builds on public research and codebases including IDM-VTON (yisol/IDM-VTON on Hugging Face), IP-Adapter, DensePose, Detectron2, and related virtual try-on and diffusion work. See Backend/README.md for additional credits.
- Muhammad Salman Khan
- Amna Khawaja
- Ahsan Zahoor

