coder-msk/Virtual-Vogue---AI-Powered-Virtual-Try-on

Virtual Vogue

Virtual Vogue is an AI-powered virtual try-on platform: given a person photo and a garment image, it synthesizes a realistic preview of the outfit on the user. The stack pairs a Next.js storefront and fitting-room UI with a diffusion-based try-on backend built around Stable Diffusion XL, IP-Adapter, and inpainting, with pose and body cues from OpenPose, human parsing, and DensePose (as used in the bundled Gradio demo).


Demo & training resources

Resource Link
Demo video Google Drive – Virtual Vogue demo
Training code (archive) Google Drive – training materials

Screenshots

Live deployment shown below: thevirtualvogue.vercel.app. All assets live in docs/screenshots/.

What each screen shows

# File Description
1 01-welcome.png Landing page: hero with campaign-style imagery, nav (Home, About, Guide), profile avatar, and the four entry tiles (Mens Tops, Mens Bottoms, Women Tops, Women Bottoms). Edit Catalog is available for admins.
2 02-catalog-female-bottoms.png Category catalog: for example, Female Apparel – Bottoms with Upload Garment (dashed drop zone) and a carousel of preset items (e.g. straight-fit jeans, wide-leg cargo, slouchy pants).
3 03-try-on-review-womens.png Try-on review (women’s flow): side-by-side garment (wide-leg trousers) and person image before generation, with Go Back and the circular generate control.
4 04-generated-outfit-womens.png Result (women’s): Generated Outfit view with zoom hint, short explanation, Download Image, and Back to Home.
5 05-try-on-review-mens.png Try-on review (men’s flow): garment (patterned tee) and person (model in neutral basics) before generation.
6 06-generated-outfit-mens.png Result (men’s): same Generated Outfit layout with the male try-on output, download, and home actions.

Gallery (user journey)

1. Home: Virtual Vogue landing page with category tiles
2. Catalog & upload: female bottoms catalog with upload and carousel
3. Review inputs (women): women try-on review, garment and model side by side
4. Generated outfit (women): women generated outfit result with download
5. Review inputs (men): men try-on review, tee and model
6. Generated outfit (men): men generated outfit result with download


Features

  • Web app (Frontend/): Next.js 15, React 18, Tailwind CSS, Radix UI; Auth0 sign-in; MongoDB for admin-added products; Cloudinary for image uploads; Zustand for try-on flow state (garment/person images, category, denoise steps, seed, etc.).
  • Virtual try-on (Backend/): Custom SDXL inpainting pipeline (src/tryon_pipeline.py) with hacked UNet blocks (src/unet_hacked_tryon.py, src/unet_hacked_garmnet.py, attention/transformer modules) and IP-Adapter code under ip_adapter/.
  • Offline / research: Batch inference on VITON-HD (inference.py) and DressCode (inference_dc.py); fine-tuning via train_xl.py + train_xl.sh; Jupyter workflow in Notebooks/Inference.ipynb.
  • Local UI for the model: Gradio app in Backend/gradio_demo/app.py (loads weights from Hugging Face yisol/IDM-VTON).

Repository layout

Path Purpose
Frontend/ Next.js app: pages, API routes (app/api/), Auth0, MongoDB, Cloudinary
Backend/src/ Try-on pipeline, UNet / attention / transformer implementations
Backend/ip_adapter/ IP-Adapter integration used by training and inference
Backend/inference.py VITON-HD test inference (paired / unpaired); see inference.sh
Backend/inference_dc.py DressCode inference by category (upper_body, lower_body, dresses)
Backend/train_xl.py SDXL-style training script; train_xl.sh example launcher
Backend/gradio_demo/ Gradio demo (app.py), DensePose / parsing / Detectron2 helpers
Backend/preprocess/ Human parsing, OpenPose, and related preprocessing
Backend/util/ Shared image / pipeline utilities
Backend/vitonhd_*_tagged.json Annotation JSON used by the VITON-HD dataset loader in training
Notebooks/Inference.ipynb Notebook for experimentation / inference
Backend/environment.yaml Conda env (Python 3.10, PyTorch 2.0.1, CUDA 11.8)
Backend/requirements.txt Pip dependencies for alternative installs

How the web app talks to the model

  • Try-on generation (Frontend/app/final-image/page.tsx) sends a multipart/form-data POST to NEXT_PUBLIC_API_URL with fields: garment_image, human_image, garment_description, category, denoise_steps, seed, number_of_images. It expects JSON shaped like { "images": ["<base64 png>"] }.
  • Admin product API (e.g. Frontend/admin/components/product-form.tsx) uses NEXT_PUBLIC_BACKENDURL for /api/upload and /api/products (separate backend deployment).
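
To make the try-on request contract concrete, here is a minimal Python sketch that assembles and checks the text fields before they are posted. Only the field names come from the list above; the function name `build_tryon_fields` and the default values are illustrative assumptions and are not taken from the frontend code.

```python
from typing import Dict

# Text fields named in the contract above. The two image parts
# (garment_image, human_image) travel as file parts next to them.
TEXT_FIELDS = (
    "garment_description",
    "category",
    "denoise_steps",
    "seed",
    "number_of_images",
)
FILE_FIELDS = ("garment_image", "human_image")


def build_tryon_fields(
    garment_description: str,
    category: str,
    denoise_steps: int = 30,    # assumed default, not from the repo
    seed: int = 42,             # assumed default, not from the repo
    number_of_images: int = 1,  # assumed default, not from the repo
) -> Dict[str, str]:
    """Return the text fields as strings, the way a multipart form carries them."""
    if denoise_steps < 1:
        raise ValueError("denoise_steps must be at least 1")
    if number_of_images < 1:
        raise ValueError("number_of_images must be at least 1")
    return {
        "garment_description": garment_description,
        "category": category,
        "denoise_steps": str(denoise_steps),
        "seed": str(seed),
        "number_of_images": str(number_of_images),
    }
```

With the `requests` library, these fields could be sent as `requests.post(url, data=fields, files={...})`, where `files` holds open handles for the two entries in `FILE_FIELDS`. The accepted values for `category` depend on the endpoint you deploy.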

This repository does not ship the HTTP server that implements NEXT_PUBLIC_API_URL; you can wrap the Gradio pipeline logic or inference.py in your own API, or run the Gradio demo locally for interactive use.
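
If you write your own API wrapper, the response side of the contract can be sketched as follows. This is a hedged example, not code from the repository: the helper names are hypothetical, and it assumes each entry in `images` is plain base64 without a `data:image/png;base64,` prefix, which you should confirm against Frontend/app/final-image/page.tsx.

```python
import base64
import json
from typing import List


def encode_tryon_response(png_images: List[bytes]) -> str:
    """Serialize PNG bytes into a JSON body shaped like {"images": [...]}."""
    encoded = [base64.b64encode(png).decode("ascii") for png in png_images]
    return json.dumps({"images": encoded})


def decode_tryon_response(body: str) -> List[bytes]:
    """Inverse of encode_tryon_response, handy for testing an endpoint."""
    payload = json.loads(body)
    return [base64.b64decode(item) for item in payload["images"]]
```

A server handler would run the try-on pipeline, save each output image to PNG bytes, and return the string from `encode_tryon_response` with a JSON content type.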


Prerequisites

  • Frontend: Node.js 18.18 or later (the minimum that Next.js 15 supports).
  • Backend: a GPU is strongly recommended (for example, a Kaggle GPU runtime), together with a CUDA-compatible PyTorch build. Conda is the recommended way to install dependencies, using Backend/environment.yaml.

Run the frontend (Next.js)

cd Frontend
npm install

Create a .env.local in Frontend/ with at least:

# Auth0 (Next.js Auth0 SDK)
AUTH0_SECRET=...
AUTH0_BASE_URL=http://localhost:3000
AUTH0_ISSUER_BASE_URL=https://YOUR_DOMAIN.auth0.com
AUTH0_CLIENT_ID=...
AUTH0_CLIENT_SECRET=...

# MongoDB (product DB)
MONGODB_URI=mongodb+srv://...

# Cloudinary (admin image uploads)
CLOUDINARY_CLOUD_NAME=...
CLOUDINARY_API_KEY=...
CLOUDINARY_API_SECRET=...

# Try-on inference API (your deployed endpoint)
NEXT_PUBLIC_API_URL=https://your-inference-api.example.com/tryon

# Optional: separate backend for admin CRUD
NEXT_PUBLIC_BACKENDURL=https://your-backend.example.com

Then start the development server:

npm run dev

Open http://localhost:3000. For a production build, run npm run build and then npm run start.
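
Before starting the dev server, it can help to confirm that .env.local defines every variable listed above. The following is a small sketch under stated assumptions: the helper names are hypothetical, the parser only handles simple `KEY=value` lines, and `NEXT_PUBLIC_BACKENDURL` is left out of the required set because it is marked optional.

```python
from typing import Dict, List

# Keys taken from the .env.local example above (optional key omitted).
REQUIRED_KEYS = [
    "AUTH0_SECRET",
    "AUTH0_BASE_URL",
    "AUTH0_ISSUER_BASE_URL",
    "AUTH0_CLIENT_ID",
    "AUTH0_CLIENT_SECRET",
    "MONGODB_URI",
    "CLOUDINARY_CLOUD_NAME",
    "CLOUDINARY_API_KEY",
    "CLOUDINARY_API_SECRET",
    "NEXT_PUBLIC_API_URL",
]


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text: str) -> List[str]:
    """Return required keys that are absent or empty, in declaration order."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

Calling `missing_keys(open("Frontend/.env.local").read())` from the repo root returns an empty list when the file is complete.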


Run the backend (conda + pip)

From the repo root:

cd Backend
conda env create -f environment.yaml
conda activate virtual-vogue
pip install -r requirements.txt

Additional setup (see also Backend/README.md):

  • IP-Adapter weights: clone h94/IP-Adapter and place assets under Backend/ckpt/ as expected by your training config.
  • DensePose: the Gradio app expects a DensePose checkpoint at Backend/gradio_demo/ckpt/densepose/model_final_162be9.pkl (paths referenced in gradio_demo/app.py). Download the matching Detectron2 DensePose model and config under gradio_demo/configs/ as in the original IDM-VTON setup.
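
A quick pre-flight check for the assets mentioned above might look like this. It is a sketch only: the function name is hypothetical, the paths are the two named in this section, and the check tests existence rather than whether the files are the correct weights (an empty Backend/ckpt/ directory would still pass).

```python
from pathlib import Path
from typing import List

# Paths named in the setup notes above, relative to the repo root.
EXPECTED_ASSETS = [
    Path("Backend/ckpt"),
    Path("Backend/gradio_demo/ckpt/densepose/model_final_162be9.pkl"),
]


def missing_assets(repo_root: Path) -> List[Path]:
    """Return the expected asset paths that do not exist under repo_root."""
    return [rel for rel in EXPECTED_ASSETS if not (repo_root / rel).exists()]
```

Running `missing_assets(Path("."))` from the repo root lists whatever still needs to be downloaded before launching the Gradio demo.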

Gradio demo (interactive try-on)

Run from Backend/gradio_demo so imports and configs resolve:

cd Backend/gradio_demo
python app.py

The first run downloads yisol/IDM-VTON from Hugging Face. Use a CUDA device for practical performance (app.py uses GPU when available).

Batch inference (VITON-HD / DressCode)

Example commands are in Backend/inference.sh. Adjust --data_dir and batch settings to your machine:

cd Backend
# VITON-HD (paired) – example
accelerate launch inference.py \
  --pretrained_model_name_or_path "yisol/IDM-VTON" \
  --width 768 --height 1024 --num_inference_steps 30 \
  --output_dir "result" --data_dir "/path/to/zalando" \
  --seed 42 --test_batch_size 2 --guidance_scale 2.0

DressCode variants use inference_dc.py with --category upper_body|lower_body|dresses.
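
When scripting several runs, the commands above can be assembled programmatically. The sketch below is an illustration, not part of the repository: `build_inference_cmd` is a hypothetical helper, and it assumes inference_dc.py accepts the same flags as the inference.py example plus `--category`, which you should confirm against Backend/inference.sh.

```python
from typing import List, Optional

# Category values listed for inference_dc.py above.
DRESSCODE_CATEGORIES = ("upper_body", "lower_body", "dresses")


def build_inference_cmd(
    data_dir: str,
    category: Optional[str] = None,
    output_dir: str = "result",
) -> List[str]:
    """Build an accelerate command; a category selects the DressCode script."""
    if category is not None and category not in DRESSCODE_CATEGORIES:
        raise ValueError(f"unknown DressCode category: {category!r}")
    script = "inference.py" if category is None else "inference_dc.py"
    cmd = [
        "accelerate", "launch", script,
        "--pretrained_model_name_or_path", "yisol/IDM-VTON",
        "--width", "768", "--height", "1024",
        "--num_inference_steps", "30",
        "--output_dir", output_dir,
        "--data_dir", data_dir,
        "--seed", "42",
        "--test_batch_size", "2",
        "--guidance_scale", "2.0",
    ]
    if category is not None:
        cmd += ["--category", category]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, cwd="Backend", check=True)`. Building a list rather than a shell string avoids quoting problems when the dataset path contains spaces.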

Training

Example (multi-GPU) from train_xl.sh:

cd Backend
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch train_xl.py \
  --gradient_checkpointing --use_8bit_adam \
  --output_dir=result --train_batch_size=6 \
  --data_dir=/path/to/VITON-HD/zalando

Point --data_dir at your prepared VITON-HD layout (including train / test splits and vitonhd_*_tagged.json style annotations as used in train_xl.py).
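
A short sanity check of the dataset directory before launching a long training job could be sketched like this. The helper name is hypothetical, and it assumes the annotation files sit somewhere under `--data_dir`; if train_xl.py reads them from Backend/ instead, adjust the search path accordingly.

```python
from pathlib import Path
from typing import List


def check_vitonhd_layout(data_dir: Path) -> List[str]:
    """Return a list of human-readable problems; empty means the layout looks fine."""
    if not data_dir.is_dir():
        return [f"data_dir does not exist: {data_dir}"]
    problems: List[str] = []
    for split in ("train", "test"):
        if not (data_dir / split).is_dir():
            problems.append(f"missing split directory: {split}/")
    # Search recursively, since the annotation location is an assumption.
    if not any(data_dir.rglob("vitonhd_*_tagged.json")):
        problems.append("no vitonhd_*_tagged.json annotation file found")
    return problems
```

An empty return value only confirms that the directories and at least one annotation file exist; it does not validate the images or the JSON contents.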

Notebook

Open Notebooks/Inference.ipynb in Jupyter or VS Code for an interactive inference workflow.


Tech stack summary

Frontend: Next.js, React, TypeScript, Tailwind CSS, Auth0, MongoDB, Cloudinary, Zustand, Axios.
Backend / ML: PyTorch, Diffusers, Transformers, Accelerate, PEFT, OpenCV, ONNX Runtime, Gradio; Detectron2 / DensePose / parsing pipelines under preprocess/ and gradio_demo/.


License

The root LICENSE is MIT (see the file for the copyright holder). Third-party code under Backend/detectron2 and Backend/preprocess keeps its own licenses, and the upstream model licenses (e.g. IDM-VTON, IP-Adapter) apply to those components.


Acknowledgements

This project builds on public research and codebases including IDM-VTON (yisol/IDM-VTON on Hugging Face), IP-Adapter, DensePose, Detectron2, and related virtual try-on and diffusion work. See Backend/README.md for additional credits.


Authors

  • Muhammad Salman Khan
  • Amna Khawaja
  • Ahsan Zahoor
