Generating Fit Check Videos with a Handheld Camera

University of Washington

Given two static mirror selfies (front and back) and motion data captured with a handheld mobile device, we synthesize a full-body video with a new scene background and consistent lighting. Our method introduces (1) a parameter-free frame generation strategy for video diffusion models, (2) a multi-reference attention mechanism to integrate appearance from both front and back photos, and (3) an image-based fine-tuning strategy to enhance sharpness and improve shadow/reflection rendering.
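
The core of contribution (2) can be pictured as cross-attention over a concatenated reference context. The sketch below is a minimal, hedged illustration of that idea (module and tensor names are ours, not from the released code); the actual model integrates this inside an SVD-style video diffusion UNet.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiReferenceAttention(nn.Module):
    # Illustrative sketch (not the released implementation): video latents
    # attend to keys/values from BOTH reference images by concatenating the
    # two reference token streams along the sequence axis.
    def __init__(self, dim: int):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)

    def forward(self, latent, ref_front, ref_back):
        # latent: (B, N, C) video tokens; ref_*: (B, M, C) reference tokens
        q = self.to_q(latent)
        ctx = torch.cat([ref_front, ref_back], dim=1)  # (B, 2M, C)
        k, v = self.to_k(ctx), self.to_v(ctx)
        return F.scaled_dot_product_attention(q, k, v)

Concatenating along the token axis lets each video latent pull appearance cues from whichever view (front or back) is relevant for a given pose.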

Installation

The code runs under Python 3.11, PyTorch 2.3.1, and CUDA 11.8.

We use uv for dependency management. Install uv if you don't have it:

pip install uv

Then install all dependencies with one command:

uv sync

Activate the environment:

source .venv/bin/activate
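
Optionally, a quick check (a generic snippet, not part of the repo) confirms the versions listed above:

import torch

# Expect a PyTorch 2.3.1 build against CUDA 11.8 and a visible GPU.
print(torch.__version__)          # e.g. 2.3.1+cu118
print(torch.version.cuda)         # e.g. 11.8
print(torch.cuda.is_available())  # should be True for GPU inference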

Download Pretrained Models

Download the SVD base model:

git lfs install
git clone https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt-1-1

Download our model checkpoint and DWPose checkpoints into ckpt/:

huggingface-cli download boweiche/fit-check-videogen checkpoint-220000.pth --local-dir ckpt
huggingface-cli download boweiche/fit-check-videogen DWPose/yolox_l.onnx DWPose/dw-ll_ucoco_384.onnx --local-dir ckpt
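
Equivalently, if you prefer Python over the CLI, huggingface_hub can fetch the same files (same repo and filenames as the commands above):

from huggingface_hub import hf_hub_download

# Mirrors the huggingface-cli commands above: fetch the model checkpoint
# and the two DWPose ONNX files into ckpt/.
for filename in [
    "checkpoint-220000.pth",
    "DWPose/yolox_l.onnx",
    "DWPose/dw-ll_ucoco_384.onnx",
]:
    hf_hub_download(
        repo_id="boweiche/fit-check-videogen",
        filename=filename,
        local_dir="ckpt",
    )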

Inference

Prepare Inputs

The repo ships with a sample test case:

data/
├── ref_videos/
│   └── demo/
│       ├── front.jpg          # front-facing mirror selfie
│       └── back.jpg           # back-facing mirror selfie
├── motions/
│   └── demo/
│       ├── dwpose/            # per-frame pose files (.npy + .png)
│       └── rgb/               # per-frame RGB images (.jpg), used with --motion_rgb
└── backgrounds/
    └── demo.jpg

To use your own data, place front/back selfies under data/ref_videos/<name>/, motion data under data/motions/<name>/, and a background image under data/backgrounds/, then point inference.py at them via the --motion_name, --ref_video_name, and --background_name flags shown under Run below.
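
Before running inference on a new test case, a small pre-flight check like the one below (a hypothetical helper, not part of the repo) can confirm the layout matches the tree above:

from pathlib import Path

def check_test_case(name: str, data_root: str = "data") -> None:
    # Hypothetical pre-flight check mirroring the layout shown above.
    root = Path(data_root)
    required = [
        root / "ref_videos" / name / "front.jpg",
        root / "ref_videos" / name / "back.jpg",
        root / "motions" / name / "dwpose",  # or motions/<name>/rgb with --motion_rgb
        root / "backgrounds" / f"{name}.jpg",
    ]
    missing = [p for p in required if not p.exists()]
    if missing:
        raise FileNotFoundError(f"Missing inputs: {missing}")

check_test_case("demo")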

Run

Using pre-extracted DWPose files:

CUDA_VISIBLE_DEVICES=0 OMP_NUM_THREADS=8 python inference.py --inference_config configs/test.yaml \
    --motion_name demo --ref_video_name demo --background_name demo

Running DWPose detection on motion RGB frames at inference time:

CUDA_VISIBLE_DEVICES=0 OMP_NUM_THREADS=8 python inference.py --inference_config configs/test.yaml \
    --motion_name demo --ref_video_name demo --background_name demo --motion_rgb

Results are saved to results/ and include out.mp4, reference image visualizations, and pose visualizations.
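
To spot-check the generated video programmatically, a few lines of OpenCV work; the output path below is assumed from the description above and may sit in a subdirectory of results/.

import cv2

# Quick inspection of the generated video: frame count, resolution, fps.
cap = cv2.VideoCapture("results/out.mp4")
n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = cap.get(cv2.CAP_PROP_FPS)
print(f"{n_frames} frames at {width}x{height}, {fps:.1f} fps")
cap.release()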

Note: The released code does not include the face refinement step described in the paper, so output videos may show lower face quality than the results reported in the paper.

Acknowledgement

This codebase is adapted from MimicMotion and stable-video-diffusion.

Citation

If you find our work useful for your research, please consider citing the paper:

@article{chen2025fitcheck,
  title={Generating Fit Check Videos with a Handheld Camera},
  author={Chen, Bowei and Curless, Brian and Kemelmacher-Shlizerman, Ira and Seitz, Steven M.},
  journal={arXiv preprint arXiv:2505.23886},
  year={2025}
}
