OmniShotCut is a sensitive and more informative SoTA for Shot Boundary Detection.
OmniShotCut can detect shot changes of the video in diverse sources (anime, vlog, game, shorts, sports, screen recording, etc.), and recognize Sudden Jump and Transitions (dissolve, fade, wipe, etc.) by proposing a Shot-Query-based Video Transformer.
🔥 Update | 👀 Visualization | 🔧 Installation | ⚡ Inference | 💻 OmniShotCut Benchmark
- Release ArXiv paper
- Release the inference weights
- Release Gradio demo (with online)
- Release the benchmark
- Release the training code and curation
⭐ If you like OmniShotCut, please help ⭐⭐star⭐⭐ this repo. Thanks! 🤗
conda create -n OmniShotCut python=3.10
conda activate OmniShotCut
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements.txt
conda install ffmpeg # including conda deactivate case (for debug)First, let us download the checkpoint
mkdir checkpoints
cd checkpoints
wget https://huggingface.co/uva-cv-lab/OmniShotCut/resolve/main/OmniShotCut_ckpt.pthLocal Gradio can be created by simply running the following:
python app.py Click "Running on public URL".
We provide some modes for the inference. 'default' mode will shot the intra and inter label we define. However, we believe that most users might want the most direct results, which is the general shots without any transitions. To this end, please use '--mode clean_shot'.
Execute the inference by:
python test_code/inference.py --checkpoint_path XXX --input_video_path XXX --mode defaultWe will visualize the results by creating a folder named 'demo_video_results', where vertical bar with the same color refer to the same shot.
@article{wang2026omnishotcut,
title={OmniShotCut: Holistic Relational Shot Boundary Detection with Shot-Query Transformer},
author={Wang, Boyang and Xu, Guangyi and Tang, Zhipeng and Zhang, Jiahui and Cheng, Zezhou},
journal={arXiv preprint arXiv:2604.24762},
year={2026}
}

