High-performance video upscaling inference wrapper for Real-ESRGAN. Optimizes throughput by combining deep learning inference with dense optical flow warping.
Includes a demo clip in inputs/onepiece_demo.mp4 for immediate testing.
Standard upscalers run heavy model inference on every frame. VideoVision reduces computational load via:
- Optical Flow Warping: Reuses high-resolution features from previous frames by calculating pixel motion (dense optical flow) and warping the result. Reduces GPU load significantly.
- Scene Change Detection: Monitors frame difference histograms. Automatically forces a full inference refresh when a scene cut is detected to prevent ghosting artifacts.
- Variable Inference Intervals: Configurable keyframe ratios allowing users to trade temporal stability for raw throughput (e.g., infer 1 frame, warp 3 frames).
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'darkMode': true, 'fontFamily': 'arial', 'primaryColor': '#000', 'textColor': '#fff', 'lineColor': '#fff', 'signalColor': '#fff', 'actorBkg': '#000', 'actorBorder': '#fff', 'noteBkg': '#222', 'noteBorder': '#fff'}}}%%
sequenceDiagram
    autonumber
    participant In as Video Input
    participant Brain as 🧠 The Logic
    participant GPU as 🔴 AI Engine
    participant CPU as 🟢 Warp Engine
    participant Out as Output File
    In->>Brain: Read Next Frame
    Note right of Brain: 1. Calculate Diff Score<br/>2. Check Keyframe Timer
    alt High Quality Needed
        Brain->>GPU: Send Raw Frame
        GPU-->>Brain: Return Clean Upscale
    else Optimization Mode
        Brain->>CPU: Send Previous Frame
        CPU-->>Brain: Return Warped Frame
    end
    Brain->>Out: Write to MP4
    Brain->>Brain: Update History Buffer
```
The command-line interface exposes model selection, speed presets, and tiling parameters.
Anime (Balanced Speed/Quality): runs inference every 2nd frame and warps the intermediate frames.

```bash
python videovision.py -i inputs/onepiece_demo.mp4 --mode anime --speed balanced
```

High Throughput: runs inference every 4th frame. Best for limited hardware or high-framerate source material.

```bash
python videovision.py -i inputs/onepiece_demo.mp4 --mode anime --speed fastest
```

General / Live Action (Max Quality): disables warping and runs inference on every frame. Includes GFPGAN face enhancement.

```bash
python videovision.py -i inputs/my_vlog.mp4 --mode general --face_enhance --speed slow
```

| Argument | Options | Description |
|---|---|---|
| `--mode` | `anime`, `general` | Selects the appropriate Real-ESRGAN checkpoint. |
| `--speed` | `slow` | Full inference on every frame. No warping. |
| | `balanced` | Inference every 2nd frame. 2x theoretical throughput. |
| | `fastest` | Inference every 4th frame. 4x theoretical throughput. |
| `-s` | `2`, `4` | Upscaling factor. |
| `-t` | `0`, `400`, `256` | Tile size. Lower this value to reduce VRAM usage. |
| `--face_enhance` | Flag | Enables GFPGAN. Only available in `general` mode. |
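The `-t` tile option trades speed for VRAM by upscaling the frame in patches rather than all at once. A minimal sketch of the idea, with illustrative names (Real-ESRGAN's own tiler additionally pads each tile to hide seams):

```python
import numpy as np

def upscale_tiled(img, upscale_fn, tile=256, scale=4):
    """Apply `upscale_fn` patch by patch so peak memory scales with
    the tile size rather than the full frame. tile=0 disables tiling.
    (Illustrative sketch, not the project's actual tiler.)"""
    if tile == 0:
        return upscale_fn(img)
    h, w = img.shape[:2]
    out = np.zeros((h * scale, w * scale, img.shape[2]), dtype=img.dtype)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            patch = img[y:y + tile, x:x + tile]   # edge tiles may be smaller
            ph, pw = patch.shape[:2]
            out[y * scale:(y + ph) * scale,
                x * scale:(x + pw) * scale] = upscale_fn(patch)
    return out
```

Lowering the tile size reduces the largest tensor the model ever holds in GPU memory, at the cost of more inference calls per frame.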
Requires standard Real-ESRGAN dependencies:

```bash
pip install -r requirements.txt
python setup.py develop
```

VideoVision is a wrapper around Real-ESRGAN. The optical flow implementation uses the OpenCV Farneback algorithm.