This is the official code for "UPLiFT: Efficient Pixel-Dense Feature Upsampling with Local Attenders", a lightweight method that upsamples the features of pretrained backbones into pixel-dense features. This repository includes sample code to run pretrained UPLiFT models for several backbones, as well as training code to create UPLiFT models for new backbones.
Paper: https://arxiv.org/abs/2601.17950
Website: https://www.cs.umd.edu/~mwalmer/uplift/
- 4/20/26: UPLiFT Fast Mode now released! We’ve added several performance optimizations to further accelerate our existing UPLiFT models while also reducing memory usage. See details below.
- 2/21/26: We’re happy to announce that UPLiFT has been accepted to CVPR 2026!
- 2/1/26: Extra running options added, see details below.
- 1/25/26: Initial release of UPLiFT!
First, create and activate a conda environment:

```
conda create --name uplift python=3.12
conda activate uplift
```

Then install UPLiFT with the dependencies for your desired backbone:

Option 1: Clone and install

```
git clone https://github.com/mwalmer-umd/UPLiFT.git
cd UPLiFT
pip install -e '.[vit]'     # for DINOv2/DINOv3
# or: pip install -e '.[sd-vae]' for Stable Diffusion VAE
# or: pip install -e '.[all]' for all backbones
```

Option 2: Install from GitHub
```
pip install 'uplift[vit] @ git+https://github.com/mwalmer-umd/UPLiFT.git'
# or: pip install 'uplift[sd-vae] @ git+https://github.com/mwalmer-umd/UPLiFT.git'
# or: pip install 'uplift[all] @ git+https://github.com/mwalmer-umd/UPLiFT.git'
```

Quick start with PyTorch Hub:

```python
import torch
from PIL import Image

# Load model (weights auto-download from HuggingFace)
model = torch.hub.load('mwalmer-umd/UPLiFT', 'uplift_dinov2_s14')

# Run inference
image = Image.open('image.jpg')
features = model(image)
```

| Model | Backbone | Load with |
|---|---|---|
| DINOv2-S/14 | ViT | uplift_dinov2_s14 |
| DINOv3-S+/16 | ViT | uplift_dinov3_splus16 |
| SD 1.5 VAE | Diffusion | uplift_sd15_vae |
Enable Fast Mode to activate several optimizations that increase UPLiFT’s speed and reduce its memory usage. The outputs are nearly identical to those produced without Fast Mode, and we find that performance in downstream tasks is also nearly identical. Note that the first call with Fast Mode takes slightly longer due to compilation, but subsequent runs use cached kernels. See FAST_MODE.md for more details.
PyTorch Hub usage:

```python
model = torch.hub.load('mwalmer-umd/UPLiFT', 'uplift_dinov2_s14', fast=True)
features = model(image)
```

Command-line usage:

```
python sample_inference.py --pretrained uplift_dinov2-s14 --image img.png --fast
```
```python
# Raw model only (no backbone)
model = torch.hub.load('mwalmer-umd/UPLiFT', 'uplift_dinov2_s14', include_extractor=False)

# Custom iterations
model = torch.hub.load('mwalmer-umd/UPLiFT', 'uplift_dinov2_s14', iters=2)

# Activate lower-memory mode for the Local Attender, using serial neighborhood pooling instead of parallel pooling
model = torch.hub.load('mwalmer-umd/UPLiFT', 'uplift_dinov2_s14', iters=4, low_mem=True)
```

Weights are automatically downloaded from HuggingFace when using torch.hub.load() or the load_model() function in uplift/hub_loader.py. In addition, we provide sample_inference.py, which can also be used to quickly run pretrained models or new models you train. For example:
Extract pixel-dense features with a pretrained UPLiFT for DINOv3-S+/16:

```
python sample_inference.py --pretrained uplift_dinov3-splus16 --image imgs/Gigi_1_512.png --iters 4
```

Extract pixel-dense features with a pretrained UPLiFT for DINOv2-S/14, using a forced output size:

```
python sample_inference.py --pretrained uplift_dinov2-s14 --image imgs/Gigi_2_448.png --iters 4 --outsize 448
```

Upsample an image with a pretrained UPLiFT trained for the SD 1.5 VAE backbone:

```
python sample_inference.py --pretrained uplift_sd1.5vae --image imgs/Gigi_3_512.png --iters 2
```
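The --iters flag sets the number of upsampling iterations. As a toy illustration only (assuming, for simplicity, that each iteration doubles spatial resolution; this is not UPLiFT's actual architecture, and the function name is hypothetical), the effect of the iteration count on output size can be sketched as:

```python
import torch
import torch.nn.functional as F

def toy_iterative_upsample(feats, iters):
    # Toy stand-in for iterative upsampling: each iteration doubles the
    # spatial resolution (an illustrative assumption, not UPLiFT's design).
    for _ in range(iters):
        feats = F.interpolate(feats, scale_factor=2, mode='bilinear', align_corners=False)
    return feats

coarse = torch.randn(1, 8, 16, 16)  # e.g. a 16x16 grid of backbone features
dense = toy_iterative_upsample(coarse, iters=2)
print(dense.shape)  # torch.Size([1, 8, 64, 64])
```

Under this assumption, more iterations mean a larger upsampling factor from the backbone's coarse grid, while --outsize forces a specific final resolution.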
Try enabling low-memory mode, which sacrifices some speed for lower peak memory usage. The model gives equivalent outputs.

```
python sample_inference.py --pretrained uplift_dinov3-splus16 --image imgs/Gigi_1_512.png --iters 4 --low_mem
```
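The speed/memory tradeoff behind low_mem can be illustrated with a toy sketch of serial vs. parallel neighborhood pooling (this shows the general idea only, not UPLiFT's actual Local Attender implementation; both function names are hypothetical):

```python
import torch
import torch.nn.functional as F

def parallel_neighborhood_mean(x, k=3):
    # Unfold all k x k neighborhoods at once: fast, but materializes a
    # (B, C*k*k, H*W) tensor, so peak memory grows with the window size.
    B, C, H, W = x.shape
    patches = F.unfold(x, kernel_size=k, padding=k // 2)  # (B, C*k*k, H*W)
    return patches.view(B, C, k * k, H * W).mean(dim=2).view(B, C, H, W)

def serial_neighborhood_mean(x, k=3):
    # Accumulate one shifted copy at a time: k*k sequential passes, but
    # only one extra feature map is ever held in memory.
    pad = k // 2
    xp = F.pad(x, (pad, pad, pad, pad))
    out = torch.zeros_like(x)
    H, W = x.shape[-2:]
    for dy in range(k):
        for dx in range(k):
            out += xp[..., dy:dy + H, dx:dx + W]
    return out / (k * k)
```

Both versions compute the same result; the serial one simply trades throughput for a smaller peak memory footprint, which is the same tradeoff the --low_mem flag makes.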
If you train a new UPLiFT model for an existing supported backbone or a new backbone, you can manually specify the path to the config and ckpt for it and run inference as follows:

```
python sample_inference.py --config path/to/config.yaml --ckpt path/to/checkpoint.pth --image your_image.png --iters 4
```
Before training, update ./uplift/datasets/datasets_helper.py to specify the path(s) to your training dataset(s).
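As an illustration, a dataset-path helper of this kind might look like the following (the variable and function names here are hypothetical; check the actual contents of datasets_helper.py for the real format):

```python
import os

# Hypothetical layout -- edit these paths to point at your local copies.
DATASET_ROOTS = {
    "imagenet": "/data/imagenet/train",
    "coco": "/data/coco/train2017",
}

def get_dataset_root(name):
    # Look up a registered dataset and fail early if the path is missing.
    root = DATASET_ROOTS[name]
    if not os.path.isdir(root):
        raise FileNotFoundError(f"Dataset path not found, edit DATASET_ROOTS: {root}")
    return root
```

Failing early with a clear message here saves a confusing crash deep inside the data loader later in training.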
Config files are used to specify the UPLiFT architecture, the feature extracting backbone, and the training settings. Example config files can be found in ./uplift/configs/. To train UPLiFT for a new model, create or modify an existing config file for the new backbone.
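A hypothetical sketch of the kinds of fields such a config might contain (the field names here are illustrative only; copy a real file from ./uplift/configs/ as your starting point):

```yaml
# Illustrative only -- see ./uplift/configs/ for the actual schema.
backbone: dinov2_s14     # which feature-extracting backbone to wrap
uplift:
  iters: 4               # number of upsampling iterations
training:
  dataset: imagenet
  batch_size: 32
  lr: 1.0e-4
```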
This repository includes two built-in methods for loading backbones. The first is in ./uplift/extractors/vit_wrapper.py which uses timm for model loading. The second is in ./uplift/extractors/diff_extractor.py which can load Diffusers pipelines from Hugging Face. Note that some pipelines may not be compatible with this wrapper. If so, the wrapper must be modified to appropriately run the VAE encoder and decoder elements of your specified pipeline. For other models, we recommend implementing an extractor wrapper similar to the two examples provided.
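As a minimal sketch of what a custom extractor wrapper could look like, assuming the wrapper only needs to map images to a coarse feature grid (the class name and interface here are hypothetical; mirror the two provided files for the real interface):

```python
import torch
import torch.nn as nn

class SimpleExtractorWrapper(nn.Module):
    """Hypothetical template for wrapping a new backbone.
    The actual required interface is defined by the files in
    ./uplift/extractors/ -- follow those, not this sketch."""

    def __init__(self, backbone, patch_size=16):
        super().__init__()
        self.backbone = backbone
        self.patch_size = patch_size
        # The backbone stays frozen: UPLiFT trains on top of its features.
        for p in self.backbone.parameters():
            p.requires_grad = False

    @torch.no_grad()
    def forward(self, images):
        # images: (B, 3, H, W) -> coarse features (B, C, H/ps, W/ps)
        return self.backbone(images)
```

Freezing the backbone and running it under no_grad keeps feature extraction cheap, which matches the setup described above where only the UPLiFT module itself is trained.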
Once you have prepared the dataset, backbone, and config file, you can launch training with train_uplift.py. For example, the following command can be used to train an UPLiFT model from scratch with an existing sample config file:

```
python -m uplift.train_uplift --config uplift/configs/uplift_dinov2-s14.yaml
```
We follow the evaluation protocols of JAFAR and FM-Boost. Additional evaluation scripts will be provided in the near future.
This work was made possible thanks to code provided by the following sources:
- https://github.com/Jiawei-Yang/Denoising-ViT for uplift/extractors/vit_wrapper.py
- https://gist.github.com/sayakpaul/3ae0f847001d342af27018a96f467e4e and https://github.com/huggingface/diffusers/ for resources used in uplift/extractors/diff_extractor.py
- https://github.com/PaulCouairon/JAFAR for evaluation and PCA visualization code
- https://github.com/CompVis/fm-boosting for evaluation code
- https://github.com/facebookresearch/ConvNeXt for LayerNorm
- https://gist.github.com/andrewjong/6b02ff237533b3b2c554701fb53d5c4d for data loading resources
Distributed under the MIT License.
If you found UPLiFT useful, please cite our paper with the following:
```
@article{walmer2026uplift,
    title={UPLiFT: Efficient Pixel-Dense Feature Upsampling with Local Attenders},
    author={Walmer, Matthew and Suri, Saksham and Aggarwal, Anirud and Shrivastava, Abhinav},
    journal={arXiv preprint arXiv:2601.17950},
    year={2026}
}
```
