This is a simpler and faster demo codebase built on distilled feature fields (DFFs) (Kobayashi et al., NeurIPS 2022).
- We tested three techniques in this project, each on its own branch: total_variation, bilateral_filtering, and sam_for_conv, corresponding to total-variation regularization, bilateral filtering, and SAM-guided smoothing. The master branch contains the baseline DFF code.
- The three branches are structured similarly. In each branch, train.py contains the code that adds regularization (TV and bilateral) or performs smoothing (SAM-guided). It lives in the feature_loss section of the training_step() function.
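To illustrate what the total_variation branch adds to feature_loss, here is a minimal NumPy sketch of an anisotropic total-variation penalty on a 2D feature map. It is a standalone illustration, not the repo's PyTorch code; the function name and shapes are our own choices.

```python
import numpy as np

def tv_loss(features: np.ndarray) -> float:
    """Anisotropic total-variation penalty on an (H, W, C) feature map.

    Sums absolute differences between vertically and horizontally adjacent
    feature vectors, so spatially smooth feature fields incur a low penalty.
    """
    dh = np.abs(features[1:, :, :] - features[:-1, :, :]).sum()
    dw = np.abs(features[:, 1:, :] - features[:, :-1, :]).sum()
    return float(dh + dw)

# A constant feature map has zero total variation.
flat = np.ones((4, 4, 8))
print(tv_loss(flat))  # 0.0
```

In the actual branch this kind of term is added to the feature loss with a small weight, trading distillation fidelity for spatial smoothness.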
Visualization of feature field before and after additional smoothing.
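The bilateral_filtering regularizer mentioned above can be sketched as edge-aware averaging: neighbors contribute more when they are both spatially close and similar in a guide image, which smooths features without blurring across edges. This is a generic NumPy sketch under our own parameter choices, not the branch's implementation.

```python
import numpy as np

def bilateral_smooth(features: np.ndarray, guide: np.ndarray,
                     radius: int = 1, sigma_s: float = 1.0,
                     sigma_r: float = 0.1) -> np.ndarray:
    """Edge-aware smoothing of an (H, W, C) feature map.

    guide is an (H, W) intensity image; each neighbor's weight is the
    product of a spatial Gaussian and a range Gaussian on guide values.
    """
    h, w, c = features.shape
    out = np.zeros((h, w, c), dtype=float)
    for y in range(h):
        for x in range(w):
            acc = np.zeros(c)
            norm = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        ws = np.exp(-(dy * dy + dx * dx) / (2 * sigma_s ** 2))
                        diff = guide[ny, nx] - guide[y, x]
                        wr = np.exp(-(diff * diff) / (2 * sigma_r ** 2))
                        acc += ws * wr * features[ny, nx]
                        norm += ws * wr
            out[y, x] = acc / norm
    return out
```

With a small sigma_r, feature vectors on opposite sides of a guide-image edge barely mix, which is the property that makes this filter useful for cleaning up noisy feature fields.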
Outputs from editing operations:
- video_sam.mp4
- pveg_sam.mp4
- pveg_sam_color.mp4
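The SAM-guided smoothing behind these outputs can be sketched, at its simplest, as pooling features within each segmentation mask: every pixel's feature is replaced by the mean feature of its segment. The function below is a hypothetical NumPy illustration of that idea, not the sam_for_conv branch's code, and it assumes precomputed integer segment ids (e.g., from SAM).

```python
import numpy as np

def mask_guided_smooth(features: np.ndarray, masks: np.ndarray) -> np.ndarray:
    """Replace each pixel's feature with the mean feature of its segment.

    features: (H, W, C) feature map.
    masks:    (H, W) integer segment ids, one id per segment.
    """
    out = features.astype(float).copy()
    for seg_id in np.unique(masks):
        region = masks == seg_id          # boolean mask for this segment
        out[region] = features[region].mean(axis=0)
    return out
```

This makes features piecewise constant per segment; the real branch applies smoothing during training rather than as a one-shot post-process.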
Setup
python -m pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu121
python -m pip install torch-scatter -f https://data.pyg.org/whl/torch-2.1.1+cu121.html
python -m pip install -r requirements.txt
python -m pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
git submodule update --init --recursive
cd apex && pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./ && cd ..
python -m pip install models/csrc/
(Download a sample dataset)
Train
--root_dir is the dataset of images with poses. --feature_directory is the dataset of feature maps for distillation. --feature_dim must match the dimension of those feature maps.
python train.py --root_dir sample_dataset --dataset_name colmap --exp_name exp_v1 --downsample 0.25 --num_epochs 4 --batch_size 4096 --scale 4.0 --ray_sampling_strategy same_image --feature_dim 512 --random_bg --feature_directory sample_dataset/rgb_feature_langseg
Render with Edit
- Modify --edit_config or the codebase itself for other edits.
- Set --ckpt_path to the checkpoint saved above.
python render.py --root_dir sample_dataset --dataset_name colmap --downsample 0.25 --scale 4.0 --ray_sampling_strategy same_image --feature_dim 512 --ckpt_path ckpts/colmap/exp_v1_clip/epoch\=0_slim.ckpt --edit_config query.yaml
# ls ./rendered_*.png
# ffmpeg -framerate 30 -i ./rendered_%03d.png -vcodec libx264 -pix_fmt yuv420p -r 30 video.mp4
colmap
colmap feature_extractor --ImageReader.camera_model OPENCV --SiftExtraction.estimate_affine_shape=true --SiftExtraction.domain_size_pooling=true --ImageReader.single_camera 1 --database_path sample_dataset/database.db --image_path sample_dataset/images --SiftExtraction.use_gpu=false
colmap exhaustive_matcher --SiftMatching.guided_matching=true --database_path sample_dataset/database.db --SiftMatching.use_gpu=false
mkdir sample_dataset/sparse
colmap mapper --database_path sample_dataset/database.db --image_path sample_dataset/images --output_path sample_dataset/sparse
colmap bundle_adjuster --input_path sample_dataset/sparse/0 --output_path sample_dataset/sparse/0 --BundleAdjustment.refine_principal_point 1
colmap image_undistorter --image_path sample_dataset/images --input_path sample_dataset/sparse/0 --output_path sample_dataset_undis --output_type COLMAP
Setup LSeg
cd distilled_feature_field/encoders/lseg_encoder
pip install -r requirements.txt
pip install git+https://github.com/zhanghang1989/PyTorch-Encoding/
Download the LSeg model file demo_e200.ckpt from Google Drive.
Encode and save
python -u encode_images.py --backbone clip_vitl16_384 --weights demo_e200.ckpt --widehead --no-scaleinv --outdir ../../sample_dataset_undis/rgb_feature_langseg --test-rgb-dir ../../sample_dataset_undis/images
This may produce large feature map files in --outdir (100-200 MB per file).
Run train.py. If reconstruction fails, change --scale 4.0 to a smaller or larger value, e.g., --scale 1.0 or --scale 16.0.
- The codebase for this project is derived from DFFs.
- The NeRF codebase is derived from ngp_pl (6b2a669, Aug 30 2022).
- The codebase of encoders/lseg_encoder is derived from lang-seg.


