Jiahe Li · Jiawei Zhang · Xiao Bai · Jin Zheng · Xiaohan Yu · Lin Gu · Gim Hee Lee
Paper | arXiv | Project Page
In challenging scenarios with ambiguous photometric constraints, previous methods fail to identify the correct surfaces even with priors, leading to a noticeable performance drop and erroneous reconstructions. In contrast, AmbiSuR delivers accurate geometry with delicate details.
- Install and activate the basic environment by `conda env create -f environment.yml` following the reference configuration. PyTorch >= 2.0 is required for Depth Anything 3.
- `pip install -r torch_depended_env.txt` for the PyTorch-related and customized CUDA components.
- Install Depth Anything 3 by `pip install git+https://github.com/ByteDance-Seed/depth-anything-3.git` for the default multi-view depth priors. (Optional)
The following walks through the workflow for reconstructing a surface from a captured scene.
In principle, this project is compatible with COLMAP-format datasets. We recommend following Gaussian Splatting to process the captured images.
Run `bash scripts/run_da3.sh <dataset_dir> <max_points> <ransac_thresh>` to preprocess the Depth Anything 3 priors and align them to the COLMAP data.
- If successful, `sparse_da3_aligned/` will be generated in the scene directory.
- A chunk size of 450 is used for a single 48GB GPU. The value can be changed in `./multi_view_priors/estimate_colmap.py` for other devices.
- (This step can be skipped if other kinds of priors are preferred.)
After this, the point clouds from DA3 are prioritized for initialization. Adjust the code if another strategy is required.
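Before training, it can help to confirm that preprocessing produced its output. The following is a minimal sketch that relies only on the statement above that a successful run creates `sparse_da3_aligned/` in each scene directory. The helper names `has_da3_priors` and `scenes_missing_priors` are hypothetical and not part of this repository, and treating an empty folder as a failure is an assumption.

```python
from pathlib import Path


def has_da3_priors(scene_dir):
    """Return True if the aligned DA3 output folder exists and is non-empty.

    Assumption: an empty `sparse_da3_aligned/` folder means preprocessing failed.
    """
    out_dir = Path(scene_dir) / "sparse_da3_aligned"
    return out_dir.is_dir() and any(out_dir.iterdir())


def scenes_missing_priors(dataset_dir):
    """List scene sub-directories of `dataset_dir` that lack DA3 priors."""
    root = Path(dataset_dir)
    return sorted(
        p.name for p in root.iterdir()
        if p.is_dir() and not has_da3_priors(p)
    )
```

For example, `scenes_missing_priors("./data/DTU_2dgs")` would return the names of the scenes that still need `scripts/run_da3.sh`, assuming each scene is a direct sub-directory of the dataset directory.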
    python train.py --source_path $DATA_PATH --model_path $OUTPUT_PATH
    python mesh_extract/extract_adaptive.py --model_path $OUTPUT_PATH

All results will be saved into the specified `$OUTPUT_PATH`, including:
- `mesh/`: Output mesh `tsdf_fusion_post.ply` and the evaluations.
- `pg_view/`: Visualization of the training progress. Useful for debugging.
- `train/`: Rendered mesh and visualizations from the training set.
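The two steps can also be chained from Python. This is a minimal sketch that only wraps the two commands quoted in this section. The function names `build_commands` and `run_pipeline` are hypothetical, and the sketch assumes it is launched from the repository root with the environment already activated.

```python
import subprocess
import sys


def build_commands(data_path, output_path):
    """Build the training and mesh-extraction commands as argument lists."""
    train = [
        sys.executable, "train.py",
        "--source_path", data_path,
        "--model_path", output_path,
    ]
    extract = [
        sys.executable, "mesh_extract/extract_adaptive.py",
        "--model_path", output_path,
    ]
    return [train, extract]


def run_pipeline(data_path, output_path):
    """Run training, then mesh extraction, stopping at the first failure."""
    for cmd in build_commands(data_path, output_path):
        # check=True raises CalledProcessError if a step exits with an error.
        subprocess.run(cmd, check=True)
```

Using `check=True` keeps mesh extraction from running on a model directory left behind by a failed training run.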
The configuration is defined by command-line parameters and flags. Some important hyperparameters for optimization are listed here:
- Photometric Disambiguation
  - `--trunc_sigma 2.0` to specify the threshold of Gaussian Primitive Truncation. A larger value truncates less.
  - `--ray_color_lambda 1e-5` to specify the weight of the Ray-Color Consistency regularization.
- Priors and SH Ambiguity Indicator
  - `--depth_weight 0.1` to weight the basic depth regularization term. Corresponds to $\tau\mathcal{L}_{geo}$ in the paper.
  - `--sh_ambi_upper_ratio 0.95` and `--sh_ambi_upper_ratio 0.1` set the normalized percentiles used to select the two primitive sets for Dual-End Indication.
  - `--use_mono` to use Depth Anything 2 monocular depth in place of the default Depth Anything 3 metric depth. Corresponds to the AmbiSuR-Mono variant in the paper.
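When running several configurations, it can be convenient to keep the flags in one place. The sketch below turns a dictionary of values into command-line arguments. It covers only a subset of the flags listed above, uses the values shown there as a starting point, and assumes the flags are passed to `train.py`. The names `EXAMPLE_VALUES` and `to_cli_args` are hypothetical.

```python
# Values as shown in the list above; the dictionary itself is illustrative.
EXAMPLE_VALUES = {
    "trunc_sigma": 2.0,
    "ray_color_lambda": 1e-5,
    "depth_weight": 0.1,
}


def to_cli_args(overrides=None, use_mono=False):
    """Convert hyperparameters into a flat list of command-line arguments."""
    params = {**EXAMPLE_VALUES, **(overrides or {})}
    args = []
    for name, value in params.items():
        args += [f"--{name}", str(value)]
    if use_mono:
        # Boolean flag: present or absent, no value attached.
        args.append("--use_mono")
    return args


# Argument lists for a small sweep over the truncation threshold.
sweep = [to_cli_args({"trunc_sigma": s}) for s in (1.5, 2.0, 3.0)]
```

Each entry of `sweep` can be appended to a base command such as `["python", "train.py", "--source_path", ..., "--model_path", ...]` before launching it.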
    python mesh_extract/extract_<type>.py $OUTPUT_PATH

- `--voxel_size 0.002` to determine the resolution of the mesh. Note that a smaller voxel size requires more RAM and storage.
- `--sdf_trunc_scale 4.0` to control the TSDF truncation multiplier based on voxel size. Increase it if the mesh is incomplete, and decrease it if inaccurate vertices remain.
- `--max_depth 5.0` to set the extraction bound of the scene. An example is in `mesh_extract/extract_adaptive.py`, which adaptively estimates the bound according to the training cameras.
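To make the resolution trade-off concrete, here is a rough sketch under two assumptions that the text above does not confirm: the truncation distance is `sdf_trunc_scale * voxel_size`, and cost is estimated as for a dense cubic grid. Real TSDF volumes are often sparse and use far less memory, so the count is only an upper bound. Both function names are hypothetical.

```python
import math


def sdf_trunc(voxel_size, sdf_trunc_scale):
    """Truncation distance, assuming it is the multiplier times the voxel size."""
    return sdf_trunc_scale * voxel_size


def dense_voxel_count(extent, voxel_size):
    """Number of voxels in a dense cubic grid with side length `extent`."""
    per_axis = math.ceil(extent / voxel_size)
    return per_axis ** 3


# With the values listed above, the truncation distance is about 0.008 scene units.
trunc = sdf_trunc(0.002, 4.0)

# Halving the voxel size multiplies the dense voxel count by 8.
ratio = dense_voxel_count(1.0, 0.25) / dense_voxel_count(1.0, 0.5)
```

The cubic growth is why a small change in `--voxel_size` can have a large effect on RAM and storage.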
- Rendering test views (if any) with visualizations: `python extract_general.py --model_path $OUTPUT_PATH --skip_train`
- Rendering the reconstructed mesh at training views with open3d: `python render_mesh.py $OUTPUT_PATH`
  - It only works after the mesh file has been extracted.
We provide experiment scripts and configurations in scripts/ to reproduce the experiments.
We use the preprocessed DTU dataset from 2DGS, the official Tanks and Temples dataset, and the official Mip-NeRF 360 dataset. Here are the instructions for each.
- DTU dataset (2DGS pre-processed)
  - To get the ground truth, you also need to download the Points.zip and SampleSet.zip.
  - Run `bash scripts/run_da3.sh ./data/DTU_2dgs 50000 0.01` for DA3 priors.
- Tanks and Temples dataset (Official)
  - Ground truth, image set, camera poses, alignment, and cropfiles are required.
  - Because of substantial estimation inaccuracies, we recommend using the 2DGS pre-processed `Courthouse` folder from here to replace the official data.
  - Run `python scripts/preprocess/convert_tnt.py --tnt_path <TnT_path>` to process the scenes with COLMAP.
  - Run `bash scripts/run_da3.sh ./data/TnT 500000 0.05` for DA3 priors.
- Mip-NeRF 360 dataset (Official)
The default dataset organization under data/ is as follows:
TnT
├─ Barn
│ ├─ Barn_COLMAP_SfM.log (camera poses)
│ ├─ Barn.json (cropfiles)
│ ├─ Barn.ply (ground-truth point cloud)
│ ├─ Barn_trans.txt (colmap-to-ground-truth transformation)
│ ├─ database.db (colmap generated database)
│ ├─ transforms.json (generated)
│ ├─ sparse/ (formatted cameras)
│ ├─ images/ (processed images)
│ └─ images_raw (raw input images downloaded from Tanks and Temples website)
│ ├─ 000001.png
│ ...
...
DTU_eval (official ground truth)
├─ ObsMask/ (observability masks)
└─ Points/ (stl point clouds)
DTU_2dgs (2DGS pre-processed training set)
├─ scan24/
...
360_v2 (Official Mip-NeRF 360 dataset)
├─ bicycle/
...
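A quick check of the layout can catch missing downloads before preprocessing. The sketch below verifies the per-scene Tanks and Temples inputs shown in the tree above. It assumes that the entries not marked as generated or processed (poses, cropfile, ground-truth point cloud, transformation, and `images_raw`) are the ones that must exist beforehand. The function names are hypothetical.

```python
from pathlib import Path


def tnt_required_paths(scene):
    """Input files and folders expected for one scene, per the tree above."""
    return [
        f"{scene}_COLMAP_SfM.log",  # camera poses
        f"{scene}.json",            # cropfile
        f"{scene}.ply",             # ground-truth point cloud
        f"{scene}_trans.txt",       # colmap-to-ground-truth transformation
        "images_raw",               # raw input images
    ]


def missing_tnt_files(tnt_root, scene):
    """Return the required entries that are absent from the scene directory."""
    scene_dir = Path(tnt_root) / scene
    return [
        name for name in tnt_required_paths(scene)
        if not (scene_dir / name).exists()
    ]
```

For example, `missing_tnt_files("./data/TnT", "Barn")` returns an empty list when the scene is complete.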
Follow the examples below to reproduce the evaluation on the datasets.
# Run training on the datasets
python scripts/run_dtu.py output/dtu
python scripts/run_tnt.py output/tnt
python scripts/run_mip360.py output/mip360
# Summarize results
python scripts/stat/stat_dtu.py output/dtu
python scripts/stat/stat_tnt.py output/tnt
python scripts/stat/stat_360.py output/mip360
- The reproduced mesh results are provided on Hugging Face.
Note: The choice of evaluation script has a non-trivial influence on the measured mesh quality. In our project, we use the original Tanks and Temples toolbox ported to an updated Open3D, and a DTU evaluation script based on DTUeval-python.
Several customized TnT evaluation scripts with different implementations also exist, such as those used in SVRaster, 2DGS, and GOF. These may produce slightly higher scores than the original toolbox.
This method is mainly built on the open-source projects gaussian-splatting, Depth-Anything-3, and PGSR. Some scripts are borrowed from VGGT and SVRaster. We thank the authors for their great contributions.
Please kindly consider citing as below if you find this repository helpful in your project:
@inproceedings{li2026ambisur,
title={Revisiting Photometric Ambiguity for Accurate Gaussian-Splatting Surface Reconstruction},
author={Li, Jiahe and Zhang, Jiawei and Bai, Xiao and Zheng, Jin and Yu, Xiaohan and Gu, Lin and Lee, Gim Hee},
booktitle={International Conference on Machine Learning},
year={2026},
organization={PMLR}
}