A unified evaluation framework for 3D reconstruction, used in π³.
This repo includes unofficial inference code for some popular methods (e.g. VGGT, MoGe). If the authors of these methods have any concerns about our implementation, please feel free to open a pull request or an issue. (Pull requests are welcome!)
- Monocular Depth Estimation
- Video Depth Estimation
- Relative Camera Pose Estimation
- Multi-view Reconstruction (Point Map Estimation)
The root config file for all evaluations is configs/eval.yaml; however, you don't need to edit it.
- All the main hyperparameters you need are in configs/evaluation/xxxxx.yaml
- Sometimes you may want to change the dataset config in configs/data/xxxxx.yaml, or the model config in configs/model/xxxxx.yaml
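The config layout described above can be summarized in a small helper. This is a minimal sketch and not part of the repo: the function name config_paths and the example names my_dataset and my_model are hypothetical, and only the directory layout is taken from this README.

```python
from pathlib import Path

# Directory layout as described in this README.
CONFIG_ROOT = Path("configs")


def config_paths(task, data=None, model=None):
    """Return the config files relevant to one evaluation task.

    Only the directory layout comes from the README; this helper and its
    arguments are illustrative.
    """
    paths = {
        # Root config shared by all evaluations; normally left untouched.
        "root": CONFIG_ROOT / "eval.yaml",
        # Per-task hyperparameters, e.g. task="videodepth".
        "evaluation": CONFIG_ROOT / "evaluation" / f"{task}.yaml",
    }
    if data is not None:
        paths["data"] = CONFIG_ROOT / "data" / f"{data}.yaml"
    if model is not None:
        paths["model"] = CONFIG_ROOT / "model" / f"{model}.yaml"
    return paths


if __name__ == "__main__":
    # "my_dataset" and "my_model" are placeholder names, not real config files.
    found = config_paths("videodepth", data="my_dataset", model="my_model")
    for role, path in found.items():
        status = "found" if path.exists() else "missing"
        print(f"{role:<10} {path.as_posix()} ({status})")
```

Running it from the repo root prints which of the expected files exist, which can be a quick sanity check before launching an evaluation.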
- Depth Estimation: We follow MonST3R to prepare Sintel, Bonn, KITTI and NYU-v2.
- Camera Pose Estimation
  - Angular: We follow VGGT to prepare Co3Dv2, and we provide our own script for RealEstate10k preprocessing.
  - Distance: We follow MonST3R to prepare Sintel, TUM-dynamics and ScanNetv2.
- Point Map Estimation: We follow Spann3R to prepare 7-Scenes, Neural-NRGBD and DTU. We provide our own script for ETH3D preprocessing.
We provide reference-only preprocessing scripts under datasets/preprocess. Please ensure you have obtained the necessary licenses from the original dataset providers before proceeding.
See monodepth/README.md for more details.
python monodepth/infer.py
# torchrun --nnodes=1 --nproc_per_node=8 monodepth/infer_mp.py # speed up inference with multiple GPUs
python monodepth/eval.py
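This README does not state which metrics monodepth/eval.py reports; see monodepth/README.md for that. As an illustration only, the sketch below computes two metrics that are common in the depth estimation literature, absolute relative error (Abs Rel) and the δ < 1.25 inlier ratio, after an optional median scale alignment. The function name depth_metrics and the choice of median alignment are assumptions, not the repo's API.

```python
from statistics import median


def depth_metrics(pred, gt, align="median"):
    """Abs Rel and delta < 1.25 over valid pixels.

    pred, gt: flat sequences of depth values. Pixels with non-positive
    ground truth or prediction are ignored. Illustrative only; the repo's
    own evaluation may use a different alignment or validity mask.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0 and p > 0]
    if not pairs:
        raise ValueError("no valid pixels")

    scale = 1.0
    if align == "median":
        # Resolve the unknown global scale of the prediction.
        scale = median(g for _, g in pairs) / median(p for p, _ in pairs)

    n = len(pairs)
    abs_rel = sum(abs(scale * p - g) / g for p, g in pairs) / n
    delta1 = sum(max(scale * p / g, g / (scale * p)) < 1.25 for p, g in pairs) / n
    return {"abs_rel": abs_rel, "delta1": delta1}


if __name__ == "__main__":
    # Prediction is exactly half the ground truth, so median scaling fixes it.
    print(depth_metrics([1.0, 2.0, 4.0], [2.0, 4.0, 8.0]))
```

Lower Abs Rel and higher δ < 1.25 indicate better depth.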
Configs are in configs/evaluation/videodepth.yaml; see videodepth/README.md for more details.
python videodepth/infer.py
# torchrun --nnodes=1 --nproc_per_node=8 videodepth/infer_mp.py # speed up inference with multiple GPUs
python videodepth/eval.py
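Video depth predictions are often aligned to the ground truth once per sequence before metrics are computed. Whether videodepth/eval.py aligns scale only, or scale and shift, is not stated in this README, so the following is a hedged sketch of one common option: a closed-form least-squares fit of a single scale and shift over all valid pixels of a sequence. The function name fit_scale_shift is hypothetical.

```python
def fit_scale_shift(pred, gt):
    """Least-squares (scale, shift) minimizing sum((scale*p + shift - g)^2).

    pred, gt: flat sequences holding all valid pixels of one sequence.
    Illustrative only; not taken from the repo.
    """
    n = len(pred)
    if n != len(gt) or n < 2:
        raise ValueError("need two equally sized inputs with at least 2 values")

    mean_p = sum(pred) / n
    mean_g = sum(gt) / n
    var_p = sum((p - mean_p) ** 2 for p in pred)
    if var_p == 0:
        raise ValueError("prediction is constant; scale is undefined")

    cov_pg = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gt))
    scale = cov_pg / var_p
    shift = mean_g - scale * mean_p
    return scale, shift


if __name__ == "__main__":
    # Ground truth is 2 * pred + 3, so the fit recovers scale 2 and shift 3.
    scale, shift = fit_scale_shift([1.0, 2.0, 3.0], [5.0, 7.0, 9.0])
    print(scale, shift)
```

Fitting one pair of parameters per sequence, rather than per frame, keeps the comparison sensitive to temporal inconsistency in the predicted depth.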
Configs are in configs/evaluation/relpose-angular.yaml; see relpose/README.md for more details.
# python relpose/sampling.py # optional: regenerates the seq-id-maps under datasets/seq-id-maps, which are already provided in this repo
python relpose/eval_angle.py
# torchrun --nnodes=1 --nproc_per_node=8 relpose/eval_angle_mp.py # speed up with multiple GPUs
python relpose/eval_dist.py
# torchrun --nnodes=1 --nproc_per_node=8 relpose/eval_dist_mp.py # speed up with multiple GPUs
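For readers unfamiliar with angular pose metrics, the sketch below shows the standard geodesic angle between a predicted and a ground-truth relative rotation. It is an illustration of the general formula, angle = arccos((trace(R_pred^T R_gt) - 1) / 2), and is not taken from relpose/eval_angle.py; the helper names rotation_angle_deg and rot_z are hypothetical.

```python
import math


def rotation_angle_deg(R_pred, R_gt):
    """Geodesic angle in degrees between two 3x3 rotation matrices.

    Matrices are nested lists. Uses trace(R_pred^T R_gt), which equals the
    element-wise product of the two matrices summed over all entries.
    """
    trace = sum(R_pred[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to guard against values slightly outside [-1, 1] from rounding.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def rot_z(deg):
    """Rotation about the z-axis by `deg` degrees (used for the demo)."""
    a = math.radians(deg)
    return [
        [math.cos(a), -math.sin(a), 0.0],
        [math.sin(a), math.cos(a), 0.0],
        [0.0, 0.0, 1.0],
    ]


if __name__ == "__main__":
    # Two rotations about the same axis differ by the difference in angle.
    print(rotation_angle_deg(rot_z(10.0), rot_z(40.0)))
```

Angular benchmarks typically threshold such per-pair errors at a few fixed angles and aggregate over all sampled pairs.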
See mv_recon/README.md for more details.
# python mv_recon/sampling.py # optional: regenerates the seq-id-maps under datasets/seq-id-maps, which are already provided in this repo
python mv_recon/eval.py
# torchrun --nnodes=1 --nproc_per_node=8 mv_recon/eval_mp.py # speed up with multiple GPUs
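The single-GPU commands for the four tasks can be chained from one script. The sketch below is a convenience wrapper and not part of the repo: the names STEPS, build_commands and run_all are hypothetical, and only the script paths come from this README. The multi-GPU torchrun variants are left out for brevity. It defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess

# Single-GPU entry points, in the order they appear in this README.
STEPS = {
    "monodepth": ["monodepth/infer.py", "monodepth/eval.py"],
    "videodepth": ["videodepth/infer.py", "videodepth/eval.py"],
    "relpose": ["relpose/eval_angle.py", "relpose/eval_dist.py"],
    "mv_recon": ["mv_recon/eval.py"],
}


def build_commands(tasks=None):
    """Return the list of commands (as argument lists) for the given tasks."""
    tasks = list(STEPS) if tasks is None else list(tasks)
    commands = []
    for task in tasks:
        if task not in STEPS:
            raise KeyError(f"unknown task: {task!r}")
        for script in STEPS[task]:
            commands.append(["python", script])
    return commands


def run_all(tasks=None, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in build_commands(tasks):
        print("+", shlex.join(cmd))
        if not dry_run:
            # check=True stops the chain at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

Call run_all(["monodepth"], dry_run=False) from the repo root to actually execute a single task.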
Our work mainly builds upon:
If you find our work useful, please consider citing:
@misc{wang2025pi3,
title={$\pi^3$: Scalable Permutation-Equivariant Visual Geometry Learning},
author={Yifan Wang and Jianjun Zhou and Haoyi Zhu and Wenzheng Chang and Yang Zhou and Zizun Li and Junyi Chen and Jiangmiao Pang and Chunhua Shen and Tong He},
year={2025},
eprint={2507.13347},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2507.13347},
}
This project is licensed under the CC BY-NC-SA 4.0 License. See the LICENSE file and https://creativecommons.org/licenses/by-nc-sa/4.0/ for details.