Project Page | arXiv | Data
Zihan Wang, Jeff Tan, Tarasha Khurana*, Neehar Peri*, Deva Ramanan
Carnegie Mellon University
* Equal Contribution
```shell
git clone --recurse-submodules https://github.com/ImNotPrepared/MonoFusion
cd MonoFusion/
conda create -n monofusion python=3.10
conda activate monofusion
```
Update `requirements.txt` with the correct CUDA version for PyTorch and cuML, e.g., replacing `cu122` and `cu12` with the tags matching your installed CUDA version.
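As a minimal sketch of the substitution above — the requirement lines and version numbers here are illustrative placeholders, not the repo's actual pins; adapt the replacement tags to your CUDA version:

```python
import re

# Hypothetical requirement lines standing in for the real requirements.txt.
requirements = "torch==2.1.0+cu122\ncuml-cu12==24.2.0\n"

# Example: retarget from CUDA 12.2 to CUDA 11.8.
patched = re.sub(r"\+cu122", "+cu118", requirements)  # PyTorch wheel tag
patched = re.sub(r"-cu12\b", "-cu11", patched)        # cuML package suffix
print(patched)
```

The same edit can of course be made by hand in any text editor.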
```shell
pip install -r requirements.txt
pip install git+https://github.com/nerfstudio-project/gsplat.git
```
| Task | Status | Due Date |
|---|---|---|
| Drop data and environment build guide | ✅ Done | - |
| Preprocessing scripts | ⏳ Todo | in a week |
| Drop code | ⏳ Todo | between ICLR and ICCV |
We understand that data processing can be complex and time-consuming. Our pipeline streamlines this workflow: a single run produces undistorted, synchronized sparse-view data complete with all necessary priors.
To work with Ego-Exo4D data, follow these steps:

1. **Obtain a license**: Get a license for Ego-Exo4D at https://docs.ego-exo4d-data.org/getting-started/

2. **Download VRS files**: Download the VRS files with the RGB stream for a specific take:

   ```shell
   egoexo -o <output_directory> --parts take_vrs --uids <uid1>
   ```

   Note: VRS files are required to ensure synchronized data streams.

3. **Extract images**: Extract undistorted images from the VRS sequences:

   ```shell
   viewer_map --vrs aria01.vrs --task vis --resize 512
   ```

   The default resize resolution is 512 pixels.
We provide a one-command prior-generation script that saves depth maps, human masks, DINO features, and 2D tracks:

```shell
./fetch_priors.sh
```
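To illustrate how the generated priors might be consumed downstream, here is a hedged sketch; the array shapes and the masking step are assumptions for illustration (the script's actual file layout may differ), and the arrays below are random stand-ins for what `fetch_priors.sh` would save:

```python
import numpy as np

H, W = 288, 512  # assumed frame resolution

# Stand-ins for the four prior types the script saves:
depth = np.random.rand(H, W).astype(np.float32)     # per-pixel depth map
mask = np.random.rand(H, W) > 0.5                    # binary human mask
feats = np.random.rand(H // 8, W // 8, 384)          # patch-level DINO feature grid
tracks = np.random.rand(100, 2).astype(np.float32)   # 2D track points (x, y)

# A typical downstream step: restrict depth to the masked human region.
human_depth = np.where(mask, depth, np.nan)
print(human_depth.shape)  # (288, 512)
```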
If you find our data, processing code, or project useful, please consider citing our work:
```bibtex
@misc{wang2025monofusionsparseview4dreconstruction,
  title={MonoFusion: Sparse-View 4D Reconstruction via Monocular Fusion},
  author={Zihan Wang and Jeff Tan and Tarasha Khurana and Neehar Peri and Deva Ramanan},
  year={2025},
  eprint={2507.23782},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2507.23782},
}
```