GitHub - zc-alexfan/hold: [CVPR 2024✨Highlight] Official repository for HOLD, the first method that jointly reconstructs articulated hands and objects from monocular videos without assuming a pre-scanned object template and 3D hand-object training data.

[CVPR'24 Highlight] HOLD: Category-agnostic 3D Reconstruction of Interacting Hands and Objects from Video

[ Project Page ] [ Paper ] [ SupMat ] [ ArXiv ] [ Video ] [ HOLD Account ] [ ICCV'25 HOLD+ARCTIC Challenge ]

Authors: Zicong Fan, Maria Parelli, Maria Eleni Kadoglou, Muhammed Kocabas, Xu Chen, Michael J. Black, Otmar Hilliges

News

✨3DV 2026: Looking for hand scan data? PALM is a large-scale dataset containing 13k high-quality registered 3dMD hand scans of 263 subjects and 90k calibrated multi-view RGB images. See PALM for details.

🚀 Register a HOLD account here for news such as code release, downloads, and future updates!

2025.07.04: Join our ICCV competition: two-hand + rigid object reconstruction using HOLD on ARCTIC!
2024.07.04: Join our ECCV competition: two-hand + rigid object reconstruction using HOLD on ARCTIC!
2024.07.04: HOLD beta is released!
2024.04.04: HOLD is selected as a CVPR highlight!
2024.02.27: HOLD is accepted to CVPR'24! Working on code release!

This is a repository for HOLD, a method that jointly reconstructs hands and objects from monocular videos without assuming a pre-scanned object template.

HOLD can reconstruct 3D geometries of novel objects and hands:

Potential directions from HOLD

Features

Instructions to download in-the-wild videos from HOLD as well as preprocessed data
Scripts to preprocess and train on custom videos
A volumetric rendering framework to reconstruct dynamic hand-object interaction
A generalized codebase for single-hand and two-hand interaction with objects
A viewer to interact with the predictions
Code to evaluate on HO3D and compare with HOLD

TODOs

Tips on good reconstruction
Clean the code further
Support ARCTIC for the two-hand + rigid object setting

Documentation

Setup environment and downloads: see docs/setup.md
Training, evaluation, and visualization on preprocessed sequences: see docs/usage.md
Preprocess custom sequences: see docs/custom.md
Data documentation (checkpoints, dataset, log folder): see docs/data_doc.md
Instructions for using HOLD on ARCTIC: see docs/arctic.md

Getting started

Get a copy of the code:

git clone https://github.com/zc-alexfan/hold.git
cd hold; git submodule update --init --recursive
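
Before moving on, it can help to confirm that the checkout is complete. The following is a minimal sketch, not part of the HOLD repository: `check_hold_docs` is a hypothetical helper that only checks for the documentation files named in the Documentation section, assuming they sit under `docs/` in the repository root.

```shell
# Hypothetical helper (not shipped with HOLD): report which of the
# documentation files referenced in this README are missing from a checkout.
# Usage: check_hold_docs [path-to-repo]   (defaults to the current directory)
check_hold_docs() {
    repo_dir="${1:-.}"
    missing=0
    for doc in setup usage custom data_doc arctic; do
        if [ ! -f "$repo_dir/docs/$doc.md" ]; then
            echo "missing: docs/$doc.md"
            missing=$((missing + 1))
        fi
    done
    # Exit status is the number of missing files (0 means all were found).
    return "$missing"
}
```

After defining the function, run `check_hold_docs .` from inside the cloned directory. To check the submodules, run `git submodule status`: a line prefixed with `-` indicates a submodule that has not been initialized yet.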

1. Set up environments
   - Follow the instructions in docs/setup.md.
   - You may skip the external dependencies for now.
2. Train on a preprocessed sequence
   - Start with one of our preprocessed in-the-wild sequences, such as hold_bottle1_itw.
   - Familiarize yourself with the usage guidelines in docs/usage.md for this preprocessed sequence.
   - This will enable you to train and render HOLD, and to experiment with our interactive viewer.
   - At this stage, you can also explore the HOLD code in the ./code directory.
3. Set up external dependencies and process custom videos
   - After understanding the initial tools, set up the "external dependencies" as outlined in docs/setup.md.
   - Preprocess the images from the hold_bottle1_itw sequence by following the instructions in docs/custom.md.
   - Train on this sequence to learn how to build a custom dataset.
   - You can capture your own custom video and reconstruct it in 3D at this point.
   - Most preprocessing artifact files are documented in docs/data_doc.md, which you can use as a reference.
4. Two-hand setting: bimanual category-agnostic reconstruction
   - At this point, you can preprocess and train on a custom single-hand sequence.
   - Now you can take on the bimanual category-agnostic reconstruction challenge!
   - Follow the instructions in docs/arctic.md to reconstruct two-hand manipulation of ARCTIC sequences.

Official Citation

@inproceedings{fan2024hold,
  title={{HOLD}: Category-agnostic 3d reconstruction of interacting hands and objects from video},
  author={Fan, Zicong and Parelli, Maria and Kadoglou, Maria Eleni and Kocabas, Muhammed and Chen, Xu and Black, Michael J and Hilliges, Otmar},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={494--504},
  year={2024}
}

Contact

For technical questions, please create an issue. For other questions, please contact the first author.

Acknowledgments

The authors would like to thank: Benjamin Pellkofer for IT/web support; Chen Guo, Egor Zakharov, Yao Feng, Artur Grigorev for insightful discussion; Yufei Ye for DiffHOI code release.

Our code benefits a lot from Vid2Avatar, aitviewer, VolSDF, NeRF++ and SNARF. If you find our work useful, consider checking out their work.
