AvatarMesh: 3D Avatar Mesh Generation from a Single Image

Note: This project originates from my earlier work and has been recently reorganized and released for reproducibility.

AvatarMesh is a framework for generating high-quality 3D human avatars from a single input image.

The pipeline consists of two main stages:

1. Multi-view Image Generation

From a single input image, we synthesize multiple novel views using a diffusion-based sparse-view generation model.
Because this step focuses on producing views with clear geometry, it provides strong priors for 3D reconstruction.

2. 3D Avatar Reconstruction

The generated multi-view images are used to reconstruct the 3D geometry through a coarse-to-fine pipeline:
- Coarse Reconstruction: A large sparse-view transformer-based reconstruction model with a signed distance function (SDF) representation builds the initial coarse mesh.
- Refinement: Using SMPL model priors and multi-view normal maps, the coarse mesh is refined to recover accurate geometry and rich detail.
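
To make the SDF representation used in the coarse stage more concrete, here is a minimal sketch. It is not the project's code: the sphere, the function names, and the sign convention (negative inside, positive outside) are assumptions chosen for illustration, and some codebases flip the sign.

```python
import math


def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from `point` to a sphere surface.

    Negative inside the sphere, zero on the surface, positive outside
    (assumed convention for this sketch).
    """
    return math.dist(point, center) - radius


def classify(point, sdf=sphere_sdf, eps=1e-6):
    """Label a 3D point by the sign of its signed distance."""
    d = sdf(point)
    if abs(d) <= eps:
        return "surface"
    return "inside" if d < 0 else "outside"


if __name__ == "__main__":
    for p in [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]:
        print(p, sphere_sdf(p), classify(p))
```

A mesh corresponds to the zero level set of such a function, so a reconstruction model that predicts an SDF can hand its output to an iso-surface extraction step to obtain the coarse mesh.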

Installation

We recommend Python >= 3.10, PyTorch >= 2.1.0, and CUDA >= 12.1.

conda create --name avatarmesh python=3.10
conda activate avatarmesh
pip install -U pip

# Install ninja
conda install ninja

# Install CUDA (adjust version if needed)
conda install cuda -c nvidia/label/cuda-12.1.0

# Install PyTorch and xformers
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
pip install xformers==0.0.22.post7

# Install other requirements
pip install -r requirements.txt
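
After installation, a quick sanity check of the environment can save time. The script below is a small sketch, not part of the repository: the helper names are hypothetical, and it only compares the Python and PyTorch versions against the minimums recommended above.

```python
import sys
from importlib import metadata


def parse_version(text):
    """Turn a string like '2.1.0+cu121' into a tuple such as (2, 1, 0)."""
    core = text.split("+")[0]          # drop local build tags such as +cu121
    parts = []
    for piece in core.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:                 # stop at non-numeric parts like 'post7'
            break
        parts.append(int(digits))
    while len(parts) < 3:              # pad so '3.10' compares like '3.10.0'
        parts.append(0)
    return tuple(parts)


def meets_minimum(found, minimum):
    """True if version string `found` is at least `minimum`."""
    return parse_version(found) >= parse_version(minimum)


def installed_version(package):
    """Installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment():
    """Return {name: (version or None, meets_recommendation)}."""
    py = ".".join(str(n) for n in sys.version_info[:3])
    torch_version = installed_version("torch")
    return {
        "python": (py, meets_minimum(py, "3.10")),
        "torch": (
            torch_version,
            torch_version is not None and meets_minimum(torch_version, "2.1.0"),
        ),
    }


if __name__ == "__main__":
    for name, (version, ok) in check_environment().items():
        status = "OK" if ok else "check"
        print(f"{name}: {version or 'not installed'} -> {status}")
```

The CUDA version is not checked here. Once PyTorch is installed, the CUDA version it was built against can be read from `torch.version.cuda`.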

Usage

1. Multi-view Image Generation

We adopt SV3D to generate 21 novel views from a single input image.

# Download pretrained weights
wget https://huggingface.co/stabilityai/sv3d/resolve/main/sv3d_u.safetensors -P checkpoints/

# Run SV3D-u
python scripts/sampling/simple_video_sample.py \
    --input_path path/to/image.png \
    --version sv3d_u
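
For intuition about what the 21 views cover, the sketch below computes camera azimuths under the assumption that the views are evenly spaced on one full orbit at a fixed elevation. The start angle and rotation direction actually used by SV3D are not stated here, and the function name is hypothetical.

```python
def orbit_azimuths(num_views=21, start_deg=0.0):
    """Evenly spaced azimuth angles (degrees) covering one full turn.

    Assumes a uniform orbit; `start_deg` is a placeholder for whatever
    starting angle the generator really uses.
    """
    if num_views <= 0:
        raise ValueError("num_views must be positive")
    step = 360.0 / num_views
    return [(start_deg + i * step) % 360.0 for i in range(num_views)]


if __name__ == "__main__":
    for index, azimuth in enumerate(orbit_azimuths()):
        print(f"view {index:02d}: azimuth {azimuth:.2f} deg")
```

Under this assumption, neighbouring views are a little over 17 degrees apart, which is dense enough to cover the front, sides, and back of the subject.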

2. Coarse Mesh Reconstruction

Generate a coarse 3D mesh from the 21 synthesized views:

python run.py configs/instant-mesh-large.yaml /path/to/img_dir/ --save_video
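
Before launching the reconstruction, it can help to confirm that the image directory really contains the expected 21 views. This is a hedged sketch: it assumes the views are stored as individual image files directly inside the directory, and the accepted extensions and helper names are our own choices rather than anything `run.py` is known to require.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your exported frames.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def list_views(img_dir):
    """Return the image files in `img_dir`, sorted by name."""
    return sorted(
        p for p in Path(img_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def check_view_count(img_dir, expected=21):
    """Raise ValueError unless `img_dir` holds exactly `expected` images."""
    views = list_views(img_dir)
    if len(views) != expected:
        raise ValueError(
            f"expected {expected} views in {img_dir}, found {len(views)}"
        )
    return views


if __name__ == "__main__":
    import sys
    for view in check_view_count(sys.argv[1]):
        print(view.name)
```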

3. Normal Map Generation

Generate normal maps for the same views:

# Edit the paths inside the script before running
bash command/image2normal.sh

The results (mesh, images, video) are saved in ${out_dir}.
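
For readers unfamiliar with normal maps, the sketch below shows one common way a surface normal is stored as an RGB colour: each component in [-1, 1] is mapped linearly to [0, 255]. Whether this repository uses exactly this mapping and axis orientation is an assumption, and the function names are hypothetical.

```python
import math


def encode_normal(normal):
    """Map a 3D normal to an 8-bit RGB triple (assumed linear convention)."""
    length = math.sqrt(sum(c * c for c in normal))
    if length == 0.0:
        raise ValueError("normal must be non-zero")
    # Normalise, then shift [-1, 1] to [0, 1] and scale to [0, 255].
    return tuple(int(round((c / length * 0.5 + 0.5) * 255)) for c in normal)


def decode_normal(rgb):
    """Recover an approximate unit normal from an 8-bit RGB triple."""
    vec = [c / 255.0 * 2.0 - 1.0 for c in rgb]
    length = math.sqrt(sum(c * c for c in vec))
    return tuple(c / length for c in vec)


if __name__ == "__main__":
    normal = (0.0, 0.0, 1.0)           # facing the camera along +z
    rgb = encode_normal(normal)
    print(rgb, decode_normal(rgb))
```

Because the colour is quantised to 8 bits, decoding only recovers the normal approximately, which is usually accurate enough to supervise fine surface detail.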

4. Mesh Refinement

Refine the coarse mesh using SMPL priors and normal maps:

python mesh_refine.py
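
The four steps can also be chained from one small driver. The sketch below only reuses the commands listed above; the function names are hypothetical, and it makes two simplifying assumptions: the SV3D script is run from a checkout of the SV3D code, and the generated frames are copied into the image directory by hand between steps 1 and 2.

```python
import shlex
import subprocess


def build_commands(image_path, img_dir):
    """Return the command line for each pipeline step, in order."""
    return [
        # Step 1: multi-view generation (run from the SV3D checkout).
        ["python", "scripts/sampling/simple_video_sample.py",
         "--input_path", image_path, "--version", "sv3d_u"],
        # Step 2: coarse mesh reconstruction.
        ["python", "run.py", "configs/instant-mesh-large.yaml",
         img_dir, "--save_video"],
        # Step 3: normal map generation (edit paths in the script first).
        ["bash", "command/image2normal.sh"],
        # Step 4: mesh refinement with SMPL priors and normal maps.
        ["python", "mesh_refine.py"],
    ]


def run_pipeline(image_path, img_dir, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    commands = build_commands(image_path, img_dir)
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure
    return commands


if __name__ == "__main__":
    run_pipeline("path/to/image.png", "/path/to/img_dir/", dry_run=True)
```

With `dry_run=True` the driver only prints the commands, which makes it easy to review the paths before anything is executed.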

Results

Multi-view Image and Normal Map Generation

From a single input image, our method generates multiple novel views and their corresponding normal maps.

Coarse-to-Fine 3D Reconstruction

Based on the generated multi-view images and normal maps, we reconstruct a coarse mesh and further refine it with SMPL priors.

Acknowledgements

We thank the authors of the following projects for their excellent contributions:
