If you find our code or paper helpful, please consider starring this repository and citing the following:
@misc{snapmogen2025,
title={SnapMoGen: Human Motion Generation from Expressive Texts},
author={Chuan Guo and Inwoo Hwang and Jian Wang and Bing Zhou},
year={2025},
eprint={2507.09122},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2507.09122},
}
📢 2023-11-29 --- Initialized the webpage and git project.
conda env create -f environment.yml
conda activate momask-plus
If you encounter issues with Conda, you can install the dependencies using pip:
pip install -r requirements.txt
✅ Tested on Python 3.8.20.
bash prepare/download_models.sh
(For evaluation only.)
bash prepare/download_evaluators.sh
bash prepare/download_glove.sh
If the gdown download fails with "Cannot retrieve the public link of the file. You may need to change the permission to 'Anyone with the link', or have had many accesses", upgrading gdown usually resolves it, as suggested in wkentaro/gdown#43:
pip install --upgrade --no-cache-dir gdown
Visit [Google Drive] to download the models and evaluators manually.
HumanML3D - Follow the instructions in HumanML3D, then copy the dataset to your data folder:
cp -r ./HumanML3D/ your_data_folder/HumanML3D
SnapMoGen - Download the data from Hugging Face, then copy it to your data folder:
cp -r ./SnapMoGen your_data_folder/SnapMoGen
Remember to update data.root_dir in all the config/*.yaml files with your own data directory path.
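For example, the relevant entry might look like the following (a hedged sketch: the exact nesting in each YAML file may differ, and the path is a placeholder):

data:
  root_dir: ./your_data_folder  # placeholder: folder containing HumanML3D/ and SnapMoGen/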
To generate motion from your own text prompts, use:
python gen_momask_plus.py
You can modify the inference configuration (e.g., number of diffusion steps, guidance scale) in config/eval_momaskplus.yaml.
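As a purely illustrative sketch, the kind of options you would adjust look like the following; the key names here are hypothetical placeholders, so consult config/eval_momaskplus.yaml for the actual option names and defaults:

num_diffusion_steps: 50  # hypothetical key: number of diffusion steps
guidance_scale: 4.0      # hypothetical key: classifier-free guidance scale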
Run the following scripts for quantitative evaluation:
python eval_momask_plus_hml.py # Evaluate on HumanML3D dataset
python eval_momask_plus.py # Evaluate on SnapMoGen dataset
There are two main components in MoMask++: a multi-scale residual motion VQVAE and a generative masked Transformer.
All checkpoints will be stored under /checkpoint_dir.
python train_rvq_hml.py # Train RVQVAE on HumanML3D
python train_rvq.py # Train RVQVAE on SnapMoGen
Configuration files:
config/residual_vqvae_hml.yaml (for HumanML3D)
config/residual_vqvae.yaml (for SnapMoGen)
python train_momask_plus_hml.py # Train on HumanML3D
python train_momask_plus.py # Train on SnapMoGen
Configuration files:
config/train_momaskplus_hml.yaml (for HumanML3D)
config/train_momaskplus.yaml (for SnapMoGen)
Remember to change vq_name and vq_ckpt to your own VQ name and VQ checkpoint in these two configuration files. A training accuracy of around 0.25 is normal.
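For reference, a hedged sketch of the two entries to edit (both values are placeholders for your own trained VQ model):

vq_name: your_rvq_experiment_name    # placeholder: name of your RVQVAE training run
vq_ckpt: path/to/your_vq_checkpoint  # placeholder: checkpoint saved under /checkpoint_dir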
We use a separate lightweight root motion regressor to refine the root trajectory. This regressor is trained to predict root linear velocities from local motion features. During motion generation, we use it to re-predict the root trajectory of the generated motion, which effectively reduces foot sliding.
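The idea can be summarized with a minimal sketch in PyTorch (illustrative only: the network architecture, feature dimensions, and planar two-dimensional velocity are assumptions, not the actual implementation):

import torch
import torch.nn as nn

class RootVelocityRegressor(nn.Module):
    # Toy sketch: map local (root-relative) motion features to per-frame root linear velocities.
    def __init__(self, local_feat_dim: int, hidden_dim: int = 256):
        super().__init__()
        self.gru = nn.GRU(local_feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 2)  # planar (x, z) velocity per frame (assumption)

    def forward(self, local_feats: torch.Tensor) -> torch.Tensor:
        # local_feats: (batch, num_frames, local_feat_dim)
        hidden, _ = self.gru(local_feats)
        return self.head(hidden)  # (batch, num_frames, 2)

# Usage sketch: after generating a motion, discard its original root translation,
# re-predict per-frame root velocities from the local features, and integrate them
# (cumulative sum) into a refined root trajectory.
regressor = RootVelocityRegressor(local_feat_dim=259)  # feature dimension is a placeholder
local_feats = torch.randn(1, 120, 259)                 # e.g., 120 generated frames
root_trajectory = torch.cumsum(regressor(local_feats), dim=1)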
All animations were manually rendered in Blender using Bitmoji characters.
An example character is available here, and we use this Blender scene for animation rendering.
We recommend using the Rokoko Blender add-on (v1.4.1) for seamless motion retargeting.
⚠️ Note: All motions in SnapMoGen use T-Pose as the rest pose.
If your character rig is in A-Pose, use the rest_pose_retarget.py script to convert between T-Pose and A-Pose rest poses.
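A minimal invocation might look like this (hedged; check the script itself for the arguments it actually expects, such as the input and output file paths):

python rest_pose_retarget.py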
We sincerely thank the authors of the following open-source works, on which our code is based:
MoMask, VAR, deep-motion-editing, Muse, vector-quantize-pytorch, T2M-GPT, MDM and MLD
Contact [email protected] for further questions.