Soft Actor-Critic (SAC) for MuJoCo Humanoid Environment

A PyTorch implementation of the Soft Actor-Critic (SAC) algorithm to train an agent to play with the Humanoid environment from MuJoCo.

Installation

You can clone the repository and install the required dependencies using Poetry or pip. This project requires Python 3.13.

Using Poetry (Recommended)

# 1. Clone the repository
git clone https://github.com/giansimone/sac-mujoco-humanoid.git
cd sac-mujoco-humanoid

# 2. Initialize environment and install dependencies
poetry env use python3.13
poetry install

# 3. Activate the shell
eval $(poetry env activate)

Using Pip

# 1. Clone the repository
git clone https://github.com/giansimone/sac-mujoco-humanoid.git
cd sac-mujoco-humanoid

# 2. Create and activate virtual environment
python3.13 -m venv venv
source venv/bin/activate

# 3. Install package in editable mode
pip install -e .

Project Structure

sac-mujoco-humanoid/
├── sac_mujoco_humanoid/
│   ├── __init__.py
│   ├── agent.py       # SAC implementation (Actor/Critic)
│   ├── buffer.py      # Replay Buffer
│   ├── config.yaml    # Training hyperparameters
│   ├── environment.py # Gym environment wrappers
│   ├── enjoy.py       # Evaluation script
│   ├── export.py      # Hugging Face export script
│   ├── model.py       # PyTorch Network definitions
│   ├── train.py       # Main training loop
│   └── utils.py
├── .gitignore
├── LICENSE
├── README.md
└── pyproject.toml

Usage

Ensure you are in the sac_mujoco_humanoid source directory where config.yaml is located before running these commands.

cd sac-mujoco-humanoid

Training

Train a SAC agent with the default configuration.

Note: The Replay Buffer pre-allocates memory. Ensure your system has at least 8GB of RAM available.

python -m train

Configuration

Edit config.yaml to customise training parameters.

#Environment
env_name: Humanoid-v5

#Network Architecture
hidden_dim: 256

#Training
total_steps: 2_000_000
buffer_size: 1_000_000
batch_size: 256
start_steps: 10_000
updates_per_step: 1

#SAC Agent
lr: 0.0003
gamma: 0.99
tau: 0.005
alpha: 0.2
auto_tune_alpha: True

#Logging
log_dir: runs/

#System
seed: 42

Enjoying a Trained Agent

Watch a trained agent by running the enjoy script. Point the artifact argument to your saved model file.

python -m enjoy \
    --artifact runs/sac_Humanoid-v5_YYYY-MM-DD_HHhMMmSSs/final_model.pt \
    --num-episodes 5

Exporting to Hugging Face Hub

Share your trained model, config, and a replay video to the Hugging Face Hub.

python -m export \
    --username YOUR_HF_USERNAME \
    --repo-name sac-mujoco-humanoid \
    --artifact-path runs/sac_Humanoid-v5_YYYY-MM-DD_HHhMMmSSs/final_model.pt \
    --movie-fps 30 \
    --n-eval 10

This will automatically:

Upload the model weights and config.
Generate a model card with evaluation metrics (Mean Reward +/- Std).
Record and upload a video of the agent.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Soft Actor-Critic (SAC) for MuJoCo Humanoid Environment

Installation

Using Poetry (Recommended)

Using Pip

Project Structure

Usage

Training

Configuration

Enjoying a Trained Agent

Exporting to Hugging Face Hub

License

About

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
sac_mujoco_humanoid		sac_mujoco_humanoid
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml

License

giansimone/sac-mujoco-humanoid

Folders and files

Latest commit

History

Repository files navigation

Soft Actor-Critic (SAC) for MuJoCo Humanoid Environment

Installation

Using Poetry (Recommended)

Using Pip

Project Structure

Usage

Training

Configuration

Enjoying a Trained Agent

Exporting to Hugging Face Hub

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Languages