Skip to content

reshalfahsi/cheetah-robot-ppo

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 

Repository files navigation

Teaching a Cheetah Robot to Run: Solving Continuous Control in Simulated Locomotion with Proximal Policy Optimization

colab

This project tackles the challenge of training a simulated Cheetah robot to run efficiently using reinforcement learning, focusing on the complexities of continuous action control. This project employs Proximal Policy Optimization (PPO), a stable and effective policy gradient algorithm, implemented with PyTorch, MuJoCo for realistic physics simulation, and Gymnasium as the environment framework. The policy and value networks, structured as multi-layer perceptrons, model actions via a Gaussian distribution, refined through iterative training with a clipped objective and Generalized Advantage Estimation (GAE) for stability. Key hyperparameters are tuned to optimize performance, and a deterministic policy—using the mean action—is adopted during evaluation to ensure consistency in the deterministic MuJoCo setting. The result is a Cheetah robot that achieves stable, agile locomotion with rising rewards, offering insights into continuous control and potential applications for real-world robotics.

Experiment

Explore this notebook to watch the Cheetah robot in action and understand how PPO works controlling the robot.

Result

Reward Curve

reward_curve
The reward tends to climb up in the course of the Cheetah robot training.

Qualitative Result

Here, the Cheetah robot is assigned to speedily maneuver forward as distantly as possible without stumbling and slipping, which can hinder its brisk gait mobility. It is illustrated in the ensuing GIF.

qualitative_cheetah_robot
The Cheetah robot sprints predominantly stable in the simulated environment following the learned deterministic policy of PPO.

Credit

About

Teaching a Cheetah Robot to Run: Solving Continuous Control in Simulated Locomotion with Proximal Policy Optimization

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors