This repository contains implementations of several reinforcement learning (RL) algorithms, written to better understand their mechanics and applications.
Proximal Policy Optimization
Paper link: Proximal Policy Optimization (PPO)
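As a quick orientation, the core of PPO is its clipped surrogate objective, which limits how far a single update can move the policy away from the one that collected the data. The snippet below is a minimal sketch in plain Python, not this repository's implementation: the function name `ppo_clipped_objective`, its argument names, and the default `clip_eps=0.2` (a commonly used value) are assumptions chosen for illustration.

```python
import math


def ppo_clipped_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    Hypothetical helper for illustration only.
    new_logps / old_logps: log-probabilities of the taken actions under the
    current policy and the data-collecting policy.
    advantages: advantage estimates for the same actions.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed in log space.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

When the two policies agree, the ratio is 1 and the objective reduces to the mean advantage. Taking the minimum means a positive-advantage action earns no extra credit once its ratio exceeds `1 + clip_eps`, while a negative-advantage action is still fully penalized if its ratio grows.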
TRPO: https://rlhfbook.com/c/11-policy-gradients.html#reinforce-leave-one-out-rloo