Course project on the performance of two policy gradient methods, Proximal Policy Optimization (PPO) and REINFORCE, in the OpenAI Gym Lunar Lander environment. We provide a thorough introduction to reinforcement learning and policy gradient methods, and conduct experiments with both algorithms in the Lunar Lander environment.
Please refer to the ./report folder for details.