A collection of TensorFlow implementations of reinforcement learning models. Each model is evaluated in OpenAI Gym environments.

| Model | Code | References |
|---|---|---|
| Cross-Entropy Method | run_cem_cartpole | Cross-entropy method |
| Tabular Q Learning | rl/tabular_q_learner | Sutton and Barto, Chapter 8 |
| Deep Q Network | rl/neural_q_learner | Mnih et al. |
| Double Deep Q Network | rl/neural_q_learner | van Hasselt et al. |
| REINFORCE Policy Gradient | rl/pg_reinforce | Sutton et al. |
| Actor-critic Policy Gradient | rl/pg_actor_critic | Mnih et al. |
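
As a rough illustration of the simplest entry in the table, here is a minimal sketch of the tabular Q-learning update rule. It is not the code in `rl/tabular_q_learner`. The five-state chain environment and the names `step`, `q_update`, `train`, and `greedy_policy` are hypothetical and exist only for this example, which uses the standard library so that it runs without TensorFlow or Gym.

```python
import random

N_STATES = 5          # hypothetical chain: states 0..4, state 4 is terminal
LEFT, RIGHT = 0, 1    # the two available actions


def step(state, action):
    """Toy deterministic chain: reward 1.0 only on reaching the last state."""
    if action == RIGHT:
        next_state = min(state + 1, N_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_update(q, state, action, reward, next_state, done, alpha, gamma):
    """One tabular Q-learning update, applied in place; returns the TD error."""
    # Bootstrap from the greedy next-state value unless the episode has ended.
    target = reward if done else reward + gamma * max(q[next_state])
    td_error = target - q[state][action]
    q[state][action] += alpha * td_error
    return td_error


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn Q-values off-policy while behaving uniformly at random."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.randrange(2)
            next_state, reward, done = step(state, action)
            q_update(q, state, action, reward, next_state, done, alpha, gamma)
            state = next_state
    return q


def greedy_policy(q):
    """Greedy action for each non-terminal state."""
    return [max((LEFT, RIGHT), key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` returns the learned action for each non-terminal state. The behaviour policy here is uniformly random, which works because Q-learning is off-policy. An epsilon-greedy policy is the more common choice in practice.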