This repository aims to implement all reinforcement learning algorithms and examples from *Reinforcement Learning: An Introduction* by Richard S. Sutton and Andrew G. Barto.
Dependencies are specified in `pyproject.toml` and managed with rye. You can install them with a package manager such as uv or rye (e.g., `rye sync`).
`scripts/` contains code and notebooks for each algorithm or example. `src/` contains the modules, environments, and utilities shared across scripts.
| Chapter | Title | Status | Contents |
|---|---|---|---|
| 1 | Introduction | ⬜ TODO | Tic-tac-toe environment. Working on a simple TD-learning agent. |
| 2 | Multi-armed Bandits | ⬜ TODO | k-armed bandit environment. |
| 3 | Finite Markov Decision Processes | ⬜ TODO | |
| 4 | Dynamic Programming | ⬜ TODO | |
| 5 | Monte Carlo Methods | ⬜ TODO | |
| 6 | Temporal-Difference Learning | ⬜ TODO | |
| 7 | n-step Bootstrapping | ⬜ TODO | |
| 8 | Planning and Learning with Tabular Methods | ⬜ TODO | |
| 9 | On-policy Prediction with Approximation | ⬜ TODO | |
| 10 | On-policy Control with Approximation | ⬜ TODO | |
| 11 | Off-policy Methods with Approximation | ⬜ TODO | |
| 12 | Eligibility Traces | ⬜ TODO | |
| 13 | Policy Gradient Methods | ⬜ TODO | |
| 14 | Psychology | ⬜ TODO | |
| 15 | Neuroscience | ⬜ TODO | |
| 16 | Applications and Case Studies | ⬜ TODO | |
| 17 | Frontiers | ⬜ TODO | |
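
To give a sense of what the Chapter 2 entry refers to, here is a minimal sketch of a k-armed bandit environment with an epsilon-greedy, sample-average agent. It is not the code in `src/`: the names `KArmedBandit` and `run_epsilon_greedy` and all parameters are hypothetical, and the reward model (true values and rewards both Gaussian with unit variance) is assumed from the book's 10-armed testbed.

```python
import random


class KArmedBandit:
    """Hypothetical stationary k-armed bandit; each arm has a fixed true value."""

    def __init__(self, k=10, seed=None):
        self.k = k
        self.rng = random.Random(seed)
        # Assumption: true action values are drawn from a standard normal.
        self.q_true = [self.rng.gauss(0.0, 1.0) for _ in range(k)]

    def step(self, action):
        """Return a reward: the arm's true value plus unit-variance Gaussian noise."""
        return self.rng.gauss(self.q_true[action], 1.0)


def run_epsilon_greedy(bandit, steps=1000, epsilon=0.1, seed=None):
    """Run an epsilon-greedy agent with incremental sample-average estimates."""
    rng = random.Random(seed)
    q_est = [0.0] * bandit.k   # estimated value of each arm
    counts = [0] * bandit.k    # number of times each arm was pulled
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            action = rng.randrange(bandit.k)
        else:
            # Exploit: pick the arm with the highest estimate
            # (ties go to the lowest index in this sketch).
            action = max(range(bandit.k), key=lambda a: q_est[a])
        reward = bandit.step(action)
        counts[action] += 1
        # Incremental mean: Q <- Q + (R - Q) / N
        q_est[action] += (reward - q_est[action]) / counts[action]
    return q_est, counts


if __name__ == "__main__":
    bandit = KArmedBandit(k=10, seed=0)
    q_est, counts = run_epsilon_greedy(bandit, steps=1000, epsilon=0.1, seed=1)
    print("most-pulled arm:", counts.index(max(counts)))
    print("best true arm:  ", bandit.q_true.index(max(bandit.q_true)))
```

Using only the standard library keeps the sketch free of extra dependencies; a real implementation would likely vectorize this with NumPy to average over many runs.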