Implementation of a DQN for Mario that I started on a plane ride once.

isaprykin/mario-rl

Mario RL (Double DQN)
=====================

This project trains a Double DQN agent to play Super Mario Bros using
frame-stacked observations and a target network. The original notebook
(`mario_dqn.ipynb`) is preserved for reference; the recommended workflow
is now script-based.
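
For readers new to Double DQN, the target computation can be sketched as follows. This is a minimal NumPy illustration of the general technique, not the code in `dqn_train.py`; the function name, argument names, and the `gamma` default are assumptions made for the example.

```python
import numpy as np


def double_dqn_targets(rewards, dones, next_q_online, next_q_target, gamma=0.99):
    """Compute Double DQN regression targets for a batch (illustrative only).

    rewards, dones: arrays of shape (batch,)
    next_q_online, next_q_target: arrays of shape (batch, num_actions) holding
    Q-values for the next states from the online and target networks.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    dones = np.asarray(dones, dtype=np.float64)
    next_q_online = np.asarray(next_q_online, dtype=np.float64)
    next_q_target = np.asarray(next_q_target, dtype=np.float64)

    # The online network picks the greedy action for each next state...
    best_actions = np.argmax(next_q_online, axis=1)
    # ...and the target network supplies the value of that action.
    next_values = next_q_target[np.arange(len(best_actions)), best_actions]

    # Terminal transitions (done == 1) do not bootstrap from the next state.
    return rewards + gamma * (1.0 - dones) * next_values
```

Splitting action selection (online network) from action evaluation (target network) is what distinguishes Double DQN from plain DQN, which uses the target network for both.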

Requirements
------------
- Python 3.9+
- `tensorflow`, `gym-super-mario-bros`, `nes-py`
- `numpy<2.0` and `gym<0.26` (pinned for compatibility with `nes-py`/`gym-super-mario-bros`)
- A valid Super Mario Bros ROM as required by `gym-super-mario-bros`

Quick Start
-----------
Train (default settings):
```
python3 dqn_train.py --model-dir models/mario_dqn
```

Train faster with multiple environments (single process):
```
python3 dqn_train.py --model-dir models/mario_dqn --num-envs 4
```

Train faster with parallel environment workers:
```
python3 dqn_train.py --model-dir models/mario_dqn --parallel-envs 4
```

Delay learning until the replay buffer has been pre-filled (here, for 10,000 steps):
```
python3 dqn_train.py --model-dir models/mario_dqn --prefetch-steps 10000
```

Resume training:
```
python3 dqn_train.py --model-dir models/mario_dqn --resume
```

Play a trained model:
```
python3 dqn_play.py --model-dir models/mario_dqn --render
```

Key Files
---------
- `dqn_train.py`: main training loop (Double DQN + prioritized replay)
- `dqn_play.py`: run a trained agent
- `mario_env.py`: environment creation + preprocessing (frame skip/stack)
- `replay_buffer.py`: prioritized replay buffer
- `dqn_model.py`: Q-network definition
- `mario_dqn.ipynb`: original notebook (legacy reference)
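
As a rough picture of what a prioritized replay buffer does when sampling, here is a sketch of proportional prioritization with importance-sampling weights. It is a generic illustration and may not match `replay_buffer.py`; the function name and the `alpha`/`beta` defaults are assumptions, not values taken from this repository.

```python
import numpy as np


def sample_prioritized(priorities, batch_size, alpha=0.6, beta=0.4, rng=None):
    """Sample indices in proportion to priority**alpha (illustrative only).

    Returns (indices, weights), where weights are importance-sampling
    corrections normalized so that the largest weight in the batch is 1.
    """
    if rng is None:
        rng = np.random.default_rng()
    priorities = np.asarray(priorities, dtype=np.float64)

    # alpha = 0 gives uniform sampling; alpha = 1 is fully proportional.
    scaled = priorities ** alpha
    probs = scaled / scaled.sum()

    indices = rng.choice(len(probs), size=batch_size, p=probs)

    # Importance-sampling weights offset the bias from non-uniform sampling.
    weights = (len(probs) * probs[indices]) ** (-beta)
    weights /= weights.max()
    return indices, weights
```

The returned weights would typically multiply each sample's loss term, and priorities are usually refreshed from the new TD errors after each update.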

Notes
-----
- The default replay buffer size is kept small to limit memory use.
  Increase `--replay-size` if you have RAM to spare.
- Models are saved as `model.keras` inside the directory passed via
  `--model-dir`, along with `training_state.json` and TensorBoard logs
  under `logs/`.
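
To judge how far `--replay-size` can be raised, a back-of-the-envelope estimate of observation memory may help. The sketch below assumes 84x84 grayscale `uint8` frames, a stack of 4, and separate storage of state and next state; these are common DQN conventions, not settings confirmed for this repository, so adjust them to match `mario_env.py` and `replay_buffer.py`.

```python
def replay_memory_bytes(replay_size, frame_shape=(84, 84), stack=4,
                        bytes_per_pixel=1, store_next_state=True):
    """Estimate bytes used by stored observations (illustrative only).

    All defaults are assumptions; rewards, actions, and priorities are
    ignored because they are small next to the image data.
    """
    height, width = frame_shape
    bytes_per_observation = height * width * stack * bytes_per_pixel
    # Storing the next state separately doubles the observation footprint.
    copies = 2 if store_next_state else 1
    return replay_size * bytes_per_observation * copies


if __name__ == "__main__":
    size = 100_000  # hypothetical value, not the script's default
    print(f"{replay_memory_bytes(size) / 1e9:.2f} GB for {size} transitions")
```

Under these assumptions, 100,000 transitions take about 5.6 GB, which shows why a conservative default is sensible.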
