
Pytorch-original-AlphaZero

My implementation of the original AlphaZero paper:
"A general reinforcement learning algorithm that masters Chess, Shogi, and Go through self-play"
Silver et al., DeepMind, 2018

🚀 Overview

This project reimplements the core ideas of AlphaZero from scratch in PyTorch.
It supports training and evaluation on board games such as Go and Gomoku, with a modular MCTS, a deep residual neural network, and a scalable self-play pipeline.
It is designed with research reproducibility, educational clarity, and modular experimentation in mind.

🔥 Key Features

  • 🧠 MCTS (v1 & v2) — Custom implementations with tree reuse, visit counts, and exploration noise (see the PUCT sketch after this list)
  • 🏗️ Neural Network — Deep residual CNN with separate policy and value heads
  • ♻️ Self-Play Training — Built-in actor, learner, and replay loop
  • 📊 Evaluation Framework — Elo ratings, agent-vs-agent matches, and CrazyStone integration
  • 📈 TensorBoard Logging — For real-time tracking of training metrics
  • 🐳 Docker Support — Reproducible environment for fast deployment
  • 🌐 Notebook UI — Play interactively against trained agents in-browser
  • 🔍 SGF Support — Replay and analyze Go games using Sabaki or similar tools
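
To ground the MCTS bullet, here is a minimal sketch of the PUCT selection rule AlphaZero uses, with Dirichlet noise mixed into the root priors for exploration. The `Node` class, function names, and constants (`c_puct=1.25`, `alpha=0.03`, `frac=0.25`) are illustrative choices, not this repo's actual API:

```python
import math
import numpy as np

class Node:
    """Minimal MCTS node: prior, visit count, and accumulated value."""
    def __init__(self, prior):
        self.prior = prior           # P(s, a) from the policy head
        self.visit_count = 0         # N(s, a)
        self.value_sum = 0.0         # W(s, a)
        self.children = {}           # action -> Node

    def q_value(self):
        # Mean action value Q(s, a); zero for unvisited nodes.
        return self.value_sum / self.visit_count if self.visit_count else 0.0

def puct_score(parent, child, c_puct=1.25):
    # U(s, a) = c_puct * P(s, a) * sqrt(N(s)) / (1 + N(s, a))
    u = c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    return child.q_value() + u

def select_child(parent, c_puct=1.25):
    # Descend to the child maximizing Q(s, a) + U(s, a).
    return max(parent.children.items(),
               key=lambda kv: puct_score(parent, kv[1], c_puct))

def add_root_noise(root, alpha=0.03, frac=0.25):
    # Mix Dirichlet noise into the root priors, as in the paper's Go setting.
    noise = np.random.dirichlet([alpha] * len(root.children))
    for n, child in zip(noise, root.children.values()):
        child.prior = (1 - frac) * child.prior + frac * n
```

Tree reuse then amounts to keeping the subtree under the played action as the next search root, rather than discarding its statistics between moves.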

🗂️ Project Structure


.
├── alpha_zero/              # Core logic (MCTS, model, pipeline)
│   ├── core/                # MCTS, network, rating, replay
│   ├── envs/                # Go & Gomoku environments (Gym-style)
│   └── utils/               # SGF parsing, data transformations, logging
├── eval_play/               # GUI/CLI for evaluating trained agents
├── logs/                    # TensorBoard and evaluation logs
├── docker/                  # Dockerfile for reproducibility
├── others/                  # Analysis scripts, Elo graphs, score systems
├── unit_tests/              # Tests for environments, MCTS, transformations
├── plot_go.py               # Training visualization
├── plot_gomoku.py
├── run_unit_tests.sh
└── README.md
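
The network under `alpha_zero/core/` follows the paper's dual-head design: a residual convolutional trunk feeding separate policy and value heads. Below is a hypothetical, scaled-down PyTorch sketch; the channel count, number of blocks, and input planes are placeholder values, not the repo's actual configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResBlock(nn.Module):
    """Standard two-convolution residual block with batch norm."""
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(ch)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(ch)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)  # residual skip connection

class PolicyValueNet(nn.Module):
    """Shared trunk, separate policy and value heads (AlphaZero-style)."""
    def __init__(self, in_planes=17, ch=64, blocks=6, board=9):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(in_planes, ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(ch), nn.ReLU())
        self.trunk = nn.Sequential(*[ResBlock(ch) for _ in range(blocks)])
        # Policy head: logits over board*board moves plus pass.
        self.policy = nn.Sequential(
            nn.Conv2d(ch, 2, 1, bias=False), nn.BatchNorm2d(2), nn.ReLU(),
            nn.Flatten(), nn.Linear(2 * board * board, board * board + 1))
        # Value head: scalar position evaluation in [-1, 1].
        self.value = nn.Sequential(
            nn.Conv2d(ch, 1, 1, bias=False), nn.BatchNorm2d(1), nn.ReLU(),
            nn.Flatten(), nn.Linear(board * board, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Tanh())

    def forward(self, x):
        h = self.trunk(self.stem(x))
        return self.policy(h), self.value(h)
```

For example, `PolicyValueNet()(torch.zeros(1, 17, 9, 9))` returns an 82-way policy logit vector (81 moves plus pass) and one scalar value per position.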

⚙️ Getting Started

✅ 1. Install with Docker (Recommended)

docker build -t alphazero-pytorch -f docker/Dockerfile .
docker run -it --rm -p 6006:6006 alphazero-pytorch

✅ 2. Or install manually

pip install -r requirements.txt

🏋️ Train an Agent

Train Go (9x9 board):

python alpha_zero/training_go.py

Train Gomoku (13x13 board):

python alpha_zero/training_gomoku.py
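
Both scripts follow the AlphaZero recipe: actor processes generate games by self-play, positions land in a replay buffer, and the learner minimizes the paper's combined loss (value MSE plus cross-entropy between the policy and the MCTS visit distribution, with L2 regularization supplied by weight decay). A minimal sketch of one learner update, assuming a `net` that returns `(policy_logits, value)`; the batch layout and names are hypothetical:

```python
import torch.nn.functional as F

def train_step(net, optimizer, batch):
    """One AlphaZero update on a replay-buffer batch.
    batch: (states, mcts_policies, outcomes), where mcts_policies are
    normalized visit counts and outcomes are game results z in {-1, 0, 1}."""
    states, target_pi, target_z = batch
    logits, value = net(states)
    # Policy loss: cross-entropy against the MCTS visit distribution.
    policy_loss = -(target_pi * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
    # Value loss: mean squared error against the final game outcome.
    value_loss = F.mse_loss(value.squeeze(1), target_z)
    loss = policy_loss + value_loss   # L2 term comes from optimizer weight_decay
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```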

TensorBoard logs are saved to logs/:

tensorboard --logdir logs/

🧪 Evaluate Agents

GUI/Terminal Match:

python eval_play/eval_agent_go.py                 # GUI
python eval_play/eval_agent_go_cmd.py             # Terminal

Match against CrazyStone (manual proxy):

python eval_play/eval_agent_go_cmd.py --opponent crazystone

🎮 Play in Browser (Notebook)

Launch Jupyter and play interactively in-browser:

jupyter notebook notebooks/play_agent_demo.ipynb

📊 Visualize Progress

python plot_go.py        # Elo & Win-rate vs Checkpoints
python plot_gomoku.py
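
The Elo curves in these plots are computed from pairwise match results between checkpoints. For reference, a minimal sketch of the standard Elo update; the K-factor of 32 is an assumption, and the repo's rating code in `alpha_zero/core/` may use different constants:

```python
def elo_expected(rating_a, rating_b):
    # Expected score of A against B under the logistic Elo model.
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def elo_update(rating_a, rating_b, score_a, k=32):
    # score_a: 1.0 for an A win, 0.5 for a draw, 0.0 for a loss.
    expected_a = elo_expected(rating_a, rating_b)
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return new_a, new_b

# Example: a 1600-rated checkpoint beats a 1500-rated baseline.
print(elo_update(1600, 1500, 1.0))  # -> (1611.5..., 1488.4...)
```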

🧠 Research Notes

  • Training is scaled-down for accessibility (fewer simulations, smaller nets)
  • Designed to be hackable: plug in your own game, change architecture, modify MCTS
  • All games are stored in SGF format for compatibility with viewers like Sabaki (see the sketch below)
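
For the SGF bullet above, here is a minimal, self-contained serializer that turns a finished Go game into an SGF string Sabaki can open. The function name and defaults are illustrative, not the repo's `utils` API:

```python
def moves_to_sgf(moves, board_size=9, komi=7.5):
    """Serialize a Go game to SGF. `moves` is a list of (row, col)
    tuples with Black moving first; None encodes a pass."""
    letters = "abcdefghijklmnopqrs"          # SGF coordinates, up to 19x19
    def coord(move):
        if move is None:
            return ""                        # empty value is a pass in FF[4]
        row, col = move
        return letters[col] + letters[row]
    header = f"(;FF[4]GM[1]SZ[{board_size}]KM[{komi}]"
    body = "".join(f";{'B' if i % 2 == 0 else 'W'}[{coord(m)}]"
                   for i, m in enumerate(moves))
    return header + body + ")"

print(moves_to_sgf([(2, 2), (6, 6), None]))  # (;FF[4]GM[1]SZ[9]KM[7.5];B[cc];W[gg];B[])
```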

📌 To-Do & Roadmap

  • Add full Chess and Shogi support
  • Integrate Prioritized Replay Buffer
  • Distributed Self-Play with Ray or Dask
  • Improved GUI (Streamlit / WebRTC)
  • Support for AlphaZero variants (MuZero, EfficientZero)

📜 License

This project is licensed under the MIT License. See LICENSE for full details.

🙏 Acknowledgments

  • DeepMind for AlphaZero and GTP logic
  • MiniGo, Leela Zero, and Sabaki for inspiration
  • OpenAI Gym and PyTorch for frameworks

🧠 Author

Pranith Chowdary Karumanchi
Research-focused AI Engineer | ML + RL + LLMs
GitHub · LinkedIn

Feel free to ⭐ this repo if you find it useful!
