Qiskit Gym is a collection of quantum information science problems formulated as reinforcement learning environments.
Train AI agents to discover efficient and near-optimal quantum circuit implementations for various quantum computing tasks.
- 🎯 Purpose-Built RL Environments: Specialized Gymnasium-compatible environments for quantum circuit synthesis problems
- 🔧 Hardware-Aware Environments: Design environments matching real quantum hardware coupling maps
- 🔄 Flexible Gate Sets: Define custom gate sets as part of your environments
- 🧠 State-of-the-Art RL Integration: Built-in support for PPO, AlphaZero, and custom policy configurations
- ⚡ High-Performance Backend: Rust-powered core for lightning-fast quantum circuit operations
- 💾 Model Persistence: Save and load trained models for deployment
- ⚖️ Scalable Training: Efficient parallel environment execution
- 📊 Built-in TensorBoard Integration: For training visualization and performance tracking
- 🔌 Qiskit Integration: Use Qiskit seamlessly to define your problems and to apply your trained models
Learn to implement arbitrary qubit permutations using minimal SWAP gates on constrained coupling maps.
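As a point of reference for this task: on all-to-all connectivity, a cycle of length k needs k − 1 SWAPs, so the optimal SWAP count is n minus the number of cycles in the permutation. A constrained coupling map can only increase this, so it gives a handy lower bound when judging a trained agent. A minimal sketch (this helper is illustrative, not part of qiskit-gym):

```python
def min_swaps_all_to_all(perm):
    """Lower bound on SWAPs for a permutation: n minus its cycle count.

    Exact on all-to-all connectivity; a lower bound on any coupling map.
    """
    n = len(perm)
    seen = [False] * n
    cycles = 0
    for start in range(n):
        if not seen[start]:
            cycles += 1  # count each cycle (fixed points included) once
            j = start
            while not seen[j]:
                seen[j] = True
                j = perm[j]
    return n - cycles

# A single 3-cycle on 3 qubits needs 2 SWAPs; the identity needs none.
print(min_swaps_all_to_all([1, 2, 0]))  # → 2
print(min_swaps_all_to_all([0, 1, 2]))  # → 0
```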
Decompose linear Boolean functions into efficient quantum circuits using CNOT gates.
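The classical baseline an agent competes against here is Gaussian elimination over GF(2): each row addition corresponds to one CNOT, so reducing the matrix to the identity yields a valid (but generally non-optimal) CNOT circuit. A self-contained sketch of that baseline (illustrative code, not the qiskit-gym API):

```python
def synth_linear_gf2(matrix):
    """Synthesize an invertible GF(2) matrix as a list of (control, target) CNOTs.

    Gaussian-elimination baseline: each recorded row addition is one CNOT.
    """
    m = [row[:] for row in matrix]
    n = len(m)
    cnots = []
    for col in range(n):
        # Find a pivot row with a 1 in this column.
        pivot = next(r for r in range(col, n) if m[r][col])
        if pivot != col:
            # Swap two rows using three row additions (three CNOTs).
            for a, b in ((pivot, col), (col, pivot), (pivot, col)):
                m[b] = [x ^ y for x, y in zip(m[a], m[b])]
                cnots.append((a, b))
        # Eliminate every other 1 in this column.
        for r in range(n):
            if r != col and m[r][col]:
                m[r] = [x ^ y for x, y in zip(m[col], m[r])]
                cnots.append((col, r))
    # The elimination steps applied in reverse order implement the matrix.
    return cnots[::-1]
```

An RL agent's job is to find circuits with fewer CNOTs than this elimination-based construction produces.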
Generate optimal implementations of Clifford group elements with customizable gate sets.
Optimize circuits containing Clifford gates and parametric rotations (RX, RY, RZ) for variational algorithms.
There are many interesting quantum computing problems that can be addressed with RL. We hope to centralize the problem formulations here to make it easier for quantum and AI researchers to collaborate in tackling them.
For future releases we plan to include envs for:
- Qubit routing
- Clifford+T synthesis
We are also open to env requests and contributions!
Stay tuned!
Get started with quantum circuit synthesis RL in minutes! Check out our comprehensive tutorial:
👉 examples/intro.ipynb - Interactive Introduction Notebook
This Jupyter notebook walks you through:
- Setting up different synthesis environments
- Training RL agents from scratch
- Evaluating and using trained models
- Visualizing quantum circuits and training progress
git clone https://github.com/AI4quantum/qiskit-gym.git
cd qiskit-gym
pip install -e .

Here's how to set up a permutation synthesis environment and train an RL agent:
from qiskit_gym.envs import PermutationGym
from qiskit_gym.rl import RLSynthesis, PPOConfig, BasicPolicyConfig
from qiskit.transpiler import CouplingMap
# Create a 3x3 grid coupling map environment
cmap = CouplingMap.from_grid(3, 3, bidirectional=False)
env = PermutationGym.from_coupling_map(cmap)
# Set up RL synthesis with PPO
rls = RLSynthesis(env, PPOConfig(), BasicPolicyConfig())
# Train the agent
rls.learn(num_iterations=100, tb_path="runs/my_experiment/")
# Use the trained model to synthesize circuits
import numpy as np
random_permutation = np.random.permutation(9)
optimized_circuit = rls.synth(random_permutation, num_searches=1000)

- Each step returns `reward = (1.0 if solved else 0.0) - penalty`. `penalty` is the weighted increase in cost metrics after the chosen gate: CNOT count, CNOT layers, total layers, and total gates.
- Default weights (`MetricsWeights`) are `n_cnots=0.01`, `n_layers_cnots=0.0`, `n_layers=0.0`, `n_gates=0.0001`; configure per env via `metrics_weights`.
- Metrics accumulate over the episode; once the target is solved, the positive reward is offset by the penalties from any extra cost incurred.
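The reward shaping above can be computed by hand. A small sketch using the stated default weights (the dictionary-based metric representation here is for exposition only, not the library's internal format):

```python
# Default weights as documented for MetricsWeights.
WEIGHTS = {"n_cnots": 0.01, "n_layers_cnots": 0.0, "n_layers": 0.0, "n_gates": 0.0001}

def step_reward(metrics_before, metrics_after, solved):
    """Reward = (1.0 if solved else 0.0) minus the weighted metric increase."""
    penalty = sum(
        WEIGHTS[k] * (metrics_after[k] - metrics_before[k]) for k in WEIGHTS
    )
    return (1.0 if solved else 0.0) - penalty

# A CNOT that opens a new layer and solves the target:
before = {"n_cnots": 0, "n_layers_cnots": 0, "n_layers": 0, "n_gates": 0}
after = {"n_cnots": 1, "n_layers_cnots": 1, "n_layers": 1, "n_gates": 1}
print(step_reward(before, after, solved=True))  # 1.0 - (0.01 + 0.0001) = 0.9899
```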
We welcome contributions! Whether you're adding new synthesis problems, improving RL algorithms, or enhancing documentation, every contribution helps advance quantum computing research.
Licensed under the Apache License, Version 2.0. See LICENSE for details.
- Kremer, D., Villar, V., Paik, H., Duran, I., Faro, I., & Cruz-Benito, J. (2024). Practical and efficient quantum circuit synthesis and transpiling with reinforcement learning. arXiv preprint arXiv:2405.13196.
- Dubal, A., Kremer, D., Martiel, S., Villar, V., Wang, D., & Cruz-Benito, J. (2025). Pauli Network Circuit Synthesis with Reinforcement Learning. arXiv preprint arXiv:2503.14448.