Add metrics-based rewards, symmetry permutations, and random inversions (#8)
* Perms in linear functions
* Perms in cliffords and permutations
* Added missing file
* Add gate metrics to lf
* Metrics in cliffords and permutations
* Test inverts in cliffords
* Metrics in envs
* inv in perms
* Perms optional
* Linfunc inversions in rust, not python
* Inverts default to true for permutations
* Update twisterl versions
* Fix a few issues
* Fix permutations/twist generation performance
* Recover deleted code
* Perms `add_inverts` defaults to true
* Fix inverse solution tracking. Address some comments.
* Remove low value tests
* Delegate model load to twisterl
* Address comments. Solution reset and reward info
- Each step returns `reward = (1.0 if solved else 0.0) - penalty`.
- `penalty` is the weighted increase in cost metrics after the chosen gate: CNOT count, CNOT layers, total layers, and total gates.
- Default weights (`MetricsWeights`) are `n_cnots=0.01`, `n_layers_cnots=0.0`, `n_layers=0.0`, `n_gates=0.0001`; configure per env via `metrics_weights`.
- Metrics accumulate over the episode; once the target is solved, the positive reward is offset by the penalties from any extra cost incurred.
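The reward rule above can be sketched as follows. This is a minimal illustration, not the repo's implementation: the `MetricsWeights` field names are taken from the defaults listed above, but `step_reward` and its metric-dict arguments are assumptions for the sake of the example.

```python
from dataclasses import dataclass

# Metric keys the penalty is computed over (per the docs above).
METRIC_KEYS = ("n_cnots", "n_layers_cnots", "n_layers", "n_gates")

@dataclass
class MetricsWeights:
    # Default weights from the docs above.
    n_cnots: float = 0.01
    n_layers_cnots: float = 0.0
    n_layers: float = 0.0
    n_gates: float = 0.0001

def step_reward(solved: bool, before: dict, after: dict, w: MetricsWeights) -> float:
    """Hypothetical per-step reward: (1.0 if solved else 0.0) minus the
    weighted increase in cost metrics caused by the chosen gate."""
    penalty = sum(getattr(w, k) * (after[k] - before[k]) for k in METRIC_KEYS)
    return (1.0 if solved else 0.0) - penalty
```

With the default weights, adding one CNOT (which is also one extra gate) costs `0.01 + 0.0001`, so an unsolved step yields a small negative reward, while a solving step still nets close to `+1.0` unless the episode accumulated heavy penalties.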
## 🤝 Contributing
We welcome contributions! Whether you're adding new synthesis problems, improving RL algorithms, or enhancing documentation, every contribution helps advance quantum computing research.
Licensed under the Apache License, Version 2.0. See [LICENSE](LICENSE.txt) for details.
- Kremer, D., Villar, V., Paik, H., Duran, I., Faro, I., & Cruz-Benito, J. (2024). Practical and efficient quantum circuit synthesis and transpiling with reinforcement learning. arXiv preprint [arXiv:2405.13196](https://arxiv.org/abs/2405.13196).
- Dubal, A., Kremer, D., Martiel, S., Villar, V., Wang, D., & Cruz-Benito, J. (2025). Pauli Network Circuit Synthesis with Reinforcement Learning. arXiv preprint [arXiv:2503.14448](https://arxiv.org/abs/2503.14448).