Add stepwise intermediate reward for RL #526
Open: Shaobo-Zhou wants to merge 110 commits into munich-quantum-toolkit:main from Shaobo-Zhou:new_RL
Commits
129b60f
Update predictor(adding callbacks)
08889bd
Update
e2ff3fe
Restore helper.py and predictor.py to match upstream
1c32d15
Merge remote-tracking branch 'upstream/main'
78dc1aa
Implement new mapping actions
a3ba836
Fix: resolve pre-commit issues and add missing annotations
5935e6f
Fix: resolve pre-commit issues and add missing annotations
f71fb29
Fix: resolve pre-commit issues and add missing annotations
3c7592b
Fix: resolve pre-commit issues and add missing annotations
6db5c27
Fix mypy errors
47841c5
Fix mypy errors
b1ac8ce
Fix dependencies issues
5f8473c
Fix dependency issues
7491ec0
Add missing zip file
3346842
Fix issue with Python 3.13
6f7a73c
Merge branch 'main' into hybrid-mapping
Shaobo-Zhou 6c67349
Remove Python 3.13 from noxfile.py due to compatibility issue
2692b96
Skip minimums session on Windows due to CI slowness
f4874e6
Fix bugs
54eec91
Fix bugs
845f7de
Use default Qiskit settings for VF2Layout and add assertion for nativ…
Shaobo-Zhou 3418936
Debug
Shaobo-Zhou ae870cc
Fix missing argument
Shaobo-Zhou 861bc62
Fix warning issues
Shaobo-Zhou fa989b6
Fix window runtime warning problem
Shaobo-Zhou 405bd39
Fix window runtime warning problem
Shaobo-Zhou 7b2f321
Add time limit for VF2PostLayout
Shaobo-Zhou b67d0a6
Fix windows runtime warning problem
Shaobo-Zhou bf7c9ee
Add new actions
Shaobo-Zhou 6d2733f
Add new actions
Shaobo-Zhou 878185a
Add evaluation code for baseline model
Shaobo-Zhou eae11a2
Set up code for testing new model
Shaobo-Zhou 68306ec
Reset
Shaobo-Zhou fcba8fa
Add new actions
Shaobo-Zhou e5b7518
Fix dependencies
Shaobo-Zhou 907ce2b
Fix dependencies
Shaobo-Zhou 70dcd7d
Fix dependencies
Shaobo-Zhou e0364b1
Merge branch 'main' into new_structure
Shaobo-Zhou eb27098
🎨 pre-commit fixes
pre-commit-ci[bot] 2688c0e
Fix dependencies
Shaobo-Zhou 840d23d
Fix multiprocessing on Python 3.13
Shaobo-Zhou ac406ac
Fix timeout watcher
Shaobo-Zhou 97c81bb
Update comments and restructure
Shaobo-Zhou 470365d
Adjust test circuits from ALG to INDEP
Shaobo-Zhou caaf224
Update max synthesis size for bqskit
Shaobo-Zhou 468da2e
Add tests for more coverage
Shaobo-Zhou 737a9f2
Update tests
Shaobo-Zhou 94c25ab
Update override
Shaobo-Zhou 3ff922e
Update comments
Shaobo-Zhou 782caf8
Update noxfile.py and CHANGELOG.md
Shaobo-Zhou 44e0e40
Clean up venv after each session to free up space
Shaobo-Zhou 350bae5
Clean up venv after each session to free up space
Shaobo-Zhou 91208d1
Clean up venv after each session to free up space
Shaobo-Zhou 622d409
Clean up venv after each session to free up space
Shaobo-Zhou fa008a6
Update noxfile
Shaobo-Zhou 0c98e56
Update ibm runtime dependency
Shaobo-Zhou 34a1c7c
Update noxfile
Shaobo-Zhou a8d069a
Update noxfile
Shaobo-Zhou 114b79b
Update noxfile
Shaobo-Zhou aaf14b1
Fetch update from main
Shaobo-Zhou e39cd7e
Merge remote-tracking branch 'upstream/main' into new_structure
Shaobo-Zhou e7b4174
Fix ruff checks
Shaobo-Zhou fcab4fa
Update action space and add normalized gate counts as RL features
Shaobo-Zhou cb4d0fb
Fix FOM comparison logic
Shaobo-Zhou 988a8f7
Remove 3-qubit gates from dict
Shaobo-Zhou 9a40b3e
Add reward shaping
Shaobo-Zhou cee79cd
Add fallback for unsupported reward function
Shaobo-Zhou 756d777
Minor Fixes
Shaobo-Zhou d60d17e
Minor Fixes
Shaobo-Zhou a05e37a
Merge branch 'main' into new_RL
Shaobo-Zhou 85d4d57
Minor Fixes
Shaobo-Zhou a1370de
Fix predictorenv.py
Shaobo-Zhou e456f43
Update changelog and improve test coverage
Shaobo-Zhou de7c498
Update cost model
Shaobo-Zhou 9a68c59
Merge branch 'main' into new_RL
Shaobo-Zhou d0a07a2
Update cost model
Shaobo-Zhou 18cf31f
Fix CI issue
Shaobo-Zhou 0c42b41
Improve coverage
Shaobo-Zhou a451710
Update src/mqt/predictor/reward.py
Shaobo-Zhou ac7dc28
Update src/mqt/predictor/rl/cost_model.py
Shaobo-Zhou 67dc402
Update tests/hellinger_distance/test_estimated_hellinger_distance.py
Shaobo-Zhou 8259b4d
Code improvements suggested by CodeRabbit
Shaobo-Zhou 4b5bf6f
Fix warnings
Shaobo-Zhou 14aa3ac
Improve coverage
Shaobo-Zhou 04fde70
Improve coverage
Shaobo-Zhou 1df8071
Update src/mqt/predictor/rl/predictorenv.py
Shaobo-Zhou 6a104b8
Fixes
Shaobo-Zhou ae43600
Merge branch 'main' into new_RL
Shaobo-Zhou b91f4f4
Fixes
Shaobo-Zhou b2197b3
Fix format
Shaobo-Zhou 331972a
Fixes
Shaobo-Zhou b8a80d2
Fixes
Shaobo-Zhou 7810d4d
Update comments
Shaobo-Zhou 8b3ecaf
Merge branch 'main' into new_RL
burgholzer bc49c63
✏️ curate changelog
burgholzer 5cd1bf7
✏️ minimize unnecessary whitespace changes
burgholzer 5a895d0
⏪ revert Windows workaround
burgholzer 1ecb3e2
🏷️ Various typing fixes
burgholzer dde6f89
⏪ avoid CUDA out of memory error
burgholzer 2e3d1c7
Update cost_model.py to QCEC style
Shaobo-Zhou c8b7201
Merge branch 'main' into new_RL
Shaobo-Zhou da3c95f
🎨 pre-commit fixes
pre-commit-ci[bot] 56573a9
Adjusted implementation of reward approximation
Shaobo-Zhou e0cec3c
Coderabbit suggestions
Shaobo-Zhou a41b323
Resolve problem with estimated Hellinger distance
Shaobo-Zhou 37c55a8
Merge branch 'main' into new_RL
Shaobo-Zhou 734df4e
Add test coverage
Shaobo-Zhou d376268
Apply coderabbit suggestions
Shaobo-Zhou adaefa9
Apply coderabbit suggestions
Shaobo-Zhou 7a0c40b
Apply coderabbit suggestions
Shaobo-Zhou
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,116 @@ | ||
| # Copyright (c) 2023 - 2026 Chair for Design Automation, TUM | ||
| # Copyright (c) 2025 - 2026 Munich Quantum Software Company GmbH | ||
| # All rights reserved. | ||
| # | ||
| # SPDX-License-Identifier: MIT | ||
| # | ||
| # Licensed under the MIT License | ||
|
|
||
| """This module provides helper functions to approximate expected fidelity and estimated success probability (ESP) by transpiling a circuit to a device's basis gate set and combining resulting gate counts with calibration-derived per-gate error rates and durations.""" | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| from typing import TYPE_CHECKING | ||
|
|
||
| import numpy as np | ||
| from qiskit import transpile | ||
|
|
||
| if TYPE_CHECKING: | ||
| from qiskit import QuantumCircuit | ||
| from qiskit.transpiler import Target | ||
|
|
||
| BLACKLIST: set[str] = {"measure", "reset", "delay", "barrier"} # These gates do not directly contribute to the error | ||
|
|
||
|
|
||
| def get_basis_gates_from_target(device: Target) -> list[str]: | ||
| """Return the basis gate names from a Qiskit Target.""" | ||
| return sorted([g for g in device.operation_names if g not in BLACKLIST]) | ||
|
|
||
|
|
||
| def estimate_basis_gate_counts(qc: QuantumCircuit, *, basis_gates: list[str]) -> dict[str, int]: | ||
| """Transpile ``qc`` to ``basis_gates`` and count occurrences of each basis gate.""" | ||
| qc_t = transpile(qc, basis_gates=basis_gates, optimization_level=1, seed_transpiler=42) | ||
| counts = dict.fromkeys(basis_gates, 0) | ||
| for ci in qc_t.data: | ||
| name = ci.operation.name | ||
| if name in BLACKLIST: | ||
| continue | ||
| if name in counts: | ||
| counts[name] += 1 | ||
| return counts | ||
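
The counting step in `estimate_basis_gate_counts` can be illustrated without Qiskit. The following is a minimal sketch, assuming the transpiled circuit has already been reduced to a plain list of operation names; `count_basis_gates` and its arguments are hypothetical names for illustration and are not part of this PR.

```python
def count_basis_gates(op_names, basis_gates):
    """Count how often each basis gate appears in a sequence of operation names.

    Hypothetical helper: `op_names` stands in for the names read from
    `qc_t.data` after transpilation.
    """
    # Start every basis gate at zero so gates absent from the circuit
    # still appear in the result.
    counts = dict.fromkeys(basis_gates, 0)
    for name in op_names:
        # Names outside the basis (for example "measure") are ignored.
        if name in counts:
            counts[name] += 1
    return counts


example = count_basis_gates(["cx", "sx", "measure", "cx", "h"], ["cx", "rz", "sx"])
```

Because `get_basis_gates_from_target` already filters out the `BLACKLIST` entries, the membership check against `counts` alone appears sufficient to skip `measure`, `reset`, `delay`, and `barrier` in this sketch.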
|
||
|
|
||
|
|
||
| def approx_expected_fidelity( | ||
| qc: QuantumCircuit, | ||
| *, | ||
| device: Target, | ||
| error_rates: dict[str, float], | ||
| ) -> float: | ||
| """Approximate expected fidelity using per-basis-gate error rates. | ||
|
|
||
| The circuit is first transpiled to the device basis. Then a simple product | ||
| model is applied: Π_g (1 - p_g)^{count_g}. | ||
|
|
||
| Args: | ||
| qc: Circuit to evaluate. | ||
| device: Target providing the basis gate set. | ||
| error_rates: Mapping ``basis_gate -> error_probability``. | ||
|
|
||
| Returns: | ||
| Approximate fidelity in [0, 1]. | ||
| """ | ||
| basis = get_basis_gates_from_target(device) | ||
| counts = estimate_basis_gate_counts(qc, basis_gates=basis) | ||
|
||
| f = 1.0 | ||
| for g, c in counts.items(): | ||
| f *= (1.0 - error_rates.get(g, 0.0)) ** c | ||
| return float(max(min(f, 1.0), 0.0)) | ||
|
|
||
|
|
||
| def approx_estimated_success_probability( | ||
| qc: QuantumCircuit, | ||
| *, | ||
| device: Target, | ||
| error_rates: dict[str, float], | ||
| gate_durations: dict[str, float], | ||
| tbar: float | None, | ||
| par_feature: float, | ||
| liv_feature: float, | ||
| n_qubits: int, | ||
| ) -> float: | ||
| """Approximate ESP using per-basis-gate error rates, durations, and coherence. | ||
|
|
||
| This combines: | ||
| (1) a gate-infidelity product term, and | ||
| (2) an idle/decoherence penalty based on an effective circuit duration. | ||
|
|
||
| Args: | ||
| qc: Circuit to evaluate. | ||
| device: Target providing the basis gate set. | ||
| error_rates: Mapping ``basis_gate -> error_probability``. | ||
| gate_durations: Mapping ``basis_gate -> duration`` (seconds). | ||
| tbar: Representative coherence time (seconds). If None or non-positive, the idle penalty is skipped. | ||
| par_feature: Parallelism feature in [0, 1]. | ||
| liv_feature: Liveness feature in [0, 1]. | ||
| n_qubits: Number of qubits in the circuit. | ||
|
|
||
| Returns: | ||
| Approximate ESP in [0, 1]. | ||
| """ | ||
| basis = get_basis_gates_from_target(device) | ||
| counts = estimate_basis_gate_counts(qc, basis_gates=basis) | ||
|
|
||
| f_gate = 1.0 | ||
| for g, c in counts.items(): | ||
| f_gate *= (1.0 - error_rates.get(g, 0.0)) ** c | ||
|
|
||
| n_q = max(n_qubits, 1) | ||
| k_eff = 1.0 + (n_q - 1.0) * float(par_feature) | ||
|
|
||
| total_gate_time = sum(counts[g] * gate_durations.get(g, 0.0) for g in basis) / k_eff | ||
|
|
||
| idle_fraction = max(0.0, 1.0 - float(liv_feature)) | ||
| idle_factor = 1.0 if tbar is None or tbar <= 0.0 else float(np.exp(-(total_gate_time * idle_fraction) / tbar)) | ||
|
|
||
| esp = f_gate * idle_factor | ||
| return float(max(min(esp, 1.0), 0.0)) | ||
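
To make the arithmetic of the two approximations easier to follow, here is a minimal sketch that applies the same formulas to precomputed gate counts, so no transpilation is needed. The function names `fidelity_from_counts` and `esp_from_counts` and all numeric values are hypothetical, chosen only for illustration, and error rates are assumed to lie in [0, 1].

```python
import math


def fidelity_from_counts(counts, error_rates):
    """Product model: multiply (1 - p_g) ** count_g over all basis gates."""
    f = 1.0
    for gate, n in counts.items():
        # Gates without a calibration entry are treated as error-free.
        f *= (1.0 - error_rates.get(gate, 0.0)) ** n
    return min(max(f, 0.0), 1.0)


def esp_from_counts(counts, error_rates, gate_durations, tbar,
                    par_feature, liv_feature, n_qubits):
    """Gate-fidelity product times an exponential idle/decoherence penalty."""
    f_gate = fidelity_from_counts(counts, error_rates)
    # Effective parallelism: 1 for a fully serial circuit, n_qubits when
    # par_feature is 1.
    k_eff = 1.0 + (max(n_qubits, 1) - 1.0) * par_feature
    # Serial gate time divided by the effective parallelism.
    total_gate_time = sum(
        n * gate_durations.get(g, 0.0) for g, n in counts.items()
    ) / k_eff
    idle_fraction = max(0.0, 1.0 - liv_feature)
    if tbar is None or tbar <= 0.0:
        idle_factor = 1.0  # no usable coherence time: skip the penalty
    else:
        idle_factor = math.exp(-(total_gate_time * idle_fraction) / tbar)
    return min(max(f_gate * idle_factor, 0.0), 1.0)


# Illustrative values only (not taken from any device calibration).
counts = {"cx": 10, "sx": 20}
error_rates = {"cx": 0.01, "sx": 0.001}
gate_durations = {"cx": 300e-9, "sx": 50e-9}  # seconds
f = fidelity_from_counts(counts, error_rates)
esp = esp_from_counts(counts, error_rates, gate_durations, tbar=100e-6,
                      par_feature=0.5, liv_feature=0.6, n_qubits=3)
```

With these inputs, `k_eff` is 2, the effective duration is 2 µs, and the idle fraction is 0.4, so the penalty exponent is -0.008. Because the idle factor never exceeds 1, the ESP is bounded above by the gate-fidelity product.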