Add stepwise intermediate reward for RL #526
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open: Shaobo-Zhou wants to merge 110 commits into munich-quantum-toolkit:main from Shaobo-Zhou:new_RL
+502
−89
Changes from 102 commits
129b60f
Update predictor(adding callbacks)
08889bd
Update
e2ff3fe
Restore helper.py and predictor.py to match upstream
1c32d15
Merge remote-tracking branch 'upstream/main'
78dc1aa
Implement new mapping actions
a3ba836
Fix: resolve pre-commit issues and add missing annotations
5935e6f
Fix: resolve pre-commit issues and add missing annotations
f71fb29
Fix: resolve pre-commit issues and add missing annotations
3c7592b
Fix: resolve pre-commit issues and add missing annotations
6db5c27
Fix mypy errors
47841c5
Fix mypy errors
b1ac8ce
Fix dependencies issues
5f8473c
Fix dependency issues
7491ec0
Add missing zip file
3346842
Fix issue with Python 3.13
6f7a73c
Merge branch 'main' into hybrid-mapping
Shaobo-Zhou 6c67349
Remove Python 3.13 from noxfile.py due to compatibility issue
2692b96
Skip minimums session on Windows due to CI slowness
f4874e6
Fix bugs
54eec91
Fix bugs
845f7de
Use default Qiskit settings for VF2Layout and add assertion for nativ…
Shaobo-Zhou 3418936
Debug
Shaobo-Zhou ae870cc
Fix missing argument
Shaobo-Zhou 861bc62
Fix warning issues
Shaobo-Zhou fa989b6
Fix window runtime warning problem
Shaobo-Zhou 405bd39
Fix window runtime warning problem
Shaobo-Zhou 7b2f321
Add time limit for VF2PostLayout
Shaobo-Zhou b67d0a6
Fix windows runtime warning problem
Shaobo-Zhou bf7c9ee
Add new actions
Shaobo-Zhou 6d2733f
Add new actions
Shaobo-Zhou 878185a
Add evaluation code for baseline model
Shaobo-Zhou eae11a2
Set up code for testing new model
Shaobo-Zhou 68306ec
Reset
Shaobo-Zhou fcba8fa
Add new actions
Shaobo-Zhou e5b7518
Fix dependencies
Shaobo-Zhou 907ce2b
Fix dependencies
Shaobo-Zhou 70dcd7d
Fix dependencies
Shaobo-Zhou e0364b1
Merge branch 'main' into new_structure
Shaobo-Zhou eb27098
🎨 pre-commit fixes
pre-commit-ci[bot] 2688c0e
Fix dependencies
Shaobo-Zhou 840d23d
Fix multiprocessing on Python 3.13
Shaobo-Zhou ac406ac
Fix timeout watcher
Shaobo-Zhou 97c81bb
Update comments and restructure
Shaobo-Zhou 470365d
Adjust test circuits from ALG to INDEP
Shaobo-Zhou caaf224
Update max synthesis size for bqskit
Shaobo-Zhou 468da2e
Add tests for more coverage
Shaobo-Zhou 737a9f2
Update tests
Shaobo-Zhou 94c25ab
Update override
Shaobo-Zhou 3ff922e
Update comments
Shaobo-Zhou 782caf8
Update noxfile.py and CHANGELOG.md
Shaobo-Zhou 44e0e40
Clean up venv after each session to free up space
Shaobo-Zhou 350bae5
Clean up venv after each session to free up space
Shaobo-Zhou 91208d1
Clean up venv after each session to free up space
Shaobo-Zhou 622d409
Clean up venv after each session to free up space
Shaobo-Zhou fa008a6
Update noxfile
Shaobo-Zhou 0c98e56
Update ibm runtime dependency
Shaobo-Zhou 34a1c7c
Update noxfile
Shaobo-Zhou a8d069a
Update noxfile
Shaobo-Zhou 114b79b
Update noxfile
Shaobo-Zhou aaf14b1
Fetch update from main
Shaobo-Zhou e39cd7e
Merge remote-tracking branch 'upstream/main' into new_structure
Shaobo-Zhou e7b4174
Fix ruff checks
Shaobo-Zhou fcab4fa
Update action space and add normalized gate counts as RL features
Shaobo-Zhou cb4d0fb
Fix FOM comparison logic
Shaobo-Zhou 988a8f7
Remove 3-qubit gates from dict
Shaobo-Zhou 9a40b3e
Add reward shaping
Shaobo-Zhou cee79cd
Add fallback for unsupported reward function
Shaobo-Zhou 756d777
Minor Fixes
Shaobo-Zhou d60d17e
Minor Fixes
Shaobo-Zhou a05e37a
Merge branch 'main' into new_RL
Shaobo-Zhou 85d4d57
Minor Fixes
Shaobo-Zhou a1370de
Fix predictorenv.py
Shaobo-Zhou e456f43
Update changelog and improve test coverage
Shaobo-Zhou de7c498
Update cost model
Shaobo-Zhou 9a68c59
Merge branch 'main' into new_RL
Shaobo-Zhou d0a07a2
Update cost model
Shaobo-Zhou 18cf31f
Fix CI issue
Shaobo-Zhou 0c42b41
Improve coverage
Shaobo-Zhou a451710
Update src/mqt/predictor/reward.py
Shaobo-Zhou ac7dc28
Update src/mqt/predictor/rl/cost_model.py
Shaobo-Zhou 67dc402
Update tests/hellinger_distance/test_estimated_hellinger_distance.py
Shaobo-Zhou 8259b4d
Code improvements suggested by CodeRabbit
Shaobo-Zhou 4b5bf6f
Fix warnings
Shaobo-Zhou 14aa3ac
Improve coverage
Shaobo-Zhou 04fde70
Improve coverage
Shaobo-Zhou 1df8071
Update src/mqt/predictor/rl/predictorenv.py
Shaobo-Zhou 6a104b8
Fixes
Shaobo-Zhou ae43600
Merge branch 'main' into new_RL
Shaobo-Zhou b91f4f4
Fixes
Shaobo-Zhou b2197b3
Fix format
Shaobo-Zhou 331972a
Fixes
Shaobo-Zhou b8a80d2
Fixes
Shaobo-Zhou 7810d4d
Update comments
Shaobo-Zhou 8b3ecaf
Merge branch 'main' into new_RL
burgholzer bc49c63
✏️ curate changelog
burgholzer 5cd1bf7
✏️ minimize unnecessary whitespace changes
burgholzer 5a895d0
⏪ revert Windows workaround
burgholzer 1ecb3e2
🏷️ Various typing fixes
burgholzer dde6f89
⏪ avoid CUDA out of memory error
burgholzer 2e3d1c7
Update cost_model.py to QCEC style
Shaobo-Zhou c8b7201
Merge branch 'main' into new_RL
Shaobo-Zhou da3c95f
🎨 pre-commit fixes
pre-commit-ci[bot] 56573a9
Adjusted implementation of reward approximation
Shaobo-Zhou e0cec3c
Coderabbit suggestions
Shaobo-Zhou a41b323
Resolve problem with estimated Hellinger distance
Shaobo-Zhou 37c55a8
Merge branch 'main' into new_RL
Shaobo-Zhou 734df4e
Add test coverage
Shaobo-Zhou d376268
Apply coderabbit suggestions
Shaobo-Zhou adaefa9
Apply coderabbit suggestions
Shaobo-Zhou 7a0c40b
Apply coderabbit suggestions
Shaobo-Zhou
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,280 @@ | ||
| # Copyright (c) 2023 - 2026 Chair for Design Automation, TUM | ||
| # Copyright (c) 2025 - 2026 Munich Quantum Software Company GmbH | ||
| # All rights reserved. | ||
| # | ||
| # SPDX-License-Identifier: MIT | ||
| # | ||
| # Licensed under the MIT License | ||
|
|
||
| """Helper functions for approximating transformations to device-native gates. | ||
|
|
||
| This module provides a dynamic canonical gate cost model and approximate | ||
| fidelity/ESP estimates based on averaged 1q/2q error rates. | ||
|
|
||
| For each backend, a cost table of gate decompositions into the native gate set | ||
| is generated programmatically and cached, which avoids hard-coding costs. | ||
| If a backend is unknown, a default basis (id, rz, sx, x, cx) is used as a | ||
| fallback and a warning is emitted; users can add the device to KNOWN_DEVICE_BASES. | ||
| """ | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| import logging | ||
| import warnings | ||
| from typing import cast | ||
|
|
||
| import numpy as np | ||
|
|
||
| # Qiskit is required for the transpilation-based cost estimates below | ||
| from qiskit import QuantumCircuit, transpile | ||
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
| CanonicalCostTable = dict[str, tuple[int, int]] | ||
|
|
||
| # Cache for generated cost tables | ||
| DEVICE_COST_CACHE: dict[str, dict[str, tuple[int, int]]] = {} | ||
|
|
||
| # Pre-defined native gate sets for known devices (can be extended) | ||
| KNOWN_DEVICE_BASES: dict[str, list[str]] = { | ||
| "ibm_torino": [ | ||
| "id", | ||
| "rz", | ||
| "rx", | ||
| "sx", | ||
| "x", | ||
| "cz", | ||
| "rzz", | ||
| ], # IBM device example (native 1q: id/rz/rx/sx/x, 2q: cz, rzz) | ||
| "ankaa_3": ["id", "rz", "rx", "iswap"], # Rigetti Ankaa-3 (native 1q: rx, rz; native 2q: iSWAP) | ||
| "emerald": [ | ||
| "id", | ||
| "rz", | ||
| "rx", | ||
| "cz", | ||
| "u", | ||
| ], # IQM Emerald (native 1q: arbitrary single-qubit rotation 'u'; native 2q: cz) | ||
| # Additional devices can be added here... | ||
| } | ||
|
|
||
| # Heuristic set of known two-qubit basis gate names used to map averages to per-basis values. | ||
| # This is intentionally conservative; device-specific basis sets should be used when available. | ||
| TWO_Q_GATES: set[str] = { | ||
| "cx", | ||
| "cz", | ||
| "iswap", | ||
| "rzz", | ||
| "rxx", | ||
| "ryy", | ||
| "rzx", | ||
| "dcx", | ||
| "ecr", | ||
| "swap", | ||
| } | ||
|
|
||
|
|
||
| def build_error_rates_from_averages(device_id: str, p1_avg: float, p2_avg: float) -> dict[str, float]: | ||
| """Construct a per-basis error rate mapping from averaged 1q/2q values. | ||
|
|
||
| This uses a simple heuristic to decide whether a basis gate is 1q or 2q. | ||
| """ | ||
| basis_gates = KNOWN_DEVICE_BASES.get(device_id, ["id", "rz", "sx", "x", "cx"]) | ||
| error_rates: dict[str, float] = {} | ||
| for g in basis_gates: | ||
| error_rates[g] = p2_avg if g in TWO_Q_GATES else p1_avg | ||
| return error_rates | ||
|
|
||
|
|
||
| def build_gate_durations_from_averages(device_id: str, tau1_avg: float, tau2_avg: float) -> dict[str, float]: | ||
| """Construct a per-basis gate duration mapping from averaged 1q/2q durations.""" | ||
| basis_gates = KNOWN_DEVICE_BASES.get(device_id, ["id", "rz", "sx", "x", "cx"]) | ||
| durations: dict[str, float] = {} | ||
| for g in basis_gates: | ||
| durations[g] = tau2_avg if g in TWO_Q_GATES else tau1_avg | ||
| return durations | ||
|
|
||
|
|
||
| def generate_cost_table(device_id: str) -> dict[str, tuple[int, int]]: | ||
| """Generate a canonical gate cost table for the given device_id. | ||
|
|
||
| This function programmatically derives the (n_1q, n_2q) costs for common gates | ||
| by decomposing them into the device's native gate set via Qiskit transpilation. | ||
| If the device_id is not recognized in KNOWN_DEVICE_BASES, a generic basis | ||
| is assumed (using IBM's basis as a fallback) and a warning is emitted. | ||
| """ | ||
| if transpile is None or QuantumCircuit is None: | ||
| msg = "Qiskit is required to generate cost tables dynamically." | ||
| raise ImportError(msg) | ||
|
|
||
| # Determine the basis gates for this device | ||
| basis_gates = KNOWN_DEVICE_BASES.get(device_id) | ||
| if basis_gates is None: | ||
| warnings.warn( | ||
| f"No native gate-set defined for device '{device_id}'. " | ||
| "Generating cost table using a minimal universal basis (Qiskit default). " | ||
| "Results may be inaccurate. Consider specifying the gate set in KNOWN_DEVICE_BASES.", | ||
| UserWarning, | ||
| stacklevel=2, | ||
| ) | ||
| logger.warning("No basis for device '%s', using minimal universal basis for cost generation.", device_id) | ||
| # Default to minimal universal basis (Qiskit default) | ||
| basis_gates = ["id", "rz", "sx", "x", "cx"] | ||
|
|
||
| cost_table: dict[str, tuple[int, int]] = {} | ||
|
|
||
| # Structured gate definitions for dynamic profiling | ||
| gate_profiles = [ | ||
| # Single-qubit gates (no params) | ||
| { | ||
| "gates": ["id", "x", "y", "z", "h", "s", "sdg", "t", "tdg", "sx", "sxdg", "u0"], | ||
| "qubits": 1, | ||
| "params": 0, | ||
| "controls": 0, | ||
| }, | ||
| # Single-qubit gates (1 param) | ||
| {"gates": ["p", "rx", "ry", "rz", "u1", "r"], "qubits": 1, "params": 1, "controls": 0}, | ||
| # Single-qubit gates (2 params) | ||
| {"gates": ["u2"], "qubits": 1, "params": 2, "controls": 0}, | ||
| # Single-qubit gates (3 params) | ||
| {"gates": ["u", "u3"], "qubits": 1, "params": 3, "controls": 0}, | ||
| # Two-qubit gates (no params) | ||
| { | ||
| "gates": ["cx", "cy", "cz", "ch", "csx", "swap", "iswap", "dcx", "ecr"], | ||
| "qubits": 2, | ||
| "params": 0, | ||
| "controls": 0, | ||
| }, | ||
| # Two-qubit gates (1 param) | ||
| {"gates": ["rxx", "ryy", "rzz", "rzx", "cu1", "cp"], "qubits": 2, "params": 1, "controls": 0}, | ||
| # Controlled single-qubit gates (1 param) | ||
| {"gates": ["crx", "cry", "crz"], "qubits": 1, "params": 1, "controls": 1}, | ||
| # Controlled U3 (3 params) | ||
| {"gates": ["cu3", "cu"], "qubits": 1, "params": 3, "controls": 1}, | ||
| # Multi-qubit gates (no params) | ||
| {"gates": ["ccx", "cswap", "rccx", "rc3x", "c3x", "c3sqrtx", "c4x"], "qubits": 1, "params": 0, "controls": 2}, | ||
| ] | ||
|
|
||
| def add_gate_to_cost_table(gate: str, qubits: int, params: int, controls: int) -> None: | ||
| total_qubits = qubits + controls | ||
| qc = QuantumCircuit(total_qubits) | ||
| try: | ||
| gate_name = gate  # names in gate_profiles already carry their control prefix | ||
| if params == 0: | ||
| getattr(qc, gate_name)(*list(range(total_qubits))) | ||
| else: | ||
| param_values = list(range(1, params + 1)) | ||
| getattr(qc, gate_name)(*param_values, *list(range(total_qubits))) | ||
| except Exception: | ||
| return # skip if not available | ||
| # For reference: store the transpiled circuit size (total basis gate count) | ||
| qc_trans = transpile(qc, basis_gates=basis_gates, optimization_level=1, seed_transpiler=42) | ||
| cost_table[gate] = (qc_trans.size(), 0) | ||
|
|
||
| for profile in gate_profiles: | ||
| gates = cast("list[str]", profile["gates"]) | ||
| qubits = int(cast("int", profile["qubits"])) | ||
| params = int(cast("int", profile.get("params", 0))) | ||
| controls = int(cast("int", profile.get("controls", 0))) | ||
| for gate in gates: | ||
| add_gate_to_cost_table(gate, qubits, params, controls) | ||
|
|
||
| # Ensure 'id' is treated as no-op (if not already in table due to optimization removal) | ||
| cost_table["id"] = (0, 0) | ||
|
|
||
| return cost_table | ||
|
|
||
|
|
||
| def get_cost_table(device_id: str) -> CanonicalCostTable: | ||
| """Return the canonical cost table for `device_id`, generating it if necessary. | ||
|
|
||
| If the device is unknown (not predefined), the cost table is generated using a | ||
| default basis and a warning is emitted to indicate a potential inaccuracy. | ||
| The result is cached to avoid repeated computation. | ||
| """ | ||
| if device_id not in DEVICE_COST_CACHE: | ||
| # Generate and cache the cost table for this device | ||
| DEVICE_COST_CACHE[device_id] = generate_cost_table(device_id) | ||
| return DEVICE_COST_CACHE[device_id] | ||
|
|
||
|
|
||
| # --- Helper: Estimate basis gate counts in a transpiled circuit --- | ||
| def estimate_basis_gate_counts(qc: QuantumCircuit, *, basis_gates: list[str]) -> dict[str, int]: | ||
| """Estimate the count of each basis gate in the transpiled circuit.""" | ||
| qc_trans = transpile(qc, basis_gates=basis_gates, optimization_level=1, seed_transpiler=42) | ||
| gate_counts = dict.fromkeys(basis_gates, 0) | ||
| for inst in qc_trans.data: | ||
| name = inst.operation.name | ||
| if name in gate_counts: | ||
| gate_counts[name] += 1 | ||
| return gate_counts | ||
|
|
||
|
|
||
| def approx_expected_fidelity( | ||
| qc: QuantumCircuit, | ||
| error_rates: dict[str, float], | ||
| *, | ||
| device_id: str = "ibm_torino", | ||
| ) -> float: | ||
| """Estimate expected fidelity using per-basis-gate error rates. | ||
|
|
||
| Args: | ||
| qc: QuantumCircuit to analyze | ||
| error_rates: dict mapping basis gate name to error rate (e.g., {"cx": 0.01, "rz": 0.001, ...}) | ||
| device_id: device identifier for basis gates | ||
| Returns: | ||
| Estimated total fidelity as a float in [0, 1] | ||
| """ | ||
| basis_gates = KNOWN_DEVICE_BASES.get(device_id, ["id", "rz", "sx", "x", "cx"]) | ||
| gate_counts = estimate_basis_gate_counts(qc, basis_gates=basis_gates) | ||
| fidelity = 1.0 | ||
| for gate, count in gate_counts.items(): | ||
| p = error_rates.get(gate, 0.0) | ||
| fidelity *= (1.0 - p) ** count | ||
| return float(max(min(fidelity, 1.0), 0.0)) | ||
|
|
||
|
|
||
| def approx_estimated_success_probability( | ||
| qc: QuantumCircuit, | ||
| error_rates: dict[str, float], | ||
| gate_durations: dict[str, float], | ||
| tbar: float | None, | ||
| par_feature: float, | ||
| liv_feature: float, | ||
| n_qubits: int, | ||
| *, | ||
| device_id: str = "ibm_torino", | ||
| ) -> float: | ||
| """Estimate the Estimated Success Probability (ESP) using per-basis-gate error rates and durations. | ||
|
|
||
| Args: | ||
| qc: QuantumCircuit to analyze | ||
| error_rates: dict mapping basis gate name to error rate | ||
| gate_durations: dict mapping basis gate name to average duration (in same units as tbar) | ||
| tbar: average T1/T2 time (decoherence time) | ||
| par_feature: parallelism feature (0=serial, 1=fully parallel) | ||
| liv_feature: liveness feature (fraction of time qubits are active) | ||
| n_qubits: number of qubits in the circuit | ||
| device_id: device identifier for basis gates | ||
| Returns: | ||
| Estimated ESP as a float in [0, 1] | ||
| """ | ||
| basis_gates = KNOWN_DEVICE_BASES.get(device_id, ["id", "rz", "sx", "x", "cx"]) | ||
| gate_counts = estimate_basis_gate_counts(qc, basis_gates=basis_gates) | ||
| # Fidelity from gate operations | ||
| f_gate = 1.0 | ||
| for gate, count in gate_counts.items(): | ||
| p = error_rates.get(gate, 0.0) | ||
| f_gate *= (1.0 - p) ** count | ||
|
|
||
| # Estimate effective circuit duration based on parallelism | ||
| n_q = max(n_qubits, 1) | ||
| k_eff = 1.0 + (n_q - 1.0) * float(par_feature) | ||
| # Total gate time: sum over all basis gates | ||
| total_gate_time = sum(gate_counts[g] * gate_durations.get(g, 0.0) for g in basis_gates) / k_eff | ||
|
|
||
| # Idle time penalty factor based on liveness | ||
| idle_fraction = max(0.0, 1.0 - float(liv_feature)) | ||
| idle_factor = 1.0 if tbar is None or tbar <= 0.0 else float(np.exp(-(total_gate_time * idle_fraction) / tbar)) | ||
|
|
||
| esp = f_gate * idle_factor | ||
| return float(max(min(esp, 1.0), 0.0)) | ||
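
The arithmetic in `approx_expected_fidelity` and `approx_estimated_success_probability` can be checked by hand with a minimal sketch like the one below. It mirrors the same formulas but takes a precomputed gate-count dictionary, so it runs without Qiskit. The names `fidelity_from_counts` and `esp_from_counts` and all example numbers are hypothetical and not part of this PR.

```python
import math


def fidelity_from_counts(gate_counts, error_rates):
    """Product of (1 - p_g) ** n_g over all basis gates, clamped to [0, 1]."""
    fidelity = 1.0
    for gate, count in gate_counts.items():
        p = error_rates.get(gate, 0.0)  # gates without an error rate count as perfect
        fidelity *= (1.0 - p) ** count
    return min(max(fidelity, 0.0), 1.0)


def esp_from_counts(gate_counts, error_rates, gate_durations, tbar,
                    par_feature, liv_feature, n_qubits):
    """Gate fidelity times an exponential idle-time penalty, clamped to [0, 1]."""
    f_gate = fidelity_from_counts(gate_counts, error_rates)

    # Effective parallelism: 1 for a fully serial circuit, n_qubits for a fully parallel one.
    n_q = max(n_qubits, 1)
    k_eff = 1.0 + (n_q - 1.0) * float(par_feature)

    # Serial gate time divided by the effective parallelism.
    total_gate_time = sum(
        count * gate_durations.get(gate, 0.0) for gate, count in gate_counts.items()
    ) / k_eff

    # Only the idle share of that time is penalized by decoherence.
    idle_fraction = max(0.0, 1.0 - float(liv_feature))
    if tbar is None or tbar <= 0.0:
        idle_factor = 1.0
    else:
        idle_factor = math.exp(-(total_gate_time * idle_fraction) / tbar)

    return min(max(f_gate * idle_factor, 0.0), 1.0)


# Hypothetical example values (durations and tbar share the same time unit).
counts = {"cz": 10, "rz": 20}
errors = {"cz": 0.01, "rz": 0.001}
durations = {"cz": 68.0, "rz": 32.0}

f = fidelity_from_counts(counts, errors)
esp_no_decoherence = esp_from_counts(counts, errors, durations, None, 0.5, 0.5, 3)
esp = esp_from_counts(counts, errors, durations, 100_000.0, 0.5, 0.5, 3)
```

With these inputs the serial gate time is 10 * 68 + 20 * 32 = 1320, `k_eff` is 2, and the idle share is 0.5, so the penalty is `exp(-330 / 100000)`. Passing `tbar=None`, or a liveness of 1.0, reduces the ESP to the plain gate fidelity.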