Add metrics-based rewards, symmetry permutations, and random inversions #8

Merged: victor-villar merged 20 commits into main from vv-metrics on Jan 8, 2026
Conversation

@victor-villar (Collaborator)

Adds metrics-based reward shaping, symmetry permutations, and optional random inversions to all three gym environments. Updates twisterl to 0.3.0.

  • Metrics tracking: CNOTs, layers, and gates, with configurable penalty weights
  • Symmetries: auto-computed observation/action permutations for data augmentation (a toy illustration follows this list)
  • Inversions: optional random state inversions during training
  • Backward compatible: new parameters are optional and default to true
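To make the symmetry/augmentation idea concrete, here is a self-contained toy sketch; the permutation values and the `permute` helper are illustrative only, not this crate's API:

```rust
// Toy illustration of symmetry-based data augmentation. The permutations and
// the `permute` helper are hypothetical; the real envs auto-compute obs_perms
// and act_perms per environment.
fn permute<T: Clone>(xs: &[T], perm: &[usize]) -> Vec<T> {
    perm.iter().map(|&i| xs[i].clone()).collect()
}

fn main() {
    let obs = vec![1, 0, 0, 1];      // illustrative observation bits
    let obs_perm = vec![2, 3, 0, 1]; // e.g. relabel qubit 0 <-> qubit 1
    let act_perm = vec![1, 0];       // e.g. CX(0,1) <-> CX(1,0)

    let aug_obs = permute(&obs, &obs_perm);
    let aug_action = act_perm[0]; // original action was index 0

    // The permuted (observation, action) pair is an equally valid sample.
    assert_eq!(aug_obs, vec![0, 1, 1, 0]);
    assert_eq!(aug_action, 1);
}
```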

@cbjuan (Member) left a comment

A few comments; the review was made in collaboration with an LLM to fill some of the gaps in my Rust knowledge.

```rust
env.depth = env.max_depth;

env.step(0);
assert!(!env.solved());
```
@cbjuan (Member)

With `add_inverts=true`, each `step()` has a 50% chance of inverting the state. These assertions can randomly fail, right?
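One way to keep such a test deterministic is to disable the randomness explicitly. The sketch below mocks the environment (`ToyEnv` is a stand-in; only the `add_inverts` flag comes from this PR):

```rust
// Self-contained sketch of the determinism point: with inversions disabled,
// the CX-twice test cannot flake. ToyEnv is a stand-in, not the real env.
struct ToyEnv { cx_count: u32, add_inverts: bool }

impl ToyEnv {
    fn step(&mut self, _action: usize) {
        self.cx_count += 1;
        if self.add_inverts {
            // The real envs would flip the state with 50% probability here.
        }
    }
    // Two applications of CX(0,1) cancel out, so even parity means solved.
    fn solved(&self) -> bool { self.cx_count % 2 == 0 }
}

fn main() {
    let mut env = ToyEnv { cx_count: 0, add_inverts: false };
    env.step(0); // apply CX(0, 1)
    assert!(!env.solved());
    env.step(0); // CX is self-inverse
    assert!(env.solved());
}
```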

@victor-villar (Collaborator, Author)

It works because we create an empty circuit, add a CX(0,1), and then another CX(0,1), which basically "solves" the circuit. The inverted circuit is the same. In any case, these were small tests that do not add much value, so I've removed them for now.
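(For reference, CX really is its own inverse on computational basis states, which is what makes the double application "solve" the circuit; a quick standalone check:)

```rust
// CX flips the target bit iff the control bit is 1; applying it twice
// therefore returns every basis state to itself.
fn cx((control, target): (u8, u8)) -> (u8, u8) {
    (control, target ^ control)
}

fn main() {
    for c in 0..2u8 {
        for t in 0..2u8 {
            assert_eq!(cx(cx((c, t))), (c, t));
        }
    }
}
```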

```rust
assert!(!env.solved());

env.step(0);
assert!(env.solved());
```
@cbjuan (Member)

Same as before

```rust
        if self.depth == 0 { -0.5 } else { -0.5 / (self.max_depth as f32) }
    }
}

fn reward(&self) -> f32 { self.reward_value }
```
@cbjuan (Member)
This should be documented, as it's a potential breaking change for anyone already using the library.

@victor-villar (Collaborator, Author)

hmm, you mean the way we compute the rewards?

@cbjuan (Member)

yes

Comment on lines +357 to +359
```rust
fn twists(&self) -> (Vec<Vec<usize>>, Vec<Vec<usize>>) {
    (self.obs_perms.clone(), self.act_perms.clone())
}
```
@cbjuan (Member)

If called frequently, this is wasteful. Consider returning `&[Vec<usize>]` or caching.
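A sketch of the borrowing variant suggested above; whether it is possible depends on the trait that `twists` implements (the surrounding `Env` struct here is a stand-in):

```rust
// Sketch of the suggested allocation-free variant: hand out borrowed slices
// instead of cloning both permutation tables on every call.
struct Env { obs_perms: Vec<Vec<usize>>, act_perms: Vec<Vec<usize>> }

impl Env {
    fn twists(&self) -> (&[Vec<usize>], &[Vec<usize>]) {
        (&self.obs_perms, &self.act_perms)
    }
}
```

Note that if `twists` is part of a trait that requires owned return values, changing the signature is itself a breaking change; caching the clones would be the non-breaking alternative.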

@cbjuan (Member) left a comment

It looks much better than before. A couple of things:

  1. I found an issue with indentation (added in the review comments).
  2. The solution is not cleared on reset: the solution and solution_inv vectors are not cleared in reset(), so if you reset and run again, the old solution data remains. Is this expected? It affects clifford.rs, permutation.rs, and linear_function.rs. (A sketch of a possible fix follows this list.)
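A minimal sketch of the fix being requested, assuming the fields look as described; everything beyond the `solution`/`solution_inv` names is illustrative:

```rust
// Sketch of the fix: clear the accumulated solution when the env is reset.
// Field names (solution, solution_inv) come from the discussion above; the
// surrounding structure of reset() is assumed, not the crate's actual code.
struct Env { solution: Vec<usize>, solution_inv: Vec<usize> }

impl Env {
    fn reset(&mut self) {
        self.solution.clear();
        self.solution_inv.clear();
        // ...then re-randomize the state as the envs already do...
    }
}
```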

Comment on lines +300 to +310
```rust
let mut penalty = 0.0f32;

if let Some(gate) = self.gateset.get(action).cloned() {
    let previous = self.metrics_values.clone();
    self.metrics.apply_gate(&gate);
    let new_metrics = self.metrics.snapshot();
    penalty = new_metrics.weighted_delta(&previous, &self.metrics_weights);
    self.metrics_values = new_metrics;

    self.apply_gate_to_state(&gate);
}
```
@cbjuan (Member)

There is an indentation issue on line 302.

Suggested change:

```rust
let mut penalty = 0.0f32;
if let Some(gate) = self.gateset.get(action).cloned() {
    let previous = self.metrics_values.clone();
    self.metrics.apply_gate(&gate);
    let new_metrics = self.metrics.snapshot();
    penalty = new_metrics.weighted_delta(&previous, &self.metrics_weights);
    self.metrics_values = new_metrics;
    self.apply_gate_to_state(&gate);
}
```
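For context, a plausible self-contained shape for `weighted_delta` as the snippet uses it; the struct layout is assumed, and only the call pattern comes from the diff:

```rust
// Illustrative sketch of a weighted metrics delta, matching how the snippet
// above uses it: penalty = sum over metrics of weight * (new - previous).
// The struct layout is an assumption; the method name comes from the diff.
struct Metrics { cnots: f32, layers: f32, gates: f32 }

impl Metrics {
    fn weighted_delta(&self, previous: &Metrics, w: &Metrics) -> f32 {
        w.cnots * (self.cnots - previous.cnots)
            + w.layers * (self.layers - previous.layers)
            + w.gates * (self.gates - previous.gates)
    }
}

fn main() {
    let prev = Metrics { cnots: 3.0, layers: 2.0, gates: 5.0 };
    let new = Metrics { cnots: 4.0, layers: 2.0, gates: 6.0 };
    let weights = Metrics { cnots: 0.1, layers: 0.05, gates: 0.01 };
    // One extra CNOT and one extra gate: 0.1 * 1 + 0.05 * 0 + 0.01 * 1 = 0.11
    assert!((new.weighted_delta(&prev, &weights) - 0.11).abs() < 1e-6);
}
```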

@victor-villar (Collaborator, Author)

> The solution is not cleared on reset: the solution and solution_inv vectors are not cleared in reset(), so if you reset and run again, the old solution data remains. Is this expected? It affects clifford.rs, permutation.rs, and linear_function.rs.

It does not affect functionality in training, where we use reset, but it could definitely be a bug at inference if someone reuses the env and sets a new state with set_state. I'll fix it in both functions, good catch.

@cbjuan (Member) left a comment

Thank you!!

@victor-villar merged commit 98688a5 into main on Jan 8, 2026.