Add bank-invariant solver experiment with differential features by Copilot · Pull Request #5 · zfifteen/shape-budget

Copilot · 2026-03-25T08:23:21Z

The bank-adaptive solver cleared holdout but failed fresh-bank confirmation because its ridge chooser uses absolute score features (support_score, joint_score, etc.) that shift with the reference bank, breaking frozen routing on new banks.

This experiment replaces absolute features with bank-invariant differentials to test whether removing the bank-dependent baseline from the feature vector is sufficient for confirmation stability.

Design

Differential features only: score_diff, entropy_diff, cv_score_diff, log_alpha_diff, t_diff, rho_diff, h_diff, w1_diff, w2_diff — all joint-minus-support. First-order invariant to additive bank shifts.
Cell one-hot retained for support-type conditioning (6 cells: 2 conditions × 3 skew bins).
Same evaluation ladder as bank-adaptive: disjoint calibration → frozen chooser → holdout → confirmation. One density fallback branch.
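
The feature layout described in the list above can be sketched as follows. This is a minimal illustration, not the code in run.py: the function name `sketch_feature_vector`, the candidate dict layout, and the cell-index convention are assumptions. Only the nine `*_diff` feature names and the 6-cell one-hot come from the description.

```python
from typing import Dict, List

# Base names of the nine differentials listed in the PR description.
# The per-candidate dict keys are assumed for illustration.
DIFF_FIELDS = ["score", "entropy", "cv_score", "log_alpha", "t", "rho", "h", "w1", "w2"]
N_CELLS = 6  # 2 conditions x 3 skew bins


def sketch_feature_vector(
    support: Dict[str, float], joint: Dict[str, float], cell_index: int
) -> List[float]:
    """Return [joint - support differentials] + [cell one-hot]."""
    if not 0 <= cell_index < N_CELLS:
        raise ValueError(f"cell_index must be in [0, {N_CELLS}), got {cell_index}")
    # Every differential uses the joint-minus-support convention.
    diffs = [joint[name] - support[name] for name in DIFF_FIELDS]
    # One-hot over the 6 cells; how (condition, skew bin) maps to an index is assumed.
    one_hot = [1.0 if i == cell_index else 0.0 for i in range(N_CELLS)]
    return diffs + one_hot
```

The resulting vector has 15 entries (9 differentials plus 6 cell indicators), and no absolute score appears anywhere in it.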

Results (baseline, bank_size=300)

Split	Support	Joint	Chooser	Outcome
Calibration	0.1862	0.1596	0.1403	beats both
Holdout	0.1273	0.1180	0.1152	beats both
Confirmation	0.1319	0.1773	0.1674	beats joint, loses to support

The bank-invariant baseline clears holdout (the bank-adaptive baseline could not — it lost to joint at holdout). Confirmation still fails. The differential redesign reduces bank sensitivity but does not eliminate it.
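
The outcome labels in the results table can be reproduced from the three numbers in each row. The sketch below assumes the metric is an error (lower is better), which is inferred from the labels rather than stated in the PR; the helper name `verdict` is hypothetical.

```python
# (support, joint, chooser) per split, copied from the results table.
RESULTS = {
    "Calibration": (0.1862, 0.1596, 0.1403),
    "Holdout": (0.1273, 0.1180, 0.1152),
    "Confirmation": (0.1319, 0.1773, 0.1674),
}


def verdict(support: float, joint: float, chooser: float) -> str:
    """Label the chooser against both fixed strategies, assuming lower is better."""
    beats_support = chooser < support
    beats_joint = chooser < joint
    if beats_support and beats_joint:
        return "beats both"
    if beats_joint:
        return "beats joint, loses to support"
    if beats_support:
        return "beats support, loses to joint"
    return "loses to both"


for split, row in RESULTS.items():
    print(split, verdict(*row))
```

Under that assumption, the chooser must beat both fixed strategies to clear a split, which is why confirmation counts as a failure even though the chooser improves on joint.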

Files

experiments/pose-anisotropy-interventions/bank-invariant-solver/run.py — experiment driver
experiments/pose-anisotropy-interventions/bank-invariant-solver/README.md — writeup with results
experiments/pose-anisotropy-interventions/bank-invariant-solver/outputs/ — cache, models, reports
experiments/README.md — index updated with new experiment link


Copilot

Pull request overview

Adds a new “bank-invariant solver” experiment to the pose-anisotropy interventions suite to test whether routing stability across fresh banks improves when the ridge chooser uses joint-minus-support differential features instead of absolute, bank-shifting score features.

Changes:

Introduces a new experiment driver (run.py) implementing differential-feature ridge routing and the same evaluation ladder (calibration → frozen holdout → confirmation, with one fallback branch).
Adds an experiment write-up (README.md) and links it from the top-level experiments index.
Commits generated artifacts (cache tables, frozen model JSON, and report CSV/JSON summaries) under outputs/.

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 4 comments.


File	Description
experiments/pose-anisotropy-interventions/bank-invariant-solver/run.py	New experiment driver; builds differential feature vectors and fits/evaluates a frozen ridge chooser.
experiments/pose-anisotropy-interventions/bank-invariant-solver/README.md	Documents motivation, feature set, evaluation ladder, and baseline results for the new experiment.
experiments/README.md	Adds index links for the bank-adaptive and bank-invariant solver experiments.
experiments/pose-anisotropy-interventions/bank-invariant-solver/outputs/models/baseline__frozen_ridge_chooser.json	Stored frozen ridge chooser artifact for the baseline variant.
experiments/pose-anisotropy-interventions/bank-invariant-solver/outputs/reports/baseline__calibration_predictions.csv	Calibration-set per-trial predictions from the frozen chooser.
experiments/pose-anisotropy-interventions/bank-invariant-solver/outputs/reports/baseline__confirmation_block_predictions.csv	Confirmation-block per-trial predictions from the frozen chooser.
experiments/pose-anisotropy-interventions/bank-invariant-solver/outputs/reports/baseline__fit_eval_summary.json	Summary metrics for calibration/holdout (and confirmation if present).
experiments/pose-anisotropy-interventions/bank-invariant-solver/outputs/reports/baseline__full_plan_result.json	Full-plan result payload including final interpretation string.
experiments/pose-anisotropy-interventions/bank-invariant-solver/outputs/reports/baseline__holdout_block_1_predictions.csv	Holdout-block per-trial predictions from the frozen chooser.
experiments/pose-anisotropy-interventions/bank-invariant-solver/outputs/reports/baseline__ladder_summary.json	Ladder summary across smoke/calibration/holdout/confirmation.
experiments/pose-anisotropy-interventions/bank-invariant-solver/outputs/reports/baseline__smoke_calibration_block_1_predictions.csv	Smoke-check predictions on calibration block 1.
experiments/pose-anisotropy-interventions/bank-invariant-solver/outputs/reports/baseline__smoke_calibration_block_1_summary.json	Smoke-check summary metrics on calibration block 1.
experiments/pose-anisotropy-interventions/bank-invariant-solver/outputs/cache/baseline__calibration_block_1.csv	Cached trial table for calibration block 1.
experiments/pose-anisotropy-interventions/bank-invariant-solver/outputs/cache/baseline__calibration_block_1.json	Cache metadata + per-cell/per-condition summary for calibration block 1.
experiments/pose-anisotropy-interventions/bank-invariant-solver/outputs/cache/baseline__calibration_block_2.csv	Cached trial table for calibration block 2.
experiments/pose-anisotropy-interventions/bank-invariant-solver/outputs/cache/baseline__calibration_block_2.json	Cache metadata + per-cell/per-condition summary for calibration block 2.
experiments/pose-anisotropy-interventions/bank-invariant-solver/outputs/cache/baseline__holdout_block_1.csv	Cached trial table for holdout block 1.
experiments/pose-anisotropy-interventions/bank-invariant-solver/outputs/cache/baseline__holdout_block_1.json	Cache metadata + per-cell/per-condition summary for holdout block 1.
experiments/pose-anisotropy-interventions/bank-invariant-solver/outputs/cache/baseline__confirmation_block.csv	Cached trial table for confirmation block.
experiments/pose-anisotropy-interventions/bank-invariant-solver/outputs/cache/baseline__confirmation_block.json	Cache metadata + per-cell/per-condition summary for confirmation block.


Copilot · 2026-03-25T08:34:19Z

experiments/pose-anisotropy-interventions/bank-invariant-solver/README.md

+| `cv_score_diff` | `joint_cv − support_cv` | First-order invariant |
+| `abs_log_alpha_diff` | `|log(joint_α) − log(support_α)|` | Magnitude of disagreement |
+| `abs_t_diff` | `|joint_t − support_t|` | Magnitude of disagreement |
+| `score_sign` | `sign(joint_score − support_score)` | Rank-based (fully invariant) |


Copilot · 2026-03-25T08:34:19Z

experiments/pose-anisotropy-interventions/bank-invariant-solver/run.py

+differentials and ratios.  Differentials between support and joint candidates
+are first-order invariant to additive bank-induced shifts, and ratios are
+first-order invariant to multiplicative shifts.  The routing decision only
+needs to know which candidate is better, not the absolute quality level,
+so differential features retain all relevant information.
+
+Key design change vs the bank-adaptive solver:
+
+- Feature vector uses only candidate differentials, not absolute scores.
+  The cell one-hot encoding is retained for support-type conditioning.


The module docstring claims the feature set includes “ratios” and is invariant to multiplicative shifts, but the implemented feature vector only uses differentials/abs-diffs plus a sign feature (no ratio features are constructed). This makes the experiment description misleading—either remove the ratio/multiplicative-shift claims or add the corresponding ratio features and include them in FEATURE_NAMES/build_feature_vector.

Suggested change

-differentials and ratios.  Differentials between support and joint candidates
-are first-order invariant to additive bank-induced shifts, and ratios are
-first-order invariant to multiplicative shifts.  The routing decision only
-needs to know which candidate is better, not the absolute quality level,
-so differential features retain all relevant information.
-
-Key design change vs the bank-adaptive solver:
-
-- Feature vector uses only candidate differentials, not absolute scores.
-  The cell one-hot encoding is retained for support-type conditioning.
+differentials.  Differentials between support and joint candidates are
+first-order invariant to additive bank-induced shifts.  The routing decision
+only needs to know which candidate is better, not the absolute quality level,
+so differential features retain all relevant information.
+
+Key design change vs the bank-adaptive solver:
+
+- Feature vector uses only candidate differentials (plus any derived sign
+  indicators), not absolute scores.  The cell one-hot encoding is retained
+  for support-type conditioning.

Copilot · 2026-03-25T08:34:19Z

experiments/pose-anisotropy-interventions/bank-invariant-solver/run.py

+            cv_diff,
+            abs(log_alpha_diff),
+            abs(t_diff),
+            1.0 if score_diff < 0.0 else -1.0,


The score_sign feature is documented as sign(joint_score − support_score), but the implementation returns +1 when joint_score < support_score and -1 otherwise (i.e., it’s closer to sign(support_score − joint_score) and also maps ties to -1). This mismatch makes the feature semantics hard to reason about and can lead to incorrect interpretation of learned weights; please align the computation with the documented definition (including a tie behavior), or update the docs/feature name to reflect the actual meaning.

Suggested change

-            1.0 if score_diff < 0.0 else -1.0,
+            1.0 if score_diff > 0.0 else (-1.0 if score_diff < 0.0 else 0.0),
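
To make the mismatch concrete, the two expressions can be compared side by side. This is a small sketch: the function names are hypothetical, and `score_diff` is taken to mean `joint_score - support_score` as in the documented definition.

```python
def original_score_sign(score_diff: float) -> float:
    """Expression as committed: +1 when joint < support, otherwise -1 (ties included)."""
    return 1.0 if score_diff < 0.0 else -1.0


def suggested_score_sign(score_diff: float) -> float:
    """Three-way sign of score_diff: +1, -1, or 0.0 on ties (NaN also falls through to 0.0)."""
    if score_diff > 0.0:
        return 1.0
    if score_diff < 0.0:
        return -1.0
    return 0.0


for d in (-0.2, 0.0, 0.2):
    print(d, original_score_sign(d), suggested_score_sign(d))
```

The two versions disagree on every input: they have opposite signs whenever the scores differ, and on ties the original returns -1 where the suggested version returns 0. A chooser refit with the suggested expression would therefore learn a weight of the opposite sign for this feature, so a frozen model artifact trained with the original expression should not be reused unchanged.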

Copilot · 2026-03-25T08:34:20Z

experiments/pose-anisotropy-interventions/bank-invariant-solver/README.md

+features (support minus joint, or joint minus support).  Differentials are
+first-order invariant to additive bank-induced shifts: when the bank changes,
+both candidates' scores shift together, but their difference is preserved.


This section says the differential features can be “support minus joint, or joint minus support”, but the actual implementation (and later table) consistently uses joint − support. Consider tightening the wording to a single convention so readers don’t invert feature signs when comparing against the model artifact/results.

Suggested change

-features (support minus joint, or joint minus support).  Differentials are
-first-order invariant to additive bank-induced shifts: when the bank changes,
-both candidates' scores shift together, but their difference is preserved.
+features, consistently defined as `joint − support` (joint candidate metric
+minus support candidate metric).  These differentials are first-order
+invariant to additive bank-induced shifts: when the bank changes, both
+candidates' scores shift together, but their difference is preserved.
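
The invariance claim, and its "first-order" limit, can be checked with a few lines of arithmetic. The numbers below are made up for illustration and are not taken from the experiment outputs.

```python
def differential(joint_score: float, support_score: float) -> float:
    """Joint-minus-support differential, matching the convention in the README table."""
    return joint_score - support_score


# Illustrative scores only (chosen to be exact in binary floating point).
support_score, joint_score = 0.25, 0.75
baseline = differential(joint_score, support_score)

# A common additive bank shift moves both candidates equally and cancels out.
common_shift = 0.5
shifted = differential(joint_score + common_shift, support_score + common_shift)

# If the bank shifts the two candidates by different amounts, the differential
# moves by the gap between the shifts, so the invariance holds only to first order.
uneven = differential(joint_score + 0.5, support_score + 0.25)

print(baseline, shifted, uneven)  # 0.5 0.5 0.75
```

The uneven case is one possible way a differential feature could still drift on a fresh bank; whether that is what happens at confirmation is not established by this PR.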

Copilot AI and others added 2 commits March 25, 2026 06:39

Add bank-invariant solver experiment with differential features

b82c26a

Co-authored-by: zfifteen <221906715+zfifteen@users.noreply.github.com> Agent-Logs-Url: https://github.com/zfifteen/shape-budget/sessions/36dc4936-e048-421a-a7c5-b32763e92cf7

Complete bank-invariant solver experiment: full pipeline outputs and …

64dff9e

…results Co-authored-by: zfifteen <221906715+zfifteen@users.noreply.github.com> Agent-Logs-Url: https://github.com/zfifteen/shape-budget/sessions/fac03558-583b-4ac6-ab01-23010a56fc36

Copilot AI assigned Copilot and zfifteen Mar 25, 2026

Copilot created this pull request from a session on behalf of zfifteen March 25, 2026 08:23

zfifteen marked this pull request as ready for review March 25, 2026 08:24

Copilot AI review requested due to automatic review settings March 25, 2026 08:24

Copilot started reviewing on behalf of zfifteen March 25, 2026 08:25

Copilot wants to merge 2 commits into main from copilot/work-on-solver-challenges
