
Commit 70a2a58

hua7450 and claude committed
Fix review-fix loop skipping and narrow reference search
Phase 6: Replace conditional while-loop with 3 explicitly named rounds (6.1, 6.2, 6.3). The orchestrator can no longer skip re-reviews after fixes — each round is a mandatory named step, not a loop iteration.

Consolidator: Search for reference implementations by concept keywords (e.g., 'child', 'care') instead of program acronym. This prevents missing better references like TX CCS and DC CCSP when implementing a CCAP program.

Parameter-architect: Remove hardcoded DC/IL/TX TANF paths. The agent now reads reference implementations from the consolidator's impl-spec.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 60c2afc commit 70a2a58
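
The consolidator change described in the commit message can be illustrated with a short Python sketch. It is only an approximation of what an agent running the two `Glob` patterns would see: the function name `find_reference_implementations` and the example directory names in the usage note are hypothetical, and the real agent issues the globs through its own tooling rather than through `pathlib`.

```python
from pathlib import Path


def find_reference_implementations(repo_root, keywords):
    """Group matching program directories by (state, agency, program).

    Hypothetical helper: it mirrors the glob shape from the commit,
    policyengine_us/variables/gov/states/*/*/*{keyword}*/*.py,
    once per concept keyword, and records which keywords matched.
    """
    base = Path(repo_root) / "policyengine_us" / "variables" / "gov" / "states"
    matches = {}
    for keyword in keywords:
        # One glob per keyword, as the updated STEP 1 instructs.
        for path in base.glob(f"*/*/*{keyword}*/*.py"):
            state, agency, program = path.relative_to(base).parts[:3]
            matches.setdefault((state, agency, program), set()).add(keyword)
    return matches
```

Calling `find_reference_implementations(".", ["child", "care"])` from a checkout would return every program directory whose name contains either keyword, regardless of the acronym a state uses. Directories that match more keywords are plausible candidates for the "best" reference, although the commit leaves that choice to the agent's judgment about structural similarity.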

3 files changed: 127 additions and 47 deletions


agents/country-models/parameter-architect.md

Lines changed: 3 additions & 4 deletions
@@ -46,9 +46,8 @@ Specifically:
 - Section 2.3: "Description Validation Checklist" - Run this on every description

 **ALWAYS study existing implementations FIRST:**
-- DC TANF: `/policyengine_us/parameters/gov/states/dc/dhs/tanf/`
-- IL TANF: `/policyengine_us/parameters/gov/states/il/dhs/tanf/`
-- TX TANF: `/policyengine_us/parameters/gov/states/tx/hhs/tanf/`
+The impl-spec lists reference implementations discovered by the consolidator. Read 3+
+parameter files from the best-matching reference implementation listed there.

 Learn from them:
 1. Folder structure and organization patterns
@@ -58,7 +57,7 @@ Learn from them:
 5. How they organize income/, eligibility/, resources/ folders

 **MANDATORY: Before writing ANY parameter:**
-- Open and READ 3+ similar parameter files from TX/IL/DC
+- Open and READ 3+ similar parameter files from the reference implementation
 - COPY their exact description pattern from the skill templates
 - Replace ONLY state name (keep everything else identical)
 - **ALWAYS spell out full program names** (e.g., "Temporary Assistance for Needy Families program", not "TANF")
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Replace review-fix loop with 3 mandatory named rounds; broaden consolidator reference search by concept keywords instead of acronyms; remove hardcoded TANF references from parameter-architect

commands/encode-policy-v2.md

Lines changed: 123 additions & 43 deletions
@@ -122,10 +122,20 @@ name: "consolidator"

 "Read sources/working_references.md and produce structured implementation specs for {STATE} {PROGRAM}.

-STEP 1: Study reference implementation.
-Find a similar program in the codebase (e.g., CO CCAP for RI CCAP, DC TANF for OR TANF).
-Search with: Glob 'policyengine_us/variables/gov/states/*/[agency]/{prog}/*.py'
-Read 3-5 variable files and 3-5 parameter files from the reference implementation.
+STEP 1: Study reference implementations (search BROADLY).
+States use different names for the same program type — do NOT search only by the target
+program's acronym. Instead, derive 2-3 concept keywords from what the program does
+(e.g., a child care subsidy program → search 'child', 'care', 'provider').
+
+Search with MULTIPLE globs using concept keywords:
+Glob 'policyengine_us/variables/gov/states/*/*/*{keyword1}*/*.py'
+Glob 'policyengine_us/variables/gov/states/*/*/*{keyword2}*/*.py'
+
+Identify ALL matching state implementations. Read 3-5 variable files and 3-5 parameter
+files from the BEST reference implementation — pick the one with the most similar structure
+to the target program (e.g., if target has multi-dimensional rates, pick a reference that
+has enum-keyed rate lookups, not a simpler eligibility-only impl).
+List ALL discovered implementations in the impl-spec so downstream agents can study them.

 STEP 2: Discover existing reusable variables.
 For each key concept in the program (income, hours, age, household size, childcare, etc.),
@@ -685,65 +695,118 @@ Read ONLY `/tmp/{PREFIX}-final-report.md`.

 ---

-## Phase 6: Review-Fix Loop
+## Phase 6: Review-Fix (3 Mandatory Rounds)

 **Skip if `--skip-review`.**

-This phase runs `/review-program` and fixes critical issues in a loop until zero critical issues remain (or max iterations reached).
+This phase runs 3 independent review rounds. Each review is done by a fresh `/review-program` invocation. After any fix, the next review is a **mandatory step** — the orchestrator has NO discretion to skip it. Only an actual review confirming critical == 0 can end the phase early.
+
+---
+
+### Round 1: Initial Review
+
+#### Step 6.1A: Run /review-program
+
+```
+Skill: review-program
+Arguments: $PR_NUMBER --local --full [--600dpi if DPI == 600]
+```
+
+#### Step 6.1B: Check Results
+
+Read `/tmp/review-program-summary.md` (max 20 lines).
+
+**If critical == 0**: Report to user. **Phase 6 complete — skip remaining rounds.**
+
+**If critical > 0**: Proceed to Step 6.1C.
+
+#### Step 6.1C: Fix Critical Issues
+
+```
+subagent_type: "complete:country-models:rules-engineer"
+team_name: "{PREFIX}-encode"
+name: "review-fixer-1"
+
+"Fix the critical issues from the /review-program review (round 1).
+Read the full review report at /tmp/review-program-full-report.md.
+Focus ONLY on items marked CRITICAL — do not change anything else.
+Load skills: /policyengine-variable-patterns, /policyengine-code-style,
+/policyengine-parameter-patterns, /policyengine-period-patterns, /policyengine-vectorization.
+Apply fixes. Run make format.
+
+REUSE EXISTING VARIABLES: Before creating any non-program-specific variable, Grep the
+codebase first. PolicyEngine-US likely already has it (fpg, smi, tanf_fpg, ssi, etc.).
+
+LEARN FROM PAST SESSIONS (read if they exist — skip if not found):
+- {LESSONS_PATH}
+- lessons/agent-lessons.md
+
+AFTER fixing, write your fixes to /tmp/{PREFIX}-checklist.md:
+Format each line as:
+- [ROUND 1] [{CATEGORY}] {file}:{line} — {what was wrong} → {what you changed}
+
+Categories: HARD-CODED, WRONG-PERIOD, MISSING-REF, BAD-REF, DEDUCTION-ORDER,
+UNUSED-PARAM, WRONG-ENTITY, NAMING, FORMULA-LOGIC, TEST-GAP, OTHER"
+```

-### Loop Structure
+#### Step 6.1D: Run Tests & Commit
+
+```
+subagent_type: "complete:country-models:ci-fixer"
+team_name: "{PREFIX}-encode"
+name: "ci-fixer-1"

+"Run tests for {STATE} {PROGRAM} after review-fix round 1.
+Fix any test failures introduced by the fixes. Run make format."
 ```
-ROUND = 1
-MAX_ROUNDS = 3

-while ROUND <= MAX_ROUNDS:
-1. Run /review-program $PR_NUMBER --local --full
-2. Read /tmp/review-program-summary.md → count critical issues
-3. If critical == 0 → EXIT LOOP (success)
-4. If ROUND == MAX_ROUNDS → EXIT LOOP (escalate to user)
-5. If ROUND == 2 → ask user before attempting round 3
-6. Fix critical issues
-7. Run make format + tests
-8. Commit + push fixes
-9. ROUND += 1
+```bash
+git add policyengine_us/parameters/gov/states/{ST}/ policyengine_us/variables/gov/states/{ST}/ policyengine_us/tests/policy/baseline/gov/states/{ST}/
+git commit -m "Review-fix round 1: address critical issues from /review-program"
+git push
 ```

-### Step 6A: Run /review-program (Round N)
+**Proceed to Round 2. This is mandatory — do NOT skip.**

-Invoke the `review-program` skill in local-only mode with `--full`:
+---
+
+### Round 2: Verification Review
+
+#### Step 6.2A: Run /review-program

 ```
 Skill: review-program
 Arguments: $PR_NUMBER --local --full [--600dpi if DPI == 600]
 ```

-If the user passed `--600dpi`, include it here so PDF audit uses high resolution.
-
-### Step 6B: Check Results
+#### Step 6.2B: Check Results

 Read `/tmp/review-program-summary.md` (max 20 lines).

-**If critical == 0**: Report to user and exit loop.
+**If critical == 0**: Report to user. **Phase 6 complete — skip Round 3.**

-**If critical > 0 and ROUND < MAX_ROUNDS**: Proceed to Step 6C.
+**If critical > 0**: Ask user before proceeding:

-**If critical > 0 and ROUND == 2**: Ask user before round 3:
 ```
-"Review found {N} critical issues after 2 fix rounds. Attempt a 3rd round?"
-Options: "Yes, try one more round" / "No, stop and show remaining issues"
+AskUserQuestion:
+Question: "Round 2 review found {N} critical issues after round 1 fixes. Attempt a 3rd round?"
+Options:
+- "Yes, try one more round"
+- "No, stop and show remaining issues"
 ```

-**If critical > 0 and ROUND == MAX_ROUNDS (3)**: Exit loop, report remaining issues.
+If user says no → report remaining issues, **Phase 6 complete.**
+
+If user says yes → proceed to Step 6.2C.

-### Step 6C: Fix Critical Issues
+#### Step 6.2C: Fix Critical Issues

 ```
 subagent_type: "complete:country-models:rules-engineer"
 team_name: "{PREFIX}-encode"
-name: "review-fixer-{ROUND}"
+name: "review-fixer-2"

-"Fix the critical issues from the /review-program review (round {ROUND}).
+"Fix the critical issues from the /review-program review (round 2).
 Read the full review report at /tmp/review-program-full-report.md.
 Focus ONLY on items marked CRITICAL — do not change anything else.
 Load skills: /policyengine-variable-patterns, /policyengine-code-style,
@@ -758,37 +821,54 @@ LEARN FROM PAST SESSIONS (read if they exist — skip if not found):
 - lessons/agent-lessons.md

 LEARN FROM PREVIOUS ROUNDS:
-If /tmp/{PREFIX}-checklist.md exists, read it FIRST. It contains issues
-found and fixed in previous rounds. Do NOT reintroduce any of those patterns.
+Read /tmp/{PREFIX}-checklist.md FIRST. It contains issues found and fixed in round 1.
+Do NOT reintroduce any of those patterns.

 AFTER fixing, APPEND your fixes to /tmp/{PREFIX}-checklist.md:
 Format each line as:
-- [ROUND {ROUND}] [{CATEGORY}] {file}:{line} — {what was wrong} → {what you changed}
+- [ROUND 2] [{CATEGORY}] {file}:{line} — {what was wrong} → {what you changed}

 Categories: HARD-CODED, WRONG-PERIOD, MISSING-REF, BAD-REF, DEDUCTION-ORDER,
 UNUSED-PARAM, WRONG-ENTITY, NAMING, FORMULA-LOGIC, TEST-GAP, OTHER"
 ```

-### Step 6D: Verify Fix & Commit
-
-Run ci-fixer, then commit and push:
+#### Step 6.2D: Run Tests & Commit

 ```
 subagent_type: "complete:country-models:ci-fixer"
 team_name: "{PREFIX}-encode"
-name: "ci-fixer-{ROUND}"
+name: "ci-fixer-2"

-"Run tests for {STATE} {PROGRAM} after review-fix round {ROUND}.
+"Run tests for {STATE} {PROGRAM} after review-fix round 2.
 Fix any test failures introduced by the fixes. Run make format."
 ```

 ```bash
 git add policyengine_us/parameters/gov/states/{ST}/ policyengine_us/variables/gov/states/{ST}/ policyengine_us/tests/policy/baseline/gov/states/{ST}/
-git commit -m "Review-fix round {ROUND}: address critical issues from /review-program"
+git commit -m "Review-fix round 2: address critical issues from /review-program"
 git push
 ```

-Increment ROUND and go back to Step 6A.
+**Proceed to Round 3. This is mandatory — do NOT skip.**
+
+---
+
+### Round 3: Final Review
+
+#### Step 6.3A: Run /review-program
+
+```
+Skill: review-program
+Arguments: $PR_NUMBER --local --full [--600dpi if DPI == 600]
+```
+
+#### Step 6.3B: Check Results
+
+Read `/tmp/review-program-summary.md` (max 20 lines).
+
+**If critical == 0**: Report to user. **Phase 6 complete.**
+
+**If critical > 0**: Report remaining issues to user. No more fix rounds — escalate for manual resolution. **Phase 6 complete.**

 ---

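
The control flow that the rewritten Phase 6 spells out as named steps can be summarized in a small Python sketch. This is an illustration, not the implementation: the actual phase is a Markdown prompt followed by an orchestrating agent, and the callables `review`, `fix`, and `ask_user_for_round_3` are hypothetical stand-ins for `/review-program`, the fixer plus ci-fixer plus commit steps, and the `AskUserQuestion` prompt.

```python
from typing import Callable, Tuple


def run_review_fix_rounds(
    review: Callable[[int], int],
    fix: Callable[[int], None],
    ask_user_for_round_3: Callable[[int], bool],
) -> Tuple[str, int]:
    """Return (outcome, remaining_critical) after at most three reviews.

    review(n) returns the critical-issue count for round n; fix(n) stands
    for the fix, test, and commit steps of round n (6.nC and 6.nD).
    """
    # Round 1: initial review (steps 6.1A and 6.1B).
    critical = review(1)
    if critical == 0:
        return ("clean", 0)
    fix(1)

    # Round 2: verification review, always run after round 1 fixes.
    critical = review(2)
    if critical == 0:
        return ("clean", 0)
    if not ask_user_for_round_3(critical):
        return ("stopped_by_user", critical)
    fix(2)

    # Round 3: final review, always run after round 2 fixes; no more fixes.
    critical = review(3)
    if critical == 0:
        return ("clean", 0)
    return ("escalated", critical)
```

Writing the rounds out linearly, rather than as a `while` loop, means every `fix` call is followed by a `review` call on every path, and only a review result of zero ends the phase early. That is the property the commit aims to enforce in the prompt.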