Commit 7eab067

authored
Merge pull request #106 from hua7450/ri-ccap-pipeline-lessons
Fix severity classification and agent scope from RI CCAP lessons
2 parents 5db072c + 0ac4d4b commit 7eab067

11 files changed

+175
-12
lines changed

agents/country-models/edge-case-generator.md

Lines changed: 3 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -149,6 +149,9 @@ For time-based rules:
149149
benefit: 0 # Or reduced amount
150150
```
151151
152+
### Bracket Boundary Consistency
153+
When testing bracket boundaries, you do NOT need to test every threshold; test a few representative ones (first, one in the middle, last). But if you find that a boundary uses "above X%" (exclusive) semantics and needs a 0.0001 shift (see `/policyengine-parameter-patterns`, "Above X%" bracket boundaries), flag ALL thresholds in the same bracket: the boundary semantics apply consistently across the whole bracket, not just the one you tested.
154+
152155
## Auto-Generation Process
153156

154157
### Phase 1: Code Analysis

agents/country-models/implementation-validator.md

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -61,7 +61,7 @@ This ensures you have the complete patterns and standards loaded for reference t
6161
## Critical Violations (Automatic Rejection)
6262

6363
### 1. Hard-Coded Numeric Values
64-
Any numeric literal (except 0, 1 for basic operations) must come from parameters:
64+
Any numeric literal (except 0, 1, 2 for basic operations) must come from parameters:
6565
- Thresholds, limits, amounts
6666
- Percentages, rates, factors
6767
- Dates, months, periods
@@ -171,6 +171,9 @@ Validate that:
171171
- All variables with formulas have tests
172172
- References trace to real documents
173173
- No orphaned files
174+
- No empty directories in the program folder (leftover from branch switches or restructuring).
175+
Run: `find policyengine_us/{parameters,variables}/gov/states/{ST}/ -type d -empty`
176+
Delete any found — git doesn't track empty directories and they cause confusion.
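Where shell brace expansion is not available, the same check can be approximated in Python. This is a minimal sketch and not part of the commit: the helper name `find_empty_dirs` is hypothetical, and the demo runs against a throwaway directory tree instead of a real `policyengine_us` checkout.

```python
import os
import tempfile


def find_empty_dirs(root):
    """Return sorted paths of directories under root that contain no entries."""
    empty = []
    for dirpath, dirnames, filenames in os.walk(root):
        # A directory is empty when it has neither subdirectories nor files.
        if not dirnames and not filenames:
            empty.append(dirpath)
    return sorted(empty)


# Demo on a temporary tree (hypothetical folder names).
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "parameters", "leftover"))
    os.makedirs(os.path.join(tmp, "variables", "used"))
    with open(os.path.join(tmp, "variables", "used", "benefit.py"), "w") as f:
        f.write("# placeholder\n")
    demo_empty = [os.path.relpath(p, tmp) for p in find_empty_dirs(tmp)]
    print(demo_empty)
```

In the demo only `parameters/leftover` is reported; `parameters` itself is not, because it still contains a subdirectory.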
174177

175178
**CRITICAL: Parameter Usage Validation**
176179
- Every parameter file MUST be used by at least one variable
@@ -349,7 +352,9 @@ The validator produces a **structured report with specific fixes** that ci-fixer
349352
return where(eligible, benefit_amount, 0)
350353
```
351354

352-
### Hard-Coded Values (need parameters)
355+
### Hard-Coded Values — CRITICAL (need parameters)
356+
Every hard-coded numeric value (except 0, 1, 2) is CRITICAL severity — no exceptions for ages,
357+
thresholds, rates, or counts. Do NOT downgrade to moderate/warning.
353358
| File | Line | Value | Create Parameter |
354359
|------|------|-------|------------------|
355360
| benefit.py | 23 | 0.3 | `benefit_rate.yaml` with value 0.3 |

agents/country-models/parameter-architect.md

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -306,6 +306,15 @@ Before finalizing, validate your work against ALL loaded skills:
306306

307307
Run through each skill's Quick Checklist if available.
308308

309+
## Scope Boundary
310+
311+
**You create PARAMETER YAML files ONLY.** Do NOT create:
312+
- Variable `.py` files — the rules-engineer agent handles these
313+
- Test `.yaml` files — the test-creator agent handles these
314+
- Enum classes or any Python code
315+
316+
Even if you know what the variables and tests should look like, stay in your lane. Other agents are specialized for those tasks and will produce higher-quality output. If you create files outside your scope, the orchestrator may skip those specialized agents, degrading overall quality.
317+
309318
## Quality Standards
310319

311320
Parameters must have:

agents/country-models/test-creator.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -147,4 +147,5 @@ Tests must:
147147
- Include edge cases at thresholds
148148
- Document calculation steps in comments
149149
- Cover all eligibility paths
150-
- Use only existing PolicyEngine variables
150+
- Use only existing PolicyEngine variables
151+
- NOT exhaustively test every entry in a lookup table — for brackets indexed by household size, FPL tier, etc., test a few representative points (first, middle, last) not every value

agents/reference-validator.md

Lines changed: 10 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -112,9 +112,18 @@ reference:
112112

113113
**Flag as CRITICAL if:**
114114
- Clicking link doesn't show the value
115-
- Section number too vague (missing subsections)
115+
- Section number is WRONG (title cites a section that doesn't contain the parameter value — e.g., citing `4.3.1(B)` when the value is actually in `4.3.1(A)(4)`). A wrong section citation is worse than a missing one because it points readers to incorrect regulatory text.
116116
- PDF missing page number
117117

118+
**When proposing a corrected citation**, verify the replacement against the PDF text or
119+
extracted text file. Search for the exact section/definition heading in the text — do not
120+
guess the correct number from memory or nearby context. If you cannot confirm the correct
121+
citation from the source text, flag it as "WRONG — correct section unknown, manual lookup
122+
required" rather than proposing an unverified replacement.
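The "search for the exact heading" check described above could be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `citation_appears` and the sample text are hypothetical, and the match is literal, so a nested citation such as `4.3.1(A)(4)` that never appears verbatim in the source would still need a level-by-level manual lookup.

```python
import re


def citation_appears(citation, extracted_text):
    """True if the citation occurs literally in the text, ignoring whitespace."""
    # Escape every character so ".", "(" and ")" match literally, and allow
    # optional whitespace between characters to tolerate PDF extraction breaks.
    tokens = [re.escape(ch) for ch in citation if not ch.isspace()]
    # Word-boundary guards stop "Definition 1" from matching "Definition 18".
    pattern = r"(?<!\w)" + r"\s*".join(tokens) + r"(?!\w)"
    return re.search(pattern, extracted_text) is not None


# Hypothetical extracted text, for illustration only.
sample_text = (
    "Definition 18. Countable income means ...\n"
    "Section 4.3.1 (A) Income limits\n"
)

print(citation_appears("Definition 18", sample_text))
print(citation_appears("Definition 19", sample_text))
```

A `False` result supports only the "correct section unknown, manual lookup required" outcome; it does not by itself suggest what the right citation is.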
123+
124+
**Flag as WARNING if:**
125+
- Section number too vague (missing subsections) but still within the correct provision
126+
118127
### Phase 3: Check Corroboration
119128

120129
**The reference must explicitly support the value.**
Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
Improve severity classification, agent scope boundaries, and test quality rules based on RI CCAP implementation lessons

commands/encode-policy-v2.md

Lines changed: 25 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -127,16 +127,32 @@ Find a similar program in the codebase (e.g., CO CCAP for RI CCAP, DC TANF for O
127127
Search with: Glob 'policyengine_us/variables/gov/states/*/[agency]/{prog}/*.py'
128128
Read 3-5 variable files and 3-5 parameter files from the reference implementation.
129129
130-
STEP 2: Write THREE files:
130+
STEP 2: Discover existing reusable variables.
131+
For each key concept in the program (income, hours, age, household size, childcare, etc.),
132+
Grep the codebase for related variables:
133+
Grep 'class.*{concept}.*Variable' policyengine_us/variables/
134+
For example, a childcare program should search for 'childcare', 'hours', 'provider', 'care'.
135+
List all discovered variables in the impl-spec under '## Existing Variables to Reuse' so
136+
downstream agents know what's already available and don't recreate them as bare inputs.
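For readers who want to reproduce the discovery step outside the agent's Grep tool, a rough Python equivalent is sketched below. It is an assumption-laden illustration: `find_variable_classes` is a hypothetical helper, the demo class names are invented, and the pattern is stricter than the Grep above because it only matches classes that directly subclass `Variable`.

```python
import os
import re
import tempfile


def find_variable_classes(root, concept):
    """Return sorted names of Variable subclasses whose name contains concept."""
    pattern = re.compile(
        r"^class\s+(\w*" + re.escape(concept) + r"\w*)\s*\(\s*Variable\s*\)",
        re.MULTILINE,
    )
    names = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(".py"):
                path = os.path.join(dirpath, filename)
                with open(path, encoding="utf-8") as f:
                    names.update(pattern.findall(f.read()))
    return sorted(names)


# Demo with invented variable files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    sources = {
        "childcare_hours_per_week.py": "class childcare_hours_per_week(Variable):\n    pass\n",
        "snap.py": "class snap(Variable):\n    pass\n",
    }
    for name, body in sources.items():
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as f:
            f.write(body)
    demo_matches = find_variable_classes(tmp, "childcare")
    print(demo_matches)
```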
137+
138+
STEP 3: Verify citations against PDF text.
139+
You already have the full documentation loaded. Before writing the impl-spec, cross-check
140+
every section citation (statute number, manual section, definition number) against the
141+
actual PDF text. Search the extracted text for the exact section/definition heading.
142+
If a citation doesn't match (e.g., text says 'Definition 18' but you wrote 'Definition 19'),
143+
correct it NOW — downstream agents will copy these citations verbatim into parameter files.
144+
145+
STEP 4: Write THREE files:
131146
132147
FILE 1: /tmp/{PREFIX}-impl-spec.md (FULL — for implementation agents)
133148
- Every requirement from documentation, numbered (REQ-001, REQ-002, ...)
134149
- Each requirement tagged: ELIGIBILITY, INCOME, BENEFIT, EXEMPTION, DEMOGRAPHIC, IMMIGRATION, RESOURCE, etc.
135150
- Suggested variable and parameter structure (based on reference impl patterns)
151+
- Existing variables to reuse (from Step 2 discovery — variable name, entity, description)
136152
- Income sources list (for sources.yaml parameter, NOT inline adds)
137153
- Reference implementation paths to study
138154
- For TANF: note simplified vs full approach recommendation
139-
- For each requirement: cite the source (statute, manual section, page)
155+
- For each requirement: cite the source (statute, manual section, page), verified against PDF text in Step 3
140156
141157
FILE 2: /tmp/{PREFIX}-requirements-checklist.md (SHORT — for orchestrator, max 40 lines)
142158
- One line per requirement:
@@ -322,6 +338,8 @@ RULES:
322338

323339
### Step 3B: Create Variables and Tests (Parallel)
324340

341+
**ORCHESTRATOR RULE: Always spawn BOTH agents below, even if the parameter-architect created variable or test files.** Each agent is specialized for its task. If a previous agent went out of scope and created files that aren't its responsibility, the specialized agent will overwrite or improve them. Never skip an agent because "the files already exist."
342+
325343
After parameters are complete, spawn both in parallel — they work on different folders.
326344

327345
**Agent: Rules Engineer**
@@ -642,6 +660,11 @@ Closes #{ISSUE_NUMBER}
642660
## Not Modeled
643661
{Excluded requirements with reasons}
644662
663+
## Historical Notes
664+
If the regulatory reviewer or document-collector noted historical changes to any parameter
665+
values (e.g., threshold increases, rate changes), document what changed and when, with
666+
source links. This helps reviewers understand why parameters only start at a recent date.
667+
645668
## Files Added
646669
{Tree structure of new files}
647670

commands/review-program.md

Lines changed: 6 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -756,9 +756,12 @@ TASK:
756756
AND visual verification (Step 5D). Note REJECTED mismatches as 'investigated and cleared'.
757757
4. Classify each finding:
758758
- CRITICAL (Must Fix): regulatory mismatches, value mismatches (code-path confirmed + 600 DPI verified),
759-
hard-coded values, missing/non-corroborating references, CI failures, incorrect formulas
760-
- SHOULD ADDRESS: code pattern violations, missing edge case tests, naming conventions,
761-
period usage errors, formatting issues (params & vars)
759+
hard-coded values, missing/non-corroborating references, incorrect section citations
760+
(reference title cites wrong section/subsection), CI failures, incorrect formulas,
761+
formula variables with zero test coverage (no unit test at all),
762+
non-functional tests (e.g., absolute_error_margin >= 1 on boolean outputs)
763+
- SHOULD ADDRESS: code pattern violations, missing edge case tests for already-tested variables,
764+
naming conventions, period usage errors, formatting issues (params & vars)
762765
- SUGGESTIONS: documentation improvements, performance optimizations, code style
763766
764767
5. Write FULL report to /tmp/{PREFIX}-review-full-report.md (for archival/posting)

skills/technical-patterns/policyengine-parameter-patterns-skill/SKILL.md

Lines changed: 55 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -467,6 +467,61 @@ brackets:
467467

468468
**Real-world example:** Hawaii Food/Excise Tax Credit uses AGI brackets. The first threshold must be `-.inf` to correctly handle taxpayers with negative AGI (e.g., business losses).
469469

470+
### Bracket Boundary: "Above X%" Regulations
471+
472+
**CRITICAL: PolicyEngine's `single_amount` bracket uses "at or above threshold" logic.** A value exactly at the threshold gets that bracket's amount, not the previous bracket's. When a regulation says "above X%" (meaning X% itself belongs to the lower bracket), raise the threshold by `0.0001` (for example, `1.0` becomes `1.0001`) so that X% stays in the lower bracket.
473+
474+
**Example — co-payment tiers by FPL:**
475+
```
476+
Regulation says:
477+
Level 0: Less than or equal to 100% → $0
478+
Level 1: Above 100% up to and including 125% → 2%
479+
Level 2: Above 125% up to and including 150% → 5%
480+
Level 3: Above 150% up to and including 261% → 7%
481+
```
482+
483+
**❌ WRONG — family at exactly 100% FPL gets 2% instead of 0%:**
484+
```yaml
485+
brackets:
486+
- threshold:
487+
2024-01-01: 0
488+
amount:
489+
2024-01-01: 0
490+
- threshold:
491+
2024-01-01: 1.0 # ❌ At 100% FPL → hits this bracket
492+
amount:
493+
2024-01-01: 0.02
494+
```
495+
496+
**✅ CORRECT — shift by 0.0001 to encode "above X%":**
497+
```yaml
498+
brackets:
499+
- threshold:
500+
2024-01-01: 0
501+
amount:
502+
2024-01-01: 0
503+
- threshold:
504+
2024-01-01: 1.0001 # ✅ "Above 100%" → 100% stays in previous bracket
505+
amount:
506+
2024-01-01: 0.02
507+
- threshold:
508+
2024-01-01: 1.2501 # ✅ "Above 125%"
509+
amount:
510+
2024-01-01: 0.05
511+
- threshold:
512+
2024-01-01: 1.5001 # ✅ "Above 150%"
513+
amount:
514+
2024-01-01: 0.07
515+
```
516+
517+
**When to apply the 0.0001 shift:**
518+
- Regulation says "above X%" or "more than X%" (exclusive of the boundary)
519+
- Apply consistently to ALL thresholds in the bracket, not just the first
520+
521+
**When NOT to shift:**
522+
- Regulation says "at or above X%" or "X% or more" (inclusive — matches PolicyEngine's default)
523+
- Regulation says "at least X%" (inclusive)
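The effect of the shift can be checked with a small plain-Python imitation of an "at or above threshold" lookup. This is only a sketch: the `single_amount` function below is a hypothetical stand-in built on `bisect`, not PolicyEngine's implementation, and the thresholds and amounts are copied from the co-payment example above.

```python
from bisect import bisect_right


def single_amount(thresholds, amounts, value):
    """Return the amount of the last bracket whose threshold is <= value."""
    index = bisect_right(thresholds, value) - 1
    if index < 0:
        raise ValueError("value is below the first threshold")
    return amounts[index]


amounts = [0.0, 0.02, 0.05, 0.07]
unshifted = [0.0, 1.0, 1.25, 1.5]
shifted = [0.0, 1.0001, 1.2501, 1.5001]

# A family at exactly 100% FPL.
at_100_unshifted = single_amount(unshifted, amounts, 1.0)
at_100_shifted = single_amount(shifted, amounts, 1.0)
print(at_100_unshifted, at_100_shifted)
```

With the unshifted thresholds the family at exactly 100% FPL lands in the 2% tier; with the shifted thresholds it stays at 0%. One trade-off of this encoding is that a ratio strictly between 1.0 and 1.0001 is also assigned to the lower bracket.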
524+
470525
### Parameter Structure Transitions (Flat → Bracket)
471526
472527
**When a parameter changes structure over time** (e.g., a flat rate becomes a tiered/marginal rate in a later year), you CANNOT put both structures in a single YAML file. Instead, split into separate files with a boolean toggle.

skills/technical-patterns/policyengine-testing-patterns-skill/SKILL.md

Lines changed: 6 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -24,9 +24,11 @@ policyengine_us/tests/policy/baseline/gov/states/[state]/[agency]/[program]/
2424
- `2024-01-01` - Full dates NOT supported
2525

2626
### Error Margin
27-
- Always use `absolute_error_margin: 0.1` after period line
28-
- Allows for small floating-point differences in calculations
29-
- **Never use 1** - a margin of 1 makes `true` (1) and `false` (0) indistinguishable
27+
Choose the margin based on the output type:
28+
- **Boolean outputs** (`true`/`false`, eligibility, flags): **no error margin at all** — booleans are exact, no rounding. Omit `absolute_error_margin` entirely.
29+
- **Currency outputs** (benefits, income, amounts): `absolute_error_margin: 0.01`
30+
- **Rate/percentage outputs**: `absolute_error_margin: 0.001`
31+
- **Never use 1** — a margin of 1 makes `true` (1) and `false` (0) indistinguishable, rendering the test meaningless
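The reason a margin of 1 defeats a boolean test can be seen with a toy comparison. This sketch assumes the test runner accepts a result when the absolute difference is less than or equal to the margin; `within_margin` is a hypothetical helper, not the runner's actual function.

```python
def within_margin(actual, expected, absolute_error_margin):
    """Assumed comparison rule: pass when |actual - expected| <= margin."""
    return abs(actual - expected) <= absolute_error_margin


# Booleans are compared as 1.0 (true) and 0.0 (false).
wrong_answer_passes = within_margin(float(True), float(False), 1)
wrong_answer_caught = within_margin(float(True), float(False), 0.1)

# A currency value that differs only by floating-point noise.
currency_ok = within_margin(100.004, 100.0, 0.01)

print(wrong_answer_passes, wrong_answer_caught, currency_ok)
```

Under that assumption, a formula that returns `true` where `false` is expected still passes with a margin of 1, which is why boolean tests should omit the margin.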
3032

3133
### Naming Convention
3234
- Files: `variable_name.yaml` (matches variable exactly)
@@ -674,6 +676,7 @@ When creating tests:
674676
5. **Follow naming conventions** exactly
675677
6. **Include edge cases** at thresholds
676678
7. **Test realistic scenarios** not placeholders
679+
8. **Don't exhaustively test lookup tables** — when a parameter is a bracket indexed by household size, FPL tier, or similar, test a few representative points (first, middle, last/max) not every entry. If the bracket mechanism works for sizes 1, 4, and 10, it works for size 7 too. The same applies to income brackets, age brackets, and any other parameterized lookup.
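One way to pick the representative points mentioned in item 8 is sketched below. The helper name `representative_points` is hypothetical, and choosing the element at the midpoint index is just one reasonable reading of "middle"; any interior entry serves the same purpose.

```python
def representative_points(keys):
    """Pick the first, a middle, and the last key of a lookup table."""
    ordered = sorted(keys)
    if len(ordered) <= 3:
        # Small tables are cheap enough to test in full.
        return ordered
    return [ordered[0], ordered[len(ordered) // 2], ordered[-1]]


household_sizes = range(1, 11)
points = representative_points(household_sizes)
print(points)
```

For household sizes 1 through 10 this selects sizes 1, 6, and 10, so three test cases stand in for ten.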
677680

678681
When optimizing test suites:
679682
1. **Identify slow tests** - Profile with `pytest --durations=10`
