31 changes: 30 additions & 1 deletion agents/country-models/implementation-validator.md
@@ -80,7 +80,35 @@ No TODO comments or placeholder returns:

## Validation Process

**Order: Parameters → Variables → Tests** (foundation first, then logic, then verification)
**Order: YAML Structure → Parameters → Variables → Tests** (structure first, then semantics)

### Phase 0: YAML Structural Integrity (Run First)

**Before any semantic checks, verify YAML structure of all parameter files:**

1. **No orphaned values after `metadata:` block** — The `metadata:` section must be the last block in the file. Any date-keyed values (e.g., `2025-10-01: 510`) appearing inside or after `metadata:` are silently lost. This is the #1 cause of missing parameter data.
   ```yaml
   # ❌ WRONG: WY value orphaned after metadata
   WV:
     2025-10-01: 330
   metadata:
     unit: currency-USD
     2025-10-01: 510  # LOST! Not under any state key

   # ✅ CORRECT
   WV:
     2025-10-01: 330
   WY:
     2025-10-01: 510
   metadata:
     unit: currency-USD
   ```

2. **Breakdown metadata matches actual keys** — If the file uses `breakdown: [variable_name]` in metadata, verify ALL top-level data keys exist in that variable's enum. Mismatches cause ValueError in policyengine-core v2.20+. Common mistake: using `state_code` as breakdown when the file has sub-region keys like `AK_C`, `NY_NYC` (should use `snap_utility_region`).

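3. **No duplicate YAML keys.** Common YAML loaders such as PyYAML silently keep the last value when a key is repeated, so an earlier entry for the same key is dropped without any warning.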

4. **Non-standard effective dates** — Some states use different fiscal year start dates (e.g., Indiana uses May 1, Maryland uses January 1 for certain programs). Verify these don't have incorrect date entries that collide with or override the standard October 1 federal cycle.
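
The first three checks above can be approximated with a small pre-flight script. The following is a minimal sketch, not part of this PR: the function name `find_structural_problems` is hypothetical, and it assumes a flat layout where data keys and `metadata` sit at column 0 with their values indented beneath them. It scans lines instead of parsing YAML, because a YAML loader would already have discarded duplicate keys by the time the data reaches Python.

```python
import re

# Matches a date key such as "2025-10-01:" at the start of a stripped line.
DATE_KEY = re.compile(r"^\d{4}-\d{2}-\d{2}\s*:")
# Matches a YAML comment that starts the line or follows whitespace.
COMMENT = re.compile(r"(^|\s)#.*$")


def find_structural_problems(yaml_text):
    """Return a list of problem descriptions for one parameter file (hypothetical helper)."""
    problems = []
    seen_top_keys = set()
    current_top = None
    metadata_seen = False
    for lineno, raw in enumerate(yaml_text.splitlines(), start=1):
        line = COMMENT.sub("", raw).rstrip()
        if not line.strip():
            continue  # blank line or comment-only line
        stripped = line.strip()
        if line[0] in " \t":
            # Indented line: only flag date keys nested under metadata.
            if current_top == "metadata" and DATE_KEY.match(stripped):
                problems.append(f"line {lineno}: date-keyed value inside metadata")
            continue
        if ":" not in stripped:
            continue
        key = stripped.split(":", 1)[0].strip()
        if DATE_KEY.match(stripped):
            problems.append(f"line {lineno}: date-keyed value not under any data key")
            continue
        if key in seen_top_keys:
            problems.append(f"line {lineno}: duplicate top-level key {key}")
        if metadata_seen and key != "metadata":
            problems.append(f"line {lineno}: key {key} appears after metadata")
        seen_top_keys.add(key)
        metadata_seen = metadata_seen or key == "metadata"
        current_top = key
    return problems


if __name__ == "__main__":
    broken = "WV:\n  2025-10-01: 330\nmetadata:\n  unit: currency-USD\n  2025-10-01: 510\n"
    for problem in find_structural_problems(broken):
        print(problem)
```

A line scan like this is deliberately crude: it does not understand nested breakdowns, anchors, or multi-line strings, so treat a clean result as a first filter and not as proof that the file is valid.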

### Phase 1: Parameter Audit

@@ -163,6 +191,7 @@ Variable: ar_tea_benefit (has formula) → Needs test file ✅
- Document calculation basis in comments
- Cover edge cases
- Integration test exists for end-to-end scenarios
- **Sub-region/breakdown coverage** — If a variable or parameter uses regional breakdowns (e.g., Alaska's 6 SNAP regions, New York's 3 sub-regions), tests MUST include at least one case per region, plus a default/fallback case for unmapped inputs
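
The sub-region coverage rule can be checked mechanically once the region keys and test cases are both in hand. The sketch below is illustrative only: `uncovered_regions`, the `region` field, and the region codes in the usage example are hypothetical, and it assumes each test case has been loaded into a plain dict that records which region it exercises.

```python
def uncovered_regions(region_keys, test_cases, region_field="region"):
    """Return the region keys that no test case exercises (hypothetical helper)."""
    tested = {case.get(region_field) for case in test_cases}
    return sorted(set(region_keys) - tested)


if __name__ == "__main__":
    # Hypothetical region codes and test cases, for illustration only.
    regions = ["AK_C", "AK_N", "NY_NYC"]
    cases = [
        {"name": "central Alaska household", "region": "AK_C"},
        {"name": "New York City household", "region": "NY_NYC"},
    ]
    print(uncovered_regions(regions, cases))
```

The default/fallback case for unmapped inputs is not covered by this check, because by definition it has no region key to match against; it still needs to be confirmed by hand.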

### Phase 4: Cross-Reference Check
Validate that:
40 changes: 40 additions & 0 deletions agents/country-models/rules-engineer.md
@@ -160,6 +160,46 @@ After creating both parameters and variables, perform TWO verification passes:
- [ ] Correct period handling (period.this_year for age/assets/counts)
- [ ] Proper vectorization (no if-elif-else with arrays)
- [ ] References with subsections and `#page=XX` for PDFs
- [ ] **YAML structural integrity** (see below)
- [ ] **Breakdown metadata correctness** (see below)
- [ ] **Multi-source cross-referencing** for parameter values (see below)

#### YAML structural integrity checks

After writing any parameter YAML file, verify:
1. **No values after `metadata:`** — The `metadata:` block must be the LAST section. Any state/region values appearing after `metadata:` are orphaned and silently ignored. This is the #1 cause of missing parameter data.
2. **All top-level keys are either data keys or `metadata`/`description`** — scan the file to confirm no key is accidentally nested under or after `metadata`.
3. **Effective dates are under the correct key** — when a state has non-standard effective dates (e.g., Indiana uses May 1 instead of October 1), double-check that values are placed under the right state key and not accidentally under an adjacent state.

```yaml
# ❌ WRONG: WY value is orphaned after metadata block
WV:
  2025-10-01: 330
metadata:
  unit: currency-USD
  2025-10-01: 510  # This is LOST, not under any state key!

# ✅ CORRECT: all values before metadata
WV:
  2025-10-01: 330
WY:
  2025-10-01: 510
metadata:
  unit: currency-USD
```

#### Breakdown metadata correctness

When a parameter YAML uses `breakdown` in metadata, verify:
1. **The breakdown variable matches the actual keys in the file.** If the file has sub-region keys like `AK_C`, `NY_NYC`, the breakdown must reference the variable whose enum contains those values (e.g., `snap_utility_region`), NOT a more general variable (e.g., `state_code`).
2. **All data keys in the file exist in the breakdown enum.** If any key is not in the enum, policyengine-core will raise a ValueError (as of core v2.20+).
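
Both breakdown checks reduce to a set comparison between the file's data keys and the enum's member names. The following is a rough sketch under stated assumptions: `SnapUtilityRegion` is a hypothetical stand-in with made-up members (the real enum behind `snap_utility_region` lives in the country model), `keys_missing_from_enum` is an illustrative helper, and the parameter is assumed to have been loaded into a plain dict.

```python
from enum import Enum


class SnapUtilityRegion(Enum):
    """Hypothetical stand-in for the enum behind `snap_utility_region`."""

    AK_C = "AK_C"
    NY_NYC = "NY_NYC"
    WV = "WV"


# Top-level keys that are not data keys.
RESERVED_KEYS = {"metadata", "description"}


def keys_missing_from_enum(parameter, enum_cls):
    """Return data keys in the parameter dict that are not members of enum_cls."""
    members = set(enum_cls.__members__)
    return sorted(
        key for key in parameter if key not in RESERVED_KEYS and key not in members
    )


if __name__ == "__main__":
    parameter = {
        "AK_C": {"2025-10-01": 400},
        "WY": {"2025-10-01": 510},
        "metadata": {"breakdown": ["snap_utility_region"]},
    }
    print(keys_missing_from_enum(parameter, SnapUtilityRegion))
```

Running a comparison like this before committing surfaces the mismatch earlier than the ValueError that policyengine-core raises at load time, and it names every offending key at once.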

#### Multi-source cross-referencing for parameter values

When entering parameter values from spreadsheets or tables:
1. **Verify values for states with non-standard effective dates** (e.g., Indiana uses May 1, Maryland uses January 1 for some programs). Check whether a new value supersedes or supplements existing values.
2. **For states with sub-regions** (Alaska has 6 SNAP regions, New York has 3), verify each sub-region value individually against the source.
3. **Spot-check at least 5 values** against the original source document after entering all data. Pick values from the beginning, middle, and end of the alphabet.
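
One way to make the spot-check in step 3 repeatable is to choose the keys deterministically instead of by eye. This is only a sketch: `pick_spot_checks` is a hypothetical helper, and it assumes the keys are state or region codes that sort alphabetically.

```python
def pick_spot_checks(keys, count=5):
    """Pick `count` keys spread evenly across the sorted key list (hypothetical helper)."""
    ordered = sorted(keys)
    if len(ordered) <= count or count < 2:
        return ordered[:count]
    last = len(ordered) - 1
    # Integer arithmetic keeps the first and last keys and spaces the rest evenly.
    return [ordered[i * last // (count - 1)] for i in range(count)]


if __name__ == "__main__":
    states = ["AK", "AL", "AR", "AZ", "CA", "CO", "CT", "DE", "FL", "GA", "HI"]
    print(pick_spot_checks(states))
```

The selected keys still have to be compared against the original source document by a person; the helper only decides which entries to look at.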

```bash
uv sync --extra dev && uv run ruff format
3 changes: 2 additions & 1 deletion agents/country-models/test-creator.md
@@ -148,4 +148,5 @@ Tests must:
- Document calculation steps in comments
- Cover all eligibility paths
- Use only existing PolicyEngine variables
- NOT exhaustively test every entry in a lookup table — for brackets indexed by household size, FPL tier, etc., test a few representative points (first, middle, last) not every value
- **Cover all sub-regions/breakdowns** — If a variable uses regional breakdowns (e.g., Alaska's 6 SNAP regions, New York's 3 sub-regions), create at least one test per region plus a default/fallback test. This catches county-to-region mapping errors and ensures all breakdown keys have correct parameter values.
1 change: 1 addition & 0 deletions changelog.d/improve-data-validation.changed.md
@@ -0,0 +1 @@
Add YAML structural integrity checks, breakdown metadata validation, multi-source cross-referencing, and sub-region test coverage requirements to rules-engineer, implementation-validator, and test-creator agents.