Add adversarial weight regularisation pipeline by nikhilwoodruff · Pull Request #296 · PolicyEngine/policyengine-uk-data

nikhilwoodruff · 2026-03-17T16:40:02Z

Summary

Adds a diagnostics package implementing the adversarial weight regularisation pipeline from the design doc
Phase 1 (influence detector): Computes per-record influence across a reporting surface of 10 metrics × 4 slice dimensions (income decile, region, age band, tenure). Identifies records exceeding a configurable influence threshold, computes Kish effective sample sizes, and samples across random policy reforms.
Phase 2 (generative model): Trains a TVAE on FRS input attributes via sdv, with conditional sampling using varied conditioning fractions for diverse offspring generation.
Phase 3 (adversarial loop): Iteratively detects worst-offender records, generates synthetic offspring, and replaces high-weight records with weighted offspring.
Phase 4 (regularised recalibration): Entropy-regularised weight optimisation with KL divergence penalty and optional hard weight cap.
Adds a CLI (python -m policyengine_uk_data.diagnostics) with diagnose, train, and regularise commands.
Adds visualisation script producing weight distribution, Kish ESS, influence heatmap, and scatter plots.
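
As a rough illustration of the Phase 1 idea, the sketch below treats a record's influence on a metric within a slice as its share of that slice's weighted total. This definition, the record layout, and the function names `slice_influence` and `flag_high_influence` are assumptions for illustration only, not taken from `influence.py`.

```python
from collections import defaultdict


def slice_influence(records, metric, slice_key):
    """Map (slice value, record id) to the record's share of the slice's weighted total."""
    totals = defaultdict(float)
    for r in records:
        totals[r[slice_key]] += abs(r["weight"] * r[metric])
    influence = {}
    for r in records:
        total = totals[r[slice_key]]
        if total > 0:
            influence[(r[slice_key], r["id"])] = abs(r["weight"] * r[metric]) / total
    return influence


def flag_high_influence(influence, threshold=0.05):
    """Return (key, share) pairs above the threshold, largest share first."""
    flagged = [(key, share) for key, share in influence.items() if share > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): one very high-weight record in the 16-24 slice.
records = [
    {"id": 1, "weight": 30.0, "age_band": "16-24", "housing_benefit": 100.0},
    {"id": 2, "weight": 40.0, "age_band": "16-24", "housing_benefit": 50.0},
    {"id": 3, "weight": 5000.0, "age_band": "16-24", "housing_benefit": 20.0},
    {"id": 4, "weight": 35.0, "age_band": "25-34", "housing_benefit": 80.0},
    {"id": 5, "weight": 33.0, "age_band": "25-34", "housing_benefit": 90.0},
    {"id": 6, "weight": 31.0, "age_band": "25-34", "housing_benefit": 85.0},
]

influence = slice_influence(records, "housing_benefit", "age_band")
# The slices here are tiny, so a much higher threshold than 5% is used.
flagged = flag_high_influence(influence, threshold=0.5)
```

In this toy data only record 3 is flagged, because its weight dominates the 16-24 slice even though its benefit amount is the smallest.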

Current dataset diagnostics

Running Phase 1 on the enhanced FRS reveals:

53,508 households, median weight 33, max weight 372,747 (skewness 31.4)
274 records exceed the 5% influence threshold
Overall Kish effective sample size: 930 (out of 53,508 records)
Worst offender (HH #506) has 95.5% influence on housing_benefit_reported/age_band=16-24
Most problematic slices: young age bands (16-24) and specific region×income combinations
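
For reference, the Kish effective sample size of a weight vector is (Σw)² / Σw². The sketch below applies that standard formula to toy weights; the weight vectors are made up (borrowing only the median and maximum weights quoted above as magnitudes) and are not the actual dataset.

```python
def kish_ess(weights):
    """Kish effective sample size: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    sum_sq = sum(w * w for w in weights)
    return (total * total) / sum_sq if sum_sq > 0 else 0.0


# Toy weights (hypothetical): 1,000 equal weights versus the same
# sample with one weight replaced by a very large value.
equal = [33.0] * 1000
skewed = [33.0] * 999 + [372_747.0]

ess_equal = kish_ess(equal)
ess_skewed = kish_ess(skewed)
```

With equal weights the effective sample size equals the record count; a single extreme weight pulls it down to a little above 1, which is the kind of collapse the diagnostics above are reporting.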

Test plan

All new modules pass ruff lint and format checks
Syntax validation passes for all 6 new files
Diagnostics script runs successfully on enhanced_frs_2023_24.h5
Run full adversarial loop on a subset to verify convergence
Compare pre/post weight distributions after regularisation
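
A toy version of the replacement step and the pre/post comparison might look like the sketch below. It assumes the parent's weight is split equally among its offspring, which is a simplification (the PR only says offspring are weighted), and it tracks weights only; in the pipeline the offspring attributes would come from the generative model. The helper names are hypothetical.

```python
def replace_with_offspring(weights, index, n_offspring):
    """Drop the record at `index` and append offspring that share its weight equally."""
    share = weights[index] / n_offspring
    return weights[:index] + weights[index + 1:] + [share] * n_offspring


def max_weight_share(weights):
    """Fraction of the total weight carried by the single heaviest record."""
    return max(weights) / sum(weights)


# Toy weights (hypothetical): one record dominates the total.
before = [30.0, 40.0, 35.0, 5000.0]
worst = max(range(len(before)), key=before.__getitem__)
after = replace_with_offspring(before, worst, n_offspring=50)

share_before = max_weight_share(before)
share_after = max_weight_share(after)
```

The total weight is unchanged by construction, while the largest single-record share drops sharply; a check of this kind on real pre/post weights would be one way to confirm the loop is moving in the intended direction.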

Introduces a diagnostics package that detects high-influence survey records, generates synthetic offspring via TVAE, and recalibrates with entropy regularisation and weight capping to reduce output noise in population subgroup statistics.

Components:
- influence.py: reporting surface definition, per-record influence computation, Kish effective sample size, random reform sampling
- generative_model.py: TVAE training on FRS input attributes, conditional sampling with varied conditioning fractions
- offspring.py: adversarial detect-spawn-recalibrate loop
- recalibrate.py: entropy-regularised weight optimisation with optional hard weight cap and zero-weight pruning
- __main__.py: CLI with diagnose/train/regularise commands
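
To make the recalibration idea concrete, here is a minimal sketch of an entropy-regularised objective: squared relative error against calibration targets plus a generalised KL penalty that keeps weights near a set of prior weights, with an optional hard cap. The exact loss, the optimiser (projected gradient descent on log-weights), and the names `objective` and `recalibrate` are assumptions for illustration; `recalibrate.py` may differ on all of them.

```python
import math


def objective(weights, prior, matrix, targets, lam):
    """Squared relative target error plus lam times generalised KL(weights || prior)."""
    loss = 0.0
    for column, target in zip(zip(*matrix), targets):
        estimate = sum(w * a for w, a in zip(weights, column))
        loss += ((estimate - target) / target) ** 2
    kl = sum(w * math.log(w / p) - w + p for w, p in zip(weights, prior))
    return loss + lam * kl


def recalibrate(prior, matrix, targets, lam=1e-3, cap=None, lr=0.2, steps=2000):
    """Projected gradient descent on log-weights; the cap is enforced by clipping."""
    cap_value = cap if cap is not None else math.inf
    log_cap = math.log(cap) if cap is not None else math.inf
    u = [min(math.log(p), log_cap) for p in prior]
    for _ in range(steps):
        w = [math.exp(x) for x in u]
        residuals = []
        for column, target in zip(zip(*matrix), targets):
            estimate = sum(wi * a for wi, a in zip(w, column))
            residuals.append((estimate - target) / target)
        for i, row in enumerate(matrix):
            # Gradient with respect to w_i, then chain rule through w_i = exp(u_i).
            grad_w = sum(2.0 * r * a / t for r, a, t in zip(residuals, row, targets))
            grad_w += lam * math.log(w[i] / prior[i])
            u[i] = min(u[i] - lr * w[i] * grad_w, log_cap)
    return [min(math.exp(x), cap_value) for x in u]


# Toy problem (hypothetical): five records, two targets (household count, total income).
incomes = [10.0, 20.0, 30.0, 40.0, 50.0]
matrix = [[1.0, income] for income in incomes]
targets = [100.0, 3000.0]
prior = [10.0, 10.0, 10.0, 10.0, 60.0]

new_weights = recalibrate(prior, matrix, targets, lam=1e-3, cap=30.0)
```

Working in log-weights keeps every weight strictly positive, so zero-weight pruning (mentioned above) is not shown here and would need a separate thresholding step.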

Produces charts showing weight distribution, Kish effective sample sizes by population slice, high-influence records table, influence heatmap, and weight-vs-influence scatter plot.

nwoodruff-co added 2 commits March 17, 2026 16:37

Add weight diagnostics visualisation script

db6a4b7


nikhilwoodruff wants to merge 2 commits into main from feat/adversarial-weight-regularisation
