Merged
35 commits
c846a22
Get aggregates working!
nikhilwoodruff Sep 28, 2025
93133a1
Add fields
nikhilwoodruff Sep 29, 2025
b79f4f4
Update
nikhilwoodruff Sep 29, 2025
f49e835
Update
nikhilwoodruff Sep 29, 2025
3fa2d49
Update
nikhilwoodruff Sep 29, 2025
0a4b69f
Add user associations
nikhilwoodruff Oct 2, 2025
523d2af
Working sim impacts!
nikhilwoodruff Oct 3, 2025
e6cb739
Move nonneeded code
nikhilwoodruff Oct 6, 2025
56f6d3e
Add tests
nikhilwoodruff Oct 6, 2025
9822cb1
Update
nikhilwoodruff Oct 9, 2025
db7ebb2
Update
nikhilwoodruff Oct 23, 2025
390fe6a
Update
nikhilwoodruff Oct 23, 2025
2f7a628
Go back a bit
nikhilwoodruff Nov 8, 2025
6f6e31b
Folder rename
nikhilwoodruff Nov 8, 2025
74915a0
Add dataset
nikhilwoodruff Nov 8, 2025
b4b12cf
Update
nikhilwoodruff Nov 12, 2025
91d228a
Add progress
nwoodruff-co Nov 12, 2025
702c79e
Update
nwoodruff-co Nov 12, 2025
4813206
Add change-aggregate
nwoodruff-co Nov 12, 2025
5604fb6
Parametric reform handling, plus UK datasets
nwoodruff-co Nov 13, 2025
0742831
Add more analysis functionality
nwoodruff-co Nov 13, 2025
055e8e0
Add US basics
nwoodruff-co Nov 13, 2025
1f272ae
Add us fixes
nwoodruff-co Nov 13, 2025
e8edb7f
Add household analysis example
nikhilwoodruff Nov 16, 2025
bc61df7
US works!
nikhilwoodruff Nov 16, 2025
6d11846
Add macro US output
nikhilwoodruff Nov 16, 2025
be1646b
Standardise
nikhilwoodruff Nov 16, 2025
d569d58
Update pkg
nikhilwoodruff Nov 16, 2025
4fbe994
Versioning
nikhilwoodruff Nov 16, 2025
6e10d90
Add docs
nikhilwoodruff Nov 16, 2025
b76a6ff
Format
nwoodruff-co Nov 17, 2025
3099d87
Format
nwoodruff-co Nov 17, 2025
001b19e
Remove unused deps
nwoodruff-co Nov 17, 2025
048efed
Suppress warning
nwoodruff-co Nov 17, 2025
6a95580
Minor fix
nwoodruff-co Nov 17, 2025
15 changes: 0 additions & 15 deletions .env.example

This file was deleted.

7 changes: 3 additions & 4 deletions .gitignore
@@ -1,9 +1,8 @@
**/*.db
**/__pycache__
**/*.egg-info
**/*.h5
*.ipynb
_build/
simulations/
test.*
supabase/
.env
**/review.md
**/.DS_Store
17 changes: 17 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,17 @@
# Claude notes

Claude, please follow these always. These principles are aimed at preventing you from producing AI slop.

1. British English, sentence case
2. No excessive duplication, keep code files as concise as possible to produce the same meaningful value. No excessive printing
3. Don't create multiple files for successive versions. Keep checking: have I added lots of intermediate files which are deprecated? Delete them if so, but ideally don't create them in the first place

## MicroDataFrame

A pandas DataFrame that automatically handles weights for survey microdata. Key features:

- Create with `MicroDataFrame(df, weights='weight_column')`
- All aggregations (sum, mean, etc.) automatically weight results
- Each column is a MicroSeries with weighted operations
- Use `.groupby()` for weighted group statistics
- Built-in poverty analysis: `.poverty_rate()`, `.poverty_gap()`
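
As a rough illustration of what "automatically weight results" means, the sketch below computes a weighted sum, a weighted mean and a weighted poverty rate in plain Python. It does not use MicroDataFrame itself: the function names and the sample records are hypothetical, chosen only to show the arithmetic.

```python
def weighted_sum(values, weights):
    """Population total: each record counts `weight` times."""
    return sum(v * w for v, w in zip(values, weights))


def weighted_mean(values, weights):
    """Population mean: weighted total divided by total weight."""
    return weighted_sum(values, weights) / sum(weights)


def weighted_poverty_rate(incomes, weights, poverty_line):
    """Share of the weighted population with income below the line."""
    poor = sum(w for y, w in zip(incomes, weights) if y < poverty_line)
    return poor / sum(weights)


# Hypothetical survey records, each standing in for many real households.
incomes = [10_000, 20_000, 60_000]
weights = [1_000, 1_000, 2_000]

total = weighted_sum(incomes, weights)
mean = weighted_mean(incomes, weights)
rate = weighted_poverty_rate(incomes, weights, poverty_line=15_000)
```

An unweighted mean of the same three records would be 30,000; the weighted mean is higher because the highest-income record represents twice as many households.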
8 changes: 7 additions & 1 deletion Makefile
@@ -12,7 +12,13 @@ format:
	ruff format .

clean:
	rm -rf **/__pycache__ _build **/_build .pytest_cache .ruff_cache **/*.egg-info **/*.pyc
	find . -not -path "./.venv/*" -type d -name "__pycache__" -exec rm -rf {} +
	find . -not -path "./.venv/*" -type d -name "_build" -exec rm -rf {} +
	find . -not -path "./.venv/*" -type d -name ".pytest_cache" -exec rm -rf {} +
	find . -not -path "./.venv/*" -type d -name ".ruff_cache" -exec rm -rf {} +
	find . -not -path "./.venv/*" -type d -name "*.egg-info" -exec rm -rf {} +
	find . -not -path "./.venv/*" -type f -name "*.pyc" -delete
	find . -not -path "./.venv/*" -type f -name "*.h5" -delete

changelog:
	build-changelog changelog.yaml --output changelog.yaml --update-last-date --start-from 1.0.0 --append-file changelog_entry.yaml
190 changes: 182 additions & 8 deletions README.md
@@ -1,12 +1,186 @@
# PolicyEngine.py

-Documentation
A Python package for tax-benefit microsimulation analysis. Run policy simulations, analyse distributional impacts, and visualise results across the UK and US.

-- Parameters, variables, and values: `docs/01_parameters_variables.ipynb`
-- Policies and dynamic: `docs/02_policies_dynamic.ipynb`
-- Datasets: `docs/03_datasets.ipynb`
-- Simulations: `docs/04_simulations.ipynb`
-- Output data items: `docs/05_output_data_items.ipynb`
-- Reports and users: `docs/06_reports_users.ipynb`
## Quick start

-Open these notebooks in Jupyter or your preferred IDE to run the examples.
```python
from policyengine.core import Simulation
from policyengine.tax_benefit_models.uk import PolicyEngineUKDataset, uk_latest
from policyengine.outputs.aggregate import Aggregate, AggregateType

# Load representative microdata
dataset = PolicyEngineUKDataset(
    name="FRS 2023-24",
    filepath="./data/frs_2023_24_year_2026.h5",
    year=2026,
)

# Run simulation
simulation = Simulation(
    dataset=dataset,
    tax_benefit_model_version=uk_latest,
)
simulation.run()

# Calculate total universal credit spending
agg = Aggregate(
    simulation=simulation,
    variable="universal_credit",
    aggregate_type=AggregateType.SUM,
    entity="benunit",
)
agg.run()
print(f"Total UC spending: £{agg.result / 1e9:.1f}bn")
```

## Documentation

**Core concepts:**
- [Core concepts](docs/core-concepts.md): Architecture, datasets, simulations, outputs
- [UK tax-benefit model](docs/country-models-uk.md): Entities, parameters, examples
- [US tax-benefit model](docs/country-models-us.md): Entities, parameters, examples

**Examples:**
- `examples/income_distribution_us.py`: Analyse benefit distribution by decile
- `examples/employment_income_variation_uk.py`: Model employment income phase-outs
- `examples/policy_change_uk.py`: Analyse policy reform impacts

## Installation

```bash
pip install policyengine
```

## Features

- **Multi-country support**: UK and US tax-benefit systems
- **Representative microdata**: Load FRS, CPS, or create custom scenarios
- **Policy reforms**: Parametric reforms with date-bound parameter values
- **Distributional analysis**: Aggregate statistics by income decile, demographics
- **Entity mapping**: Automatic mapping between person, household, tax unit levels
- **Visualisation**: PolicyEngine-branded charts with Plotly
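
To make the entity-mapping feature concrete, here is a minimal sketch of the two directions such a mapping usually takes: summing person-level values up to households, and broadcasting a household-level value back down to its members. The function names and data are hypothetical and are not taken from the package's API.

```python
from collections import defaultdict


def person_to_household(person_values, person_household_ids):
    """Aggregate person-level values to household totals (sum by household id)."""
    totals = defaultdict(float)
    for value, hh_id in zip(person_values, person_household_ids):
        totals[hh_id] += value
    return dict(totals)


def household_to_person(household_values, person_household_ids):
    """Broadcast a household-level value to every member of that household."""
    return [household_values[hh_id] for hh_id in person_household_ids]


# Hypothetical data: three people in two households.
employment_income = [30_000.0, 12_000.0, 45_000.0]
household_id = [1, 1, 2]

household_income = person_to_household(employment_income, household_id)
per_person = household_to_person(household_income, household_id)
```

Summation is only one possible aggregation rule; a real model may use other rules (for example "any" or "max") depending on the variable.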

## Key concepts

### Datasets

Datasets contain microdata at entity level (person, household, tax unit). Load representative data or create custom scenarios:

```python
from policyengine.tax_benefit_models.uk import PolicyEngineUKDataset

dataset = PolicyEngineUKDataset(
    name="Representative data",
    filepath="./data/frs_2023_24_year_2026.h5",
    year=2026,
)
dataset.load()
```

### Simulations

Simulations apply tax-benefit models to datasets:

```python
from policyengine.core import Simulation
from policyengine.tax_benefit_models.uk import uk_latest

simulation = Simulation(
    dataset=dataset,
    tax_benefit_model_version=uk_latest,
)
simulation.run()

# Access calculated variables
output = simulation.output_dataset.data
print(output.household[["household_net_income", "household_benefits"]])
```

### Outputs

Extract insights with aggregate statistics:

```python
from policyengine.outputs.aggregate import Aggregate, AggregateType

# Mean income in top decile
agg = Aggregate(
    simulation=simulation,
    variable="household_net_income",
    aggregate_type=AggregateType.MEAN,
    filter_variable="household_net_income",
    quantile=10,
    quantile_eq=10,
)
agg.run()
print(f"Top decile mean income: £{agg.result:,.0f}")
```
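
How the package assigns quantiles is not shown in this diff. The sketch below is one common approach, assuming records are ranked by the filter variable and placed into deciles by their cumulative share of total weight. All names and figures are hypothetical.

```python
import math


def weighted_deciles(values, weights):
    """Assign each record a decile (1-10) by its position in the weighted distribution."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    total_weight = sum(weights)
    deciles = [0] * len(values)
    cumulative = 0.0
    for i in order:
        cumulative += weights[i]
        # Share of the population at or below this record, mapped to 1..10.
        deciles[i] = min(10, math.ceil(10 * cumulative / total_weight))
    return deciles


def mean_in_decile(values, weights, decile):
    """Weighted mean over records in one decile (raises if the decile is empty)."""
    deciles = weighted_deciles(values, weights)
    selected = [(v, w) for v, w, d in zip(values, weights, deciles) if d == decile]
    return sum(v * w for v, w in selected) / sum(w for _, w in selected)


# Hypothetical household incomes and survey weights, deliberately unsorted.
incomes = [50_000, 10_000, 30_000, 20_000, 40_000]
weights = [1, 4, 1, 3, 1]

deciles = weighted_deciles(incomes, weights)
top_decile_mean = mean_in_decile(incomes, weights, decile=10)
```

With so few records some deciles are empty; on representative microdata each decile would hold roughly a tenth of the weighted population.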

### Policy reforms

Apply parametric reforms:

```python
from policyengine.core import Policy, Parameter, ParameterValue
import datetime

parameter = Parameter(
    name="gov.hmrc.income_tax.allowances.personal_allowance.amount",
    tax_benefit_model_version=uk_latest,
    data_type=float,
)

policy = Policy(
    name="Increase personal allowance",
    parameter_values=[
        ParameterValue(
            parameter=parameter,
            start_date=datetime.date(2026, 1, 1),
            end_date=datetime.date(2026, 12, 31),
            value=15000,
        )
    ],
)

# Run reform simulation
reform_sim = Simulation(
    dataset=dataset,
    tax_benefit_model_version=uk_latest,
    policy=policy,
)
reform_sim.run()
```
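
The idea of a date-bound parameter value can be sketched independently of the package: a reform value applies only inside its date window, and the baseline applies otherwise. The `DatedValue` class and `value_on` function below are hypothetical, the end date is assumed to be inclusive, and the baseline of 12,570 is used purely for illustration.

```python
import datetime
from dataclasses import dataclass


@dataclass
class DatedValue:
    """Hypothetical stand-in for a date-bound parameter value."""

    start_date: datetime.date
    end_date: datetime.date
    value: float


def value_on(date, baseline, reform_values):
    """Return the reform value whose window covers `date`, else the baseline."""
    for rv in reform_values:
        # Assumption: both ends of the window are inclusive.
        if rv.start_date <= date <= rv.end_date:
            return rv.value
    return baseline


reform = [DatedValue(datetime.date(2026, 1, 1), datetime.date(2026, 12, 31), 15_000)]

in_window = value_on(datetime.date(2026, 6, 1), 12_570, reform)
after_window = value_on(datetime.date(2027, 1, 1), 12_570, reform)
```

Under these assumptions the reform value applies throughout 2026 and the baseline resumes on 1 January 2027.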

## Country models

### UK

Three entity levels:
- **Person**: Individual with income and demographics
- **Benunit**: Benefit unit (a single adult or a couple, plus any dependent children)
- **Household**: Residence unit

Key benefits: Universal Credit, Child Benefit, Pension Credit
Key taxes: Income tax, National Insurance

### US

Six entity levels:
- **Person**: Individual
- **Tax unit**: Federal tax filing unit
- **SPM unit**: Supplemental Poverty Measure unit
- **Family**: Census family definition
- **Marital unit**: Married couple or single person
- **Household**: Residence unit

Key benefits: SNAP, TANF, EITC, CTC, SSI, Social Security
Key taxes: Federal income tax, payroll tax

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup and guidelines.

## License

AGPL-3.0
4 changes: 4 additions & 0 deletions changelog_entry.yaml
@@ -0,0 +1,4 @@
- bump: minor
  changes:
  - Just basemodels, no sqlmodels.
  - Clean, working analysis at both household and macro level for UK and US.