Commit 43470ca

Merge pull request #180 from PolicyEngine/clean
Clean refactor, enable UK+US macro+household analysis outputs
2 parents efc4fec + 6a95580 commit 43470ca

98 files changed, with 8959 additions and 5322 deletions.

.env.example

Lines changed: 0 additions & 15 deletions
This file was deleted.

.gitignore

Lines changed: 3 additions & 4 deletions
@@ -1,9 +1,8 @@
 **/*.db
 **/__pycache__
 **/*.egg-info
+**/*.h5
+*.ipynb
 _build/
-simulations/
-test.*
-supabase/
 .env
-**/review.md
+**/.DS_Store

CLAUDE.md

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+# Claude notes
+
+Claude, please follow these always. These principles are aimed at preventing you from producing AI slop.
+
+1. British English, sentence case
+2. No excessive duplication, keep code files as concise as possible to produce the same meaningful value. No excessive printing
+3. Don't create multiple files for successive versions. Keep checking: have I added lots of intermediate files which are deprecated? Delete them if so, but ideally don't create them in the first place
+
+## MicroDataFrame
+
+A pandas DataFrame that automatically handles weights for survey microdata. Key features:
+
+- Create with `MicroDataFrame(df, weights='weight_column')`
+- All aggregations (sum, mean, etc.) automatically weight results
+- Each column is a MicroSeries with weighted operations
+- Use `.groupby()` for weighted group statistics
+- Built-in poverty analysis: `.poverty_rate()`, `.poverty_gap()`
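
The CLAUDE.md notes above say that aggregations on a MicroDataFrame are weighted automatically. As a minimal sketch of what "weighted" means for survey microdata, the snippet below uses plain Python rather than the MicroDataFrame API; the function names and sample figures are illustrative assumptions and do not come from the commit.

```python
def weighted_sum(values, weights):
    """Population total: each sampled record stands for `weight` real units."""
    return sum(v * w for v, w in zip(values, weights))


def weighted_mean(values, weights):
    """Population mean: weighted total divided by the total weight."""
    return weighted_sum(values, weights) / sum(weights)


# Hypothetical sample: three households with survey weights.
incomes = [10_000, 20_000, 60_000]
weights = [500, 300, 200]

total = weighted_sum(incomes, weights)
mean = weighted_mean(incomes, weights)
unweighted_mean = sum(incomes) / len(incomes)

print(total, mean, unweighted_mean)
```

The weighted mean differs from the unweighted one because low-income records carry more weight in this made-up sample, which is the kind of distinction a weight-aware DataFrame is meant to handle without manual bookkeeping.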

Makefile

Lines changed: 7 additions & 1 deletion
@@ -12,7 +12,13 @@ format:
 	ruff format .
 
 clean:
-	rm -rf **/__pycache__ _build **/_build .pytest_cache .ruff_cache **/*.egg-info **/*.pyc
+	find . -not -path "./.venv/*" -type d -name "__pycache__" -exec rm -rf {} +
+	find . -not -path "./.venv/*" -type d -name "_build" -exec rm -rf {} +
+	find . -not -path "./.venv/*" -type d -name ".pytest_cache" -exec rm -rf {} +
+	find . -not -path "./.venv/*" -type d -name ".ruff_cache" -exec rm -rf {} +
+	find . -not -path "./.venv/*" -type d -name "*.egg-info" -exec rm -rf {} +
+	find . -not -path "./.venv/*" -type f -name "*.pyc" -delete
+	find . -not -path "./.venv/*" -type f -name "*.h5" -delete
 
 changelog:
 	build-changelog changelog.yaml --output changelog.yaml --update-last-date --start-from 1.0.0 --append-file changelog_entry.yaml
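
The new `clean` target replaces shell globs with `find` calls that leave `./.venv` alone. The same intent can be sketched in Python for readers less familiar with `find`. This is an approximation under stated assumptions, not part of the commit: the `clean` function and its `skip` parameter are hypothetical, and the sketch prunes the top-level `.venv` directory outright rather than filtering paths one by one.

```python
import fnmatch
import os
import shutil
from pathlib import Path

# Name patterns taken from the Makefile's find commands.
DIR_PATTERNS = ["__pycache__", "_build", ".pytest_cache", ".ruff_cache", "*.egg-info"]
FILE_PATTERNS = ["*.pyc", "*.h5"]


def clean(root, skip=".venv"):
    """Remove build and cache artefacts under `root`, leaving `root/skip` untouched."""
    root = Path(root)
    removed = []
    for current, dirnames, filenames in os.walk(root):
        current_path = Path(current)
        # Prune the virtual environment so os.walk never descends into it.
        if current_path == root and skip in dirnames:
            dirnames.remove(skip)
        for name in list(dirnames):
            if any(fnmatch.fnmatchcase(name, p) for p in DIR_PATTERNS):
                shutil.rmtree(current_path / name)
                dirnames.remove(name)  # already deleted, do not walk into it
                removed.append(current_path / name)
        for name in filenames:
            if any(fnmatch.fnmatchcase(name, p) for p in FILE_PATTERNS):
                (current_path / name).unlink()
                removed.append(current_path / name)
    return removed
```

Calling `clean(".")` from the repository root would delete the matching directories and files and return the list of removed paths.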

README.md

Lines changed: 182 additions & 8 deletions
@@ -1,12 +1,186 @@
 # PolicyEngine.py
 
-Documentation
+A Python package for tax-benefit microsimulation analysis. Run policy simulations, analyse distributional impacts, and visualise results across the UK and US.
 
-- Parameters, variables, and values: `docs/01_parameters_variables.ipynb`
-- Policies and dynamic: `docs/02_policies_dynamic.ipynb`
-- Datasets: `docs/03_datasets.ipynb`
-- Simulations: `docs/04_simulations.ipynb`
-- Output data items: `docs/05_output_data_items.ipynb`
-- Reports and users: `docs/06_reports_users.ipynb`
+## Quick start
 
-Open these notebooks in Jupyter or your preferred IDE to run the examples.
+```python
+from policyengine.core import Simulation
+from policyengine.tax_benefit_models.uk import PolicyEngineUKDataset, uk_latest
+from policyengine.outputs.aggregate import Aggregate, AggregateType
+
+# Load representative microdata
+dataset = PolicyEngineUKDataset(
+    name="FRS 2023-24",
+    filepath="./data/frs_2023_24_year_2026.h5",
+    year=2026,
+)
+
+# Run simulation
+simulation = Simulation(
+    dataset=dataset,
+    tax_benefit_model_version=uk_latest,
+)
+simulation.run()
+
+# Calculate total universal credit spending
+agg = Aggregate(
+    simulation=simulation,
+    variable="universal_credit",
+    aggregate_type=AggregateType.SUM,
+    entity="benunit",
+)
+agg.run()
+print(f"Total UC spending: £{agg.result / 1e9:.1f}bn")
+```
+
+## Documentation
+
+**Core concepts:**
+- [Core concepts](docs/core-concepts.md): Architecture, datasets, simulations, outputs
+- [UK tax-benefit model](docs/country-models-uk.md): Entities, parameters, examples
+- [US tax-benefit model](docs/country-models-us.md): Entities, parameters, examples
+
+**Examples:**
+- `examples/income_distribution_us.py`: Analyse benefit distribution by decile
+- `examples/employment_income_variation_uk.py`: Model employment income phase-outs
+- `examples/policy_change_uk.py`: Analyse policy reform impacts
+
+## Installation
+
+```bash
+pip install policyengine
+```
+
+## Features
+
+- **Multi-country support**: UK and US tax-benefit systems
+- **Representative microdata**: Load FRS, CPS, or create custom scenarios
+- **Policy reforms**: Parametric reforms with date-bound parameter values
+- **Distributional analysis**: Aggregate statistics by income decile, demographics
+- **Entity mapping**: Automatic mapping between person, household, tax unit levels
+- **Visualisation**: PolicyEngine-branded charts with Plotly
+
+## Key concepts
+
+### Datasets
+
+Datasets contain microdata at entity level (person, household, tax unit). Load representative data or create custom scenarios:
+
+```python
+from policyengine.tax_benefit_models.uk import PolicyEngineUKDataset
+
+dataset = PolicyEngineUKDataset(
+    name="Representative data",
+    filepath="./data/frs_2023_24_year_2026.h5",
+    year=2026,
+)
+dataset.load()
+```
+
+### Simulations
+
+Simulations apply tax-benefit models to datasets:
+
+```python
+from policyengine.core import Simulation
+from policyengine.tax_benefit_models.uk import uk_latest
+
+simulation = Simulation(
+    dataset=dataset,
+    tax_benefit_model_version=uk_latest,
+)
+simulation.run()
+
+# Access calculated variables
+output = simulation.output_dataset.data
+print(output.household[["household_net_income", "household_benefits"]])
+```
+
+### Outputs
+
+Extract insights with aggregate statistics:
+
+```python
+from policyengine.outputs.aggregate import Aggregate, AggregateType
+
+# Mean income in top decile
+agg = Aggregate(
+    simulation=simulation,
+    variable="household_net_income",
+    aggregate_type=AggregateType.MEAN,
+    filter_variable="household_net_income",
+    quantile=10,
+    quantile_eq=10,
+)
+agg.run()
+print(f"Top decile mean income: £{agg.result:,.0f}")
+```
+
+### Policy reforms
+
+Apply parametric reforms:
+
+```python
+from policyengine.core import Policy, Parameter, ParameterValue
+import datetime
+
+parameter = Parameter(
+    name="gov.hmrc.income_tax.allowances.personal_allowance.amount",
+    tax_benefit_model_version=uk_latest,
+    data_type=float,
+)
+
+policy = Policy(
+    name="Increase personal allowance",
+    parameter_values=[
+        ParameterValue(
+            parameter=parameter,
+            start_date=datetime.date(2026, 1, 1),
+            end_date=datetime.date(2026, 12, 31),
+            value=15000,
+        )
+    ],
+)
+
+# Run reform simulation
+reform_sim = Simulation(
+    dataset=dataset,
+    tax_benefit_model_version=uk_latest,
+    policy=policy,
+)
+reform_sim.run()
+```
+
+## Country models
+
+### UK
+
+Three entity levels:
+- **Person**: Individual with income and demographics
+- **Benunit**: Benefit unit (single person or couple with children)
+- **Household**: Residence unit
+
+Key benefits: Universal Credit, Child Benefit, Pension Credit
+Key taxes: Income tax, National Insurance
+
+### US
+
+Six entity levels:
+- **Person**: Individual
+- **Tax unit**: Federal tax filing unit
+- **SPM unit**: Supplemental Poverty Measure unit
+- **Family**: Census family definition
+- **Marital unit**: Married couple or single person
+- **Household**: Residence unit
+
+Key benefits: SNAP, TANF, EITC, CTC, SSI, Social Security
+Key taxes: Federal income tax, payroll tax
+
+## Contributing
+
+See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup and guidelines.
+
+## License
+
+AGPL-3.0
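
The README's "Mean income in top decile" example passes `quantile=10, quantile_eq=10` alongside a `filter_variable`. One plausible reading is "split records into ten weighted groups by the filter variable and keep group 10". The sketch below illustrates that reading in plain Python. It is an assumption about the semantics, not the library's implementation, and the function names and sample data are hypothetical.

```python
def weighted_quantile_groups(values, weights, n_groups=10):
    """Assign each record a group in 1..n_groups by cumulative weight share."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    total = sum(weights)
    groups = [0] * len(values)
    cumulative = 0.0
    for i in order:
        # Place the record by the midpoint of its own weight span so that
        # a single record is never split across two groups.
        midpoint = cumulative + weights[i] / 2
        groups[i] = min(int(midpoint * n_groups / total) + 1, n_groups)
        cumulative += weights[i]
    return groups


def weighted_mean_in_group(values, weights, groups, target):
    """Weighted mean of `values` over the records assigned to `target`."""
    numerator = sum(v * w for v, w, g in zip(values, weights, groups) if g == target)
    denominator = sum(w for w, g in zip(weights, groups) if g == target)
    return numerator / denominator


# Hypothetical sample: ten equally weighted households.
incomes = [10_000 * k for k in range(1, 11)]
weights = [1] * 10

deciles = weighted_quantile_groups(incomes, weights, n_groups=10)
top_mean = weighted_mean_in_group(incomes, weights, deciles, target=10)
print(deciles, top_mean)
```

With unequal survey weights the groups hold roughly equal population shares rather than equal record counts, which is why the assignment works on cumulative weight instead of row position.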

changelog_entry.yaml

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+- bump: minor
+  changes:
+  - Just basemodels, no sqlmodels.
+  - Clean, working analysis at both household and macro level for uk and us.
