Commit e9c9393

baogorek and claude committed
Add --use-tob flag for TOB revenue calibration
Adds calibration targets for Taxation of Benefits (TOB) revenue:
- OASDI TOB: tob_revenue_oasdi (tier 1 taxation)
- HI TOB: tob_revenue_medicare_hi (tier 2 taxation)

Changes:
- Add load_oasdi_tob_projections() and load_hi_tob_projections() to ssa_data.py
- Add TOB constraint parameters to calibration.py
- Add --use-tob CLI flag to run_household_projection.py
- Clean hi_tob_billions_nominal_usd column in social_security_aux.csv
- Update README.md documentation

Closes #459

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent 93427cf commit e9c9393

6 files changed: +216 -84 lines changed

changelog_entry.yaml

Lines changed: 4 additions & 0 deletions

    @@ -0,0 +1,4 @@
    +- bump: minor
    +  changes:
    +    added:
    +      - Added --use-tob flag for TOB (Taxation of Benefits) revenue calibration targeting OASDI and HI trust fund revenue

policyengine_us_data/datasets/cps/long_term/README.md

Lines changed: 11 additions & 4 deletions

    @@ -6,8 +6,8 @@
     Run projections using `run_household_projection.py`:
     
     ```bash
    -# Recommended: GREG with all three constraint types
    -python run_household_projection.py 2100 --greg --use-ss --use-payroll --save-h5
    +# Recommended: GREG with all constraint types
    +python run_household_projection.py 2100 --greg --use-ss --use-payroll --use-tob --save-h5
     
     # IPF with only age distribution constraints (faster, less accurate)
     python run_household_projection.py 2050
    @@ -21,6 +21,7 @@ python run_household_projection.py 2100 --greg --use-ss
     - `--greg`: Use GREG calibration instead of IPF
     - `--use-ss`: Include Social Security benefit totals as calibration target (requires `--greg`)
     - `--use-payroll`: Include taxable payroll totals as calibration target (requires `--greg`)
    +- `--use-tob`: Include TOB (Taxation of Benefits) revenue as calibration target (requires `--greg`)
     - `--save-h5`: Save year-specific .h5 files to `./projected_datasets/` directory
     
     **Estimated runtime:** ~2 minutes/year without `--save-h5`, ~3 minutes/year with `--save-h5`
    @@ -36,7 +37,7 @@ python run_household_projection.py 2100 --greg --use-ss
     
     **GREG (Generalized Regression Estimator)**
     - Solves for weights matching multiple constraints simultaneously
    -- Can enforce age distribution + Social Security benefits + taxable payroll
    +- Can enforce age distribution + Social Security benefits + taxable payroll + TOB revenue
     - One-shot solution using `samplics` package
     - **Recommended** for accurate long-term projections
     
    @@ -57,14 +58,20 @@ python run_household_projection.py 2100 --greg --use-ss
        - Calculated as: `taxable_earnings_for_social_security` + `social_security_taxable_self_employment_income`
        - Source: SSA Trustee Report 2024 (`social_security_aux.csv`)
     
    +4. **TOB Revenue** (`--use-tob`, GREG only)
    +   - Taxation of Benefits revenue for OASDI and Medicare HI trust funds
    +   - OASDI: `tob_revenue_oasdi` (tier 1 taxation, 0-50% of benefits)
    +   - HI: `tob_revenue_medicare_hi` (tier 2 taxation, 50-85% of benefits)
    +   - Source: SSA Trustee Report 2024 (`social_security_aux.csv`)
    +
     ---
     
     ### Data Sources
     
     All data from **SSA 2024 Trustee Report**:
     
     - `SSPopJul_TR2024.csv` - Population projections 2025-2100 by single year of age
    -- `social_security_aux.csv` - OASDI costs and taxable payroll projections 2025-2100
    +- `social_security_aux.csv` - OASDI costs, taxable payroll, and TOB revenue projections 2025-2100
       - Extracted from `SingleYearTRTables_TR2025.xlsx` Table VI.G9 using `extract_ssa_costs.py`
     
     Files located in: `policyengine_us_data/storage/`
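
The README describes GREG as solving for weights that match several control totals at once. As a rough illustration of that idea, here is a minimal sketch of linear (chi-square distance) calibration written directly in NumPy. It is not the `samplics` call the project uses, and the function name `linear_greg_weights` and the toy numbers are assumptions made up for this example.

```python
import numpy as np


def linear_greg_weights(base_weights, aux, targets):
    """Linear calibration: adjust weights so weighted column sums hit targets.

    base_weights: (n,) starting household weights
    aux:          (n, k) auxiliary values, one column per constraint
    targets:      (k,) control totals the weighted columns must match
    """
    d = np.asarray(base_weights, dtype=float)
    X = np.asarray(aux, dtype=float)
    t = np.asarray(targets, dtype=float)

    # Gap between the targets and what the base weights currently produce.
    gap = t - X.T @ d
    # Solve (X' D X) lam = gap, where D = diag(d).
    lam = np.linalg.solve(X.T @ (d[:, None] * X), gap)
    # Each weight is scaled by a factor that is linear in the aux values.
    return d * (1.0 + X @ lam)


# Made-up toy data: 5 households; columns = [household count, TOB revenue in $].
base = np.array([100.0, 120.0, 80.0, 150.0, 90.0])
aux = np.array(
    [[1, 0.0], [1, 500.0], [1, 1200.0], [1, 0.0], [1, 3000.0]]
)
targets = np.array([560.0, 500_000.0])

w = linear_greg_weights(base, aux, targets)
print(aux.T @ w)  # weighted totals under the calibrated weights
```

Adding a constraint such as TOB revenue amounts to adding one column to `aux` and one entry to `targets`. Note that plain linear calibration places no bounds on the weights, so some can come out negative when the targets are far from the baseline totals.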

policyengine_us_data/datasets/cps/long_term/calibration.py

Lines changed: 30 additions & 0 deletions

    @@ -85,6 +85,10 @@ def calibrate_greg(
         payroll_target=None,
         h6_income_values=None,
         h6_revenue_target=None,
    +    oasdi_tob_values=None,
    +    oasdi_tob_target=None,
    +    hi_tob_values=None,
    +    hi_tob_target=None,
         n_ages=86,
     ):
         """
    @@ -101,6 +105,10 @@ def calibrate_greg(
             payroll_target: Optional taxable payroll target total
             h6_income_values: Optional H6 reform income values per household
             h6_revenue_target: Optional H6 reform total revenue impact target
    +        oasdi_tob_values: Optional OASDI TOB revenue values per household
    +        oasdi_tob_target: Optional OASDI TOB revenue target total
    +        hi_tob_values: Optional HI TOB revenue values per household
    +        hi_tob_target: Optional HI TOB revenue target total
             n_ages: Number of age groups
     
         Returns:
    @@ -116,6 +124,8 @@ def calibrate_greg(
             (ss_values is not None and ss_target is not None)
             or (payroll_values is not None and payroll_target is not None)
             or (h6_income_values is not None and h6_revenue_target is not None)
    +        or (oasdi_tob_values is not None and oasdi_tob_target is not None)
    +        or (hi_tob_values is not None and hi_tob_target is not None)
         )
     
         if needs_aux_df:
    @@ -135,6 +145,14 @@ def calibrate_greg(
                 aux_df["h6_revenue"] = h6_income_values
                 controls["h6_revenue"] = h6_revenue_target
     
    +        if oasdi_tob_values is not None and oasdi_tob_target is not None:
    +            aux_df["oasdi_tob"] = oasdi_tob_values
    +            controls["oasdi_tob"] = oasdi_tob_target
    +
    +        if hi_tob_values is not None and hi_tob_target is not None:
    +            aux_df["hi_tob"] = hi_tob_values
    +            controls["hi_tob"] = hi_tob_target
    +
             aux_vars = aux_df
         else:
             aux_vars = X
    @@ -160,6 +178,10 @@ def calibrate_weights(
         payroll_target=None,
         h6_income_values=None,
         h6_revenue_target=None,
    +    oasdi_tob_values=None,
    +    oasdi_tob_target=None,
    +    hi_tob_values=None,
    +    hi_tob_target=None,
         n_ages=86,
         max_iters=100,
         tol=1e-6,
    @@ -180,6 +202,10 @@ def calibrate_weights(
             payroll_target: Optional payroll target (for GREG with payroll)
             h6_income_values: Optional H6 reform income values per household
             h6_revenue_target: Optional H6 reform total revenue impact target
    +        oasdi_tob_values: Optional OASDI TOB revenue values per household
    +        oasdi_tob_target: Optional OASDI TOB revenue target total
    +        hi_tob_values: Optional HI TOB revenue values per household
    +        hi_tob_target: Optional HI TOB revenue target total
             n_ages: Number of age groups
             max_iters: Max iterations for IPF
             tol: Convergence tolerance for IPF
    @@ -204,6 +230,10 @@ def calibrate_weights(
                     payroll_target,
                     h6_income_values,
                     h6_revenue_target,
    +                oasdi_tob_values,
    +                oasdi_tob_target,
    +                hi_tob_values,
    +                hi_tob_target,
                     n_ages,
                 )
             except Exception as e:
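
Each optional constraint in `calibrate_greg` follows the same rule: it is used only when both the per-household values and the target total are supplied. The sketch below shows that rule in isolation. `collect_constraints` is a hypothetical helper written for this illustration, not a function in `calibration.py`, and the values passed to it are made up.

```python
def collect_constraints(**pairs):
    """Keep only constraints whose values AND target are both supplied.

    Each keyword maps a constraint name to a (values, target) tuple.
    Returns (columns, controls), two dicts keyed by constraint name.
    """
    columns, controls = {}, {}
    for name, (values, target) in pairs.items():
        if values is not None and target is not None:
            columns[name] = values
            controls[name] = target
    return columns, controls


columns, controls = collect_constraints(
    oasdi_tob=([10.0, 0.0, 25.0], 3.5e10),  # made-up values and target
    hi_tob=(None, None),                    # constraint not requested: skipped
    payroll=([50.0, 60.0, 0.0], None),      # values without a target: skipped
)

# Mirrors the needs_aux_df check: true as soon as any constraint survives.
needs_aux_df = bool(controls)
print(sorted(controls), needs_aux_df)
```

A helper along these lines would keep the argument list from growing by two parameters per constraint, at the cost of a less explicit signature.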

policyengine_us_data/datasets/cps/long_term/run_household_projection.py

Lines changed: 58 additions & 3 deletions

    @@ -3,19 +3,20 @@
     
     
     Usage:
    -    python run_household_projection.py [START_YEAR] [END_YEAR] [--greg] [--use-ss] [--use-payroll] [--use-h6-reform] [--save-h5]
    +    python run_household_projection.py [START_YEAR] [END_YEAR] [--greg] [--use-ss] [--use-payroll] [--use-h6-reform] [--use-tob] [--save-h5]
     
         START_YEAR: Optional starting year (default: 2025)
         END_YEAR: Optional ending year (default: 2035)
         --greg: Use GREG calibration instead of IPF (optional)
         --use-ss: Include Social Security benefit totals as calibration target (requires --greg)
         --use-payroll: Include taxable payroll totals as calibration target (requires --greg)
         --use-h6-reform: Include H6 reform income impact ratio as calibration target (requires --greg)
    +    --use-tob: Include TOB (Taxation of Benefits) revenue as calibration target (requires --greg)
         --save-h5: Save year-specific .h5 files with calibrated weights to ./projected_datasets/
     
     Examples:
         python run_household_projection.py 2045 2045 --greg --use-ss # single year
    -    python run_household_projection.py 2025 2100 --greg --use-ss --use-payroll --use-h6-reform --save-h5
    +    python run_household_projection.py 2025 2100 --greg --use-ss --use-payroll --use-tob --save-h5
     """
     
     import sys
    @@ -256,6 +257,16 @@ def create_h6_reform():
             USE_GREG = True
         from ssa_data import load_h6_income_rate_change
     
    +USE_TOB = "--use-tob" in sys.argv
    +if USE_TOB:
    +    sys.argv.remove("--use-tob")
    +    if not USE_GREG:
    +        print(
    +            "Warning: --use-tob requires --greg, enabling GREG automatically"
    +        )
    +        USE_GREG = True
    +    from ssa_data import load_oasdi_tob_projections, load_hi_tob_projections
    +
     SAVE_H5 = "--save-h5" in sys.argv
     if SAVE_H5:
         sys.argv.remove("--save-h5")
    @@ -286,6 +297,8 @@ def create_h6_reform():
         print(f" Including taxable payroll constraint: Yes")
     if USE_H6_REFORM:
         print(f" Including H6 reform income impact constraint: Yes")
    +if USE_TOB:
    +    print(f" Including TOB revenue constraint: Yes")
     if SAVE_H5:
         print(f" Saving year-specific .h5 files: Yes (to {OUTPUT_DIR}/)")
         os.makedirs(OUTPUT_DIR, exist_ok=True)
    @@ -464,6 +477,33 @@ def create_h6_reform():
             del reform_sim
             gc.collect()
     
    +    oasdi_tob_values = None
    +    oasdi_tob_target = None
    +    hi_tob_values = None
    +    hi_tob_target = None
    +    if USE_TOB:
    +        oasdi_tob_hh = sim.calculate(
    +            "tob_revenue_oasdi", period=year, map_to="household"
    +        )
    +        oasdi_tob_values = oasdi_tob_hh.values
    +        oasdi_tob_target = load_oasdi_tob_projections(year)
    +
    +        hi_tob_hh = sim.calculate(
    +            "tob_revenue_medicare_hi", period=year, map_to="household"
    +        )
    +        hi_tob_values = hi_tob_hh.values
    +        hi_tob_target = load_hi_tob_projections(year)
    +
    +        if year in display_years:
    +            oasdi_baseline = np.sum(oasdi_tob_values * baseline_weights)
    +            hi_baseline = np.sum(hi_tob_values * baseline_weights)
    +            print(
    +                f" [DEBUG {year}] OASDI TOB baseline: ${oasdi_baseline/1e9:.1f}B, target: ${oasdi_tob_target/1e9:.1f}B"
    +            )
    +            print(
    +                f" [DEBUG {year}] HI TOB baseline: ${hi_baseline/1e9:.1f}B, target: ${hi_tob_target/1e9:.1f}B"
    +            )
    +
         y_target = target_matrix[:, year_idx]
     
         w_new, iterations = calibrate_weights(
    @@ -478,13 +518,19 @@ def create_h6_reform():
             payroll_target=payroll_target,
             h6_income_values=h6_income_values,
             h6_revenue_target=h6_revenue_target,
    +        oasdi_tob_values=oasdi_tob_values,
    +        oasdi_tob_target=oasdi_tob_target,
    +        hi_tob_values=hi_tob_values,
    +        hi_tob_target=hi_tob_target,
             n_ages=n_ages,
             max_iters=100,
             tol=1e-6,
             verbose=False,
         )
     
    -    if year in display_years and (USE_SS or USE_PAYROLL or USE_H6_REFORM):
    +    if year in display_years and (
    +        USE_SS or USE_PAYROLL or USE_H6_REFORM or USE_TOB
    +    ):
             if USE_SS:
                 ss_achieved = np.sum(ss_values * w_new)
                 print(
    @@ -507,6 +553,15 @@ def create_h6_reform():
                 print(
                     f" [DEBUG {year}] H6 achieved revenue: ${h6_revenue_achieved/1e9:.3f}B (error: {error_pct:.1f}%)"
                 )
    +        if USE_TOB:
    +            oasdi_achieved = np.sum(oasdi_tob_values * w_new)
    +            hi_achieved = np.sum(hi_tob_values * w_new)
    +            print(
    +                f" [DEBUG {year}] OASDI TOB achieved: ${oasdi_achieved/1e9:.1f}B (error: {(oasdi_achieved - oasdi_tob_target)/oasdi_tob_target*100:.1f}%)"
    +            )
    +            print(
    +                f" [DEBUG {year}] HI TOB achieved: ${hi_achieved/1e9:.1f}B (error: {(hi_achieved - hi_tob_target)/hi_tob_target*100:.1f}%)"
    +            )
     
         weights_matrix[:, year_idx] = w_new
         baseline_weights_matrix[:, year_idx] = baseline_weights
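
The script handles each flag by testing membership in `sys.argv`, removing it, and enabling GREG when a GREG-only flag appears without `--greg`. For comparison, here is a sketch of the same behavior using the standard library's `argparse`. It is an alternative, not what the commit does. `parse_flags` and `GREG_ONLY_FLAGS` are hypothetical names, and the positional year arguments are deliberately left unparsed and returned as-is.

```python
import argparse

# Flags that the usage text marks as "requires --greg".
GREG_ONLY_FLAGS = ("use_ss", "use_payroll", "use_h6_reform", "use_tob")


def parse_flags(argv):
    """Parse the boolean flags; return (args, leftover positional arguments)."""
    parser = argparse.ArgumentParser(add_help=False)
    for flag in (
        "--greg",
        "--use-ss",
        "--use-payroll",
        "--use-h6-reform",
        "--use-tob",
        "--save-h5",
    ):
        parser.add_argument(flag, action="store_true")

    # parse_known_args leaves anything it does not recognise (the years) alone.
    args, remaining = parser.parse_known_args(argv)

    for name in GREG_ONLY_FLAGS:
        if getattr(args, name) and not args.greg:
            flag = "--" + name.replace("_", "-")
            print(f"Warning: {flag} requires --greg, enabling GREG automatically")
            args.greg = True
    return args, remaining


args, years = parse_flags(["2025", "2100", "--use-ss", "--use-tob"])
print(args.greg, args.use_tob, years)
```

Because `args.greg` is set on the first GREG-only flag found, the warning is printed once per invocation rather than once per flag.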

policyengine_us_data/datasets/cps/long_term/ssa_data.py

Lines changed: 36 additions & 0 deletions

    @@ -89,3 +89,39 @@ def load_h6_income_rate_change(year):
         row = df[df["year"] == year]
         # CSV stores as percentage (e.g., -0.18), convert to decimal
         return row["h6_income_rate_change"].values[0] / 100
    +
    +
    +def load_oasdi_tob_projections(year):
    +    """
    +    Load OASDI TOB (Taxation of Benefits) revenue target for a given year.
    +
    +    Args:
    +        year: Year to load OASDI TOB revenue for
    +
    +    Returns:
    +        Total OASDI TOB revenue in nominal dollars
    +    """
    +    csv_path = STORAGE_FOLDER / "social_security_aux.csv"
    +    df = pd.read_csv(csv_path)
    +
    +    row = df[df["year"] == year]
    +    nominal_billions = row["oadsi_tob_billions_nominal_usd"].values[0]
    +    return nominal_billions * 1e9
    +
    +
    +def load_hi_tob_projections(year):
    +    """
    +    Load HI (Medicare) TOB revenue target for a given year.
    +
    +    Args:
    +        year: Year to load HI TOB revenue for
    +
    +    Returns:
    +        Total HI TOB revenue in nominal dollars
    +    """
    +    csv_path = STORAGE_FOLDER / "social_security_aux.csv"
    +    df = pd.read_csv(csv_path)
    +
    +    row = df[df["year"] == year]
    +    nominal_billions = row["hi_tob_billions_nominal_usd"].values[0]
    +    return nominal_billions * 1e9
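
Both loaders share the same lookup-and-scale pattern: find the row for a year, read a billions column, multiply by 1e9. If the year is absent from the CSV, indexing `.values[0]` on the empty selection raises a bare `IndexError`. The sketch below shows one way the shared pattern could be factored out with a clearer error. `load_billions_as_dollars` is a hypothetical helper, it uses the standard library's `csv` module instead of pandas so that it runs on its own, and the sample figures are invented for illustration, not SSA projections.

```python
import csv
import io


def load_billions_as_dollars(csv_file, column, year):
    """Return `column` for `year`, converted from billions to dollars."""
    for row in csv.DictReader(csv_file):
        if int(row["year"]) == year:
            return float(row[column]) * 1e9
    raise ValueError(f"No row for year {year} when reading column {column!r}")


# Made-up numbers for illustration only.
sample = io.StringIO(
    "year,hi_tob_billions_nominal_usd\n"
    "2025,40.0\n"
    "2026,42.5\n"
)

hi_tob_target = load_billions_as_dollars(
    sample, "hi_tob_billions_nominal_usd", 2026
)
print(hi_tob_target)
```

With a helper like this, each public loader would reduce to a single call that passes its own column name.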
