Enhance SSN undocumented type imputation #265
How many undocumented people do we get at the end of it (in the CPS)?
OK, we need to adjust to hit the 11 million Pew estimate (it might be higher today).
I adjusted the code to hit the 11 million Pew estimate and the 2 million JCT estimate for the reconciliation reform.
Also, depending on the target year, let's target the total undocumented population per these projections:
I've replaced the fixed 11 million target with dynamic targets.
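A minimal sketch of what a year-dependent target lookup could look like. The function name, the fallback to the most recent earlier year, and the year keys in `EXAMPLE_TARGETS` are all assumptions for illustration; only the 11 million and 13 million magnitudes appear in this thread, and the actual projections are not reproduced here.

```python
def undocumented_target_for_year(year, targets_by_year):
    """Return the population target for `year` (hypothetical helper).

    If `year` has no entry, fall back to the most recent earlier year.
    That fallback rule is an assumption, not the PR's behavior.
    """
    if year in targets_by_year:
        return targets_by_year[year]
    earlier = [y for y in targets_by_year if y < year]
    if not earlier:
        raise ValueError(f"No target available for {year} or any earlier year")
    return targets_by_year[max(earlier)]


# Placeholder mapping: the year keys are NOT the projections referenced above.
EXAMPLE_TARGETS = {2022: 11e6, 2024: 13e6}

target = undocumented_target_for_year(2023, EXAMPLE_TARGETS)
```

Passing the mapping in as an argument keeps the projections out of the function body, so they can be updated without touching the imputation logic.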
nikhilwoodruff
left a comment
Since this is getting quite complex, could you add a page to the documentation with descriptions of the methodology, and some statistics of our results? e.g. splits of status counts.
I added documentation. Once the implementation is reviewed, I will complete the doc with results.
Remove git conflict markers and keep both test functions. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
Add explicit UTF-8 encoding when reading/writing documentation files to prevent UnicodeDecodeError on Windows systems. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
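For reference, the kind of change this commit describes can be sketched as follows. The helper names `read_doc` and `write_doc` are hypothetical; the point is only that `encoding="utf-8"` is passed explicitly, since the default encoding on Windows is often not UTF-8.

```python
from pathlib import Path


def read_doc(path):
    # Explicit encoding avoids UnicodeDecodeError when the platform
    # default (e.g. cp1252 on Windows) cannot decode the file.
    return Path(path).read_text(encoding="utf-8")


def write_doc(path, text):
    # Write with the same encoding so a round trip is lossless.
    Path(path).write_text(text, encoding="utf-8")
```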
Please make the family adjustment probabilistic. It should move people into undocumented status only as far as needed to hit the overall target in the CPS, and only if we don't have enough without it. Millions of US citizen children live with an undocumented parent ("mixed-status families") and the current algorithm results in zero such cases AFAIUI. Also for reference here's the output from CI:
Addressed. The family adjustment is now probabilistic.
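One way such a probabilistic adjustment could work is sketched below. This is not the PR's code: the function name, its arguments, and the choice of a single uniform probability for all candidates are assumptions. The idea is to compute the weighted shortfall against the target and move each eligible family member with probability equal to shortfall divided by candidate weight, so the target is hit in expectation rather than by moving everyone.

```python
import random


def probabilistic_family_adjustment(is_undocumented, is_candidate, weights,
                                    target, seed=0):
    """Return an updated undocumented flag per person (hypothetical sketch).

    Candidates are moved only if the current weighted total falls short of
    `target`, each with probability shortfall / total candidate weight.
    """
    rng = random.Random(seed)
    current = sum(w for w, u in zip(weights, is_undocumented) if u)
    # Only people not already flagged can be moved.
    eligible = [c and not u for c, u in zip(is_candidate, is_undocumented)]
    candidate_weight = sum(w for w, e in zip(weights, eligible) if e)
    shortfall = target - current
    if shortfall <= 0 or candidate_weight == 0:
        return list(is_undocumented)  # target already met, or nobody to move
    p = min(1.0, shortfall / candidate_weight)
    return [u or (e and rng.random() < p)
            for u, e in zip(is_undocumented, eligible)]
```

Because each draw is independent, some households keep a mix of statuses instead of being moved as a block.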
Documentation looks nice!
MaxGhenis
left a comment
Looks great! And I misspoke earlier: we're not adjusting citizenship so won't get the wrong mixed-status household composition. But still good to do the other parts probabilistically since mixed-SSN-category households may exist.
    cps: h5py.File,
    person: pd.DataFrame,
    spm_unit: pd.DataFrame,
    undocumented_target: float = 11e6,
Plus a flat 13/11 multiplier for next two
Suggested change:
-    undocumented_target: float = 11e6,
+    undocumented_target: float = 13e6,
    if target_weighted_ead_workers > 0 and len(worker_ids) > 0:
        # Sort workers by weight (heaviest first) to minimize assignments needed
        worker_weights = person_weights[worker_ids]
remove
        total_weighted_workers - undocumented_workers_target
    )

    if target_weighted_ead_workers > 0 and len(worker_ids) > 0:
Create a function to modularize the targeting here: for each of {workers, students, all others}, compare the total potential undocumented population to the undocumented target, then reassign enough people to documented status to align them.
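A rough sketch of the kind of helper this comment asks for, under stated assumptions: the name `select_to_document`, the dict-based weights, and the random selection order are all hypothetical and not taken from the PR. The helper compares a group's weighted total of potential undocumented people with that group's target and returns the ids to reclassify as documented.

```python
import random


def select_to_document(candidate_ids, weights, target_undocumented, seed=0):
    """Pick ids to move to documented status (hypothetical sketch).

    Stops once the weight moved covers the excess over the target, so the
    remaining weighted total is at or slightly below `target_undocumented`.
    """
    rng = random.Random(seed)
    total = sum(weights[i] for i in candidate_ids)
    excess = total - target_undocumented
    if excess <= 0:
        return []  # already at or below target: nothing to reassign
    shuffled = list(candidate_ids)
    rng.shuffle(shuffled)  # random order is an assumption, not the PR's rule
    chosen, moved = [], 0.0
    for i in shuffled:
        if moved >= excess:
            break
        chosen.append(i)
        moved += weights[i]
    return chosen


# The same helper can then be reused per group with made-up inputs:
example_weights = {1: 10.0, 2: 20.0, 3: 30.0}
example_groups = {"workers": ([1, 2], 15.0), "students": ([3], 30.0)}
to_document = {
    name: select_to_document(ids, example_weights, group_target)
    for name, (ids, group_target) in example_groups.items()
}
```

Because whole records are moved, the result can overshoot the excess by up to one record's weight; a production version might handle the last record probabilistically.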


Fixes #246
Implement ASEC Undocumented Algorithm (paper)
Algorithm Logic: Process of Elimination
Target Implementation
Modify the existing `add_ssn_card_type()` function to apply these conditions before the random refinement step, ensuring that people meeting any of these conditions are assigned to code 3 ("OTHER_NON_CITIZEN") rather than potentially remaining as code 0 ("NONE"/undocumented).

The 14 Conditions

Condition 1: Pre-1982 Arrivals
`PEINUSYR` codes 1-7 (Before 1950 through 1980-1981)

Condition 2: Eligible Naturalized Citizens
`PRCITSHP == 4`, `A_AGE >= 18`, `PEINUSYR` (for years in US), `A_MARITL`, `A_SPOUSE`

Condition 3: Medicare Recipients
`MCARE == 1`

Condition 4: Federal Retirement Benefits
`PEN_SC1 == 3` OR `PEN_SC2 == 3` (Federal government pension)

Condition 5: Social Security Disability
`RESNSS1 == 2` OR `RESNSS2 == 2` (disabled adult or child)

Condition 6: Indian Health Service Coverage
`IHSFLG == 1`

Condition 7: Medicaid Recipients (State-specific adjustments needed)
`CAID == 1`, `GESTFIPS` (for state-specific rules)

Condition 8: CHAMPVA Recipients
`CHAMPVA == 1`

Condition 9: Military Health Insurance
`MIL == 1`

Condition 10: Government Employees
`PEIO1COW` codes 1-3 (federal/state/local gov) OR `A_MJOCC == 11` (military)

Condition 11: Social Security Recipients
`SS_YN == 1`

Condition 12: Housing Assistance (State-specific adjustments needed)
`HPUBLIC == 1` OR `HLORENT == 1`, `GESTFIPS`

Condition 13: Veterans/Military Personnel
`PEAFEVER == 1` OR `A_MJOCC == 11`

Condition 14: SSI Recipients
`SSI_YN == 1`, `RESNSSI1`/`RESNSSI2` (to verify recipient)

Additional Steps
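The process of elimination in the 14 conditions above can be illustrated with a small sketch. This is not the body of `add_ssn_card_type()`: plain dicts stand in for person records, the helper names are hypothetical, and Conditions 2, 7, 12 and 14 are deliberately left out because they depend on rules (naturalization eligibility, state-specific adjustments, recipient verification) that are not spelled out here.

```python
NONE = 0               # code 0: "NONE" / undocumented
OTHER_NON_CITIZEN = 3  # code 3: "OTHER_NON_CITIZEN"


def has_status_indicator(p):
    """True if a person record meets any of the simple conditions."""
    return (
        1 <= p.get("PEINUSYR", 0) <= 7                        # Condition 1
        or p.get("MCARE") == 1                                # Condition 3
        or p.get("PEN_SC1") == 3 or p.get("PEN_SC2") == 3     # Condition 4
        or p.get("RESNSS1") == 2 or p.get("RESNSS2") == 2     # Condition 5
        or p.get("IHSFLG") == 1                               # Condition 6
        or p.get("CHAMPVA") == 1                              # Condition 8
        or p.get("MIL") == 1                                  # Condition 9
        or p.get("PEIO1COW") in (1, 2, 3)                     # Condition 10
        or p.get("A_MJOCC") == 11                             # Conditions 10, 13
        or p.get("SS_YN") == 1                                # Condition 11
        or p.get("PEAFEVER") == 1                             # Condition 13
    )


def apply_elimination(person, ssn_code):
    """Move code 0 to code 3 when any indicator applies; leave others as-is."""
    if ssn_code == NONE and has_status_indicator(person):
        return OTHER_NON_CITIZEN
    return ssn_code
```

For example, `apply_elimination({"PEINUSYR": 5}, NONE)` returns `OTHER_NON_CITIZEN`, while a record with no matching indicator keeps code 0 and remains eligible for the later random refinement step.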
Family Correlation Adjustment