Conversation
a46c604 to fe5c663
|
Okay @mtsao96 I think I fixed it, I'm just trying to come up with a whole range of edge cases to test against. So if you got any, let me know and I can try to add them. Or try to add some yourself. |
|
Do we re-categorize PersonType for MandatoryLocation? I re-ran the pipeline with the fixes, and the StudentCategory for MandatoryLocation still shows people aged 10 and 16 as not a student. |
|
Hmm. I'll take another look. Can you get me some offending person IDs? |
|
Yup, here are some person IDs. They're categorized as not employed and not a student even though person age is either 10 or 16:
|
|
Wait, are we working with the same codebook? Because here's what I'm getting
|
|
Yeah that was the weird thing I noticed. The PersonData had the correct categorization but the MandatoryLocation had different results.
|
|
Ah I see, I was focused on the person table. I think your hunch is right: re-categorizing again within the mandatory location formatting is the problem, because it reuses the formatted person table, where we've already converted the age code to a continuous age! I'll make the fix and push it up. |
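For anyone following along, here is a minimal sketch of the kind of mismatch described above. The codebook mapping, column names, and age bounds are all hypothetical and are not taken from the actual pipeline; the point is only that a rule written against age codes returns nothing useful once the column holds continuous ages.

```python
import pandas as pd

# Hypothetical codebook: survey age code -> representative age in years.
AGE_CODE_TO_YEARS = {1: 4, 2: 10, 3: 16, 4: 30}


def is_school_age_from_code(age_code: pd.Series) -> pd.Series:
    """Rule written against raw survey age codes (codes 2 and 3 here)."""
    return age_code.isin([2, 3])


def is_school_age_from_years(age_years: pd.Series) -> pd.Series:
    """Equivalent rule written against continuous age in years (assumed bounds)."""
    return age_years.between(5, 16)


raw = pd.DataFrame({"person_id": [1, 2, 3, 4], "age": [1, 2, 3, 4]})

# The formatting step replaces the code with a continuous age.
formatted = raw.assign(age=raw["age"].map(AGE_CODE_TO_YEARS))

# Re-applying the code-based rule to the formatted table matches nobody.
wrong = is_school_age_from_code(formatted["age"])

# The rule has to be expressed in years once the column is continuous.
right = is_school_age_from_years(formatted["age"])
```

Categorizing once and carrying the result through to the downstream table, rather than re-deriving it from an already-formatted column, would avoid this class of bug.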
|
Bingo. Just going to clean up and push it up
|
d0615e2 to bb56aaa
|
Tried to re-run the pipeline and I think there's an error with the student_category field. The type field is correct, but student_category is categorizing most of them as "not a student".
|
|
Okay, about to push up a fix. I think it's because they had NA in student or school_type, but if they're under 16 we can assume student. |
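A rough sketch of that fallback, assuming hypothetical column names (`age`, `student`, `school_type`) and label strings; the real fix may differ. The "under 16" cutoff is exposed as a parameter since it is a judgment call, and no lower age bound is applied here.

```python
import pandas as pd


def fill_student_category(persons: pd.DataFrame, assume_student_under: int = 16) -> pd.DataFrame:
    """Assign student_category, treating young people with missing answers as students.

    Column names and labels are illustrative assumptions, not the pipeline's schema.
    """
    out = persons.copy()

    # Rows where either survey answer is missing.
    missing = out["student"].isna() | out["school_type"].isna()
    young = out["age"] < assume_student_under

    # Default everyone to "not a student", then upgrade where warranted.
    out["student_category"] = "not a student"
    out.loc[out["student"].eq("yes"), "student_category"] = "student"
    out.loc[missing & young, "student_category"] = "student"
    return out


persons = pd.DataFrame(
    {
        "person_id": [1, 2, 3, 4],
        "age": [10, 15, 40, 12],
        "student": [None, None, None, "yes"],
        "school_type": [None, None, None, "K-12"],
    }
)
result = fill_student_category(persons)
```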
|
I believe it worked on my end! I had a follow up question looking at the outputs: There is a small number of people (59) who are categorized as full-time/part-time work, nonworker, retired, and child too young for school but still have a School Location TAZ. Are these people who are not full-time students but are taking classes somewhere? |
|
Oh something else I noticed. In the MandatoryLocation file, the income output was a bit weird. Since we're pulling the income already from the formatted CTRAMP household DataFrame, do we need to include this calculation when mapping the results to CTRAMP column names? |
|
hmmm. Can you find me a couple of offending person IDs? I'll take a look, this is good stuff though, catching all these weird edge cases. |
|
Yup! Here are some of the IDs with school locations but categorized as not a student. PersonType = 8 (Child Too Young for School) makes up the majority of the people (339 out of the 419)
419 rows × 13 columns |
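A check along these lines can be sketched as follows. The column names (`school_taz`, `student_category`, `person_type`) and the use of a missing value to mean "no school location" are assumptions for illustration, not the actual MandatoryLocation schema.

```python
import pandas as pd


def school_location_but_not_student(df: pd.DataFrame) -> pd.DataFrame:
    """Rows that have a school TAZ but are categorized as not a student."""
    has_school = df["school_taz"].notna()
    not_student = df["student_category"].eq("not a student")
    return df.loc[has_school & not_student]


def offenders_by_person_type(df: pd.DataFrame) -> pd.Series:
    """Count the offending rows per PersonType to see which type dominates."""
    return school_location_but_not_student(df)["person_type"].value_counts()


sample = pd.DataFrame(
    {
        "person_id": [1, 2, 3, 4],
        "person_type": [8, 8, 1, 3],
        "school_taz": [101, 102, None, 55],
        "student_category": ["not a student", "not a student", "not a student", "student"],
    }
)
offenders = school_location_but_not_student(sample)
counts = offenders_by_person_type(sample)
```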
|
@mtsao96 so if they are too young for school, should they be not a student? I think a lot are maybe daycare things. We can either:
|
|
I think we can just leave it as is, i.e. so leave them as not a student and keep the school TAZ. We're not summarizing them in the UsualWorkSchoolLocation model so it doesn't impact anything there. (No CTRAMP modification please 😭 ) |
|
Okay, sounds good. I think school type has daycare/preschool, so they can at least be identified. But I'll fix cases where people older than 5 are getting "not a student". I think that leaves about 336 or so people under 5. |
|
Found another edge case now 😅 where there are ~41 people with the status as full-time employee and college or higher student, but they only have a School Location and no Work Location
|
|
Well wait, people can have no work location and still be workers, they just don't have a fixed location, right? |
|
Yeah people can be a full-time employee and not have a work location. But I think I only called this out because all the people that are full workers with no work locations were all classified as college students too. |
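One way to surface this pattern is to count each combination of employment status, student status, and which locations are present. This is only a sketch: `employment_category`, `student_category`, `work_taz`, and `school_taz` are assumed names, and missing values are assumed to mean "no location".

```python
import pandas as pd


def location_pattern_counts(df: pd.DataFrame) -> pd.DataFrame:
    """Count people per (employment, student, has work TAZ, has school TAZ) combination."""
    flagged = df.assign(
        has_work_taz=df["work_taz"].notna(),
        has_school_taz=df["school_taz"].notna(),
    )
    keys = ["employment_category", "student_category", "has_work_taz", "has_school_taz"]
    return flagged.groupby(keys).size().reset_index(name="n")


sample = pd.DataFrame(
    {
        "person_id": [1, 2, 3, 4],
        "employment_category": ["full-time", "full-time", "full-time", "part-time"],
        "student_category": ["college", "college", "not a student", "college"],
        "work_taz": [None, None, 12, 7],
        "school_taz": [30, 31, None, 40],
    }
)
patterns = location_pattern_counts(sample)

# Full-time workers who are also college students with only a school location.
suspect = patterns[
    (patterns["employment_category"] == "full-time")
    & (patterns["student_category"] == "college")
    & (~patterns["has_work_taz"])
    & (patterns["has_school_taz"])
]
```

If every full-time worker without a work location falls into the college-student row, as described above, that points at the categorization rather than at people who simply lack a fixed workplace.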
|
Ah, I see. PersonType is full time. Okay. I think we can use the canonical student vs employment as the tie breaker.