Use csv2sqlite.py to save the raw csv.gz data file into an sqlite database.
- `acs`: table from ACS microdata

```
python csv2sqlite.py --gzip acs_08-16.csv.gz acs_08-16.db acs
```

- `mig2met`: table to convert migration state/PUMA to metro
  - run `data-prep.r` to build the csv from the two csv files mapping PUMA and MSA, then load it into the sqlite database as another table:

```
python csv2sqlite.py mig2met.csv acs_08-16.db mig2met
```

Then use SQL queries to get aggregated values, to avoid loading the entire dataset into memory. Queries apply categorizations (race, edu) on the fly, so there is no need to pre-clean the data.
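As an illustration of such an on-the-fly aggregation, here is a minimal sketch using Python's `sqlite3` module; the table schema and the education cutoff below are assumptions for the example, not the project's actual schema:

```python
import sqlite3

# Small in-memory example; with the real data you would connect to
# acs_08-16.db instead. Column names and values here are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE acs (age INTEGER, race INTEGER, edu INTEGER, "
             "married INTEGER, perwt REAL)")
rows = [(25, 1, 16, 1, 100.0), (25, 1, 12, 0, 80.0), (40, 2, 16, 1, 120.0)]
conn.executemany("INSERT INTO acs VALUES (?, ?, ?, ?, ?)", rows)

# Aggregate weighted counts by age and a coarse education category,
# applying the categorization in SQL so the full microdata never
# has to be loaded into memory.
query = """
SELECT age,
       CASE WHEN edu >= 16 THEN 'college' ELSE 'no-college' END AS edu_cat,
       SUM(perwt) AS pop
FROM acs
WHERE married = 1
GROUP BY age, edu_cat
"""
for age, edu_cat, pop in conn.execute(query):
    print(age, edu_cat, pop)
```

The same pattern (a `CASE` expression inside `GROUP BY`) is what lets the categorization happen at query time rather than in a pre-cleaning pass.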
To run, first set up the data: check these files for correct filenames per model (different specifications by age and type).

- `smooth-pops.r`: query and smooth aggregated population counts in each desired metro
  - total/single/married populations, marriage/divorce flows, migration flows
  - smoothing by non-parametric regression (local polynomial), using a hand-rolled "diagonal" smoothing kernel with a manually chosen bandwidth
  - saves smoothed data to csv for loading into Julia
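The smoothing step can be illustrated with a minimal local-linear (degree-1 local-polynomial) regression; the Gaussian kernel and fixed bandwidth below are stand-ins for the hand-rolled diagonal kernel and manual bandwidth used in `smooth-pops.r`:

```python
import math

def local_linear_smooth(x, y, grid, bw):
    """Local-linear kernel regression with a Gaussian kernel.

    For each grid point g, fit a weighted least-squares line to (x, y)
    with weights K((x - g) / bw) and return the fitted value at g.
    """
    out = []
    for g in grid:
        # kernel weights around the evaluation point
        w = [math.exp(-0.5 * ((xi - g) / bw) ** 2) for xi in x]
        # weighted sums for the 2x2 normal equations of the local line
        s0 = sum(w)
        s1 = sum(wi * (xi - g) for wi, xi in zip(w, x))
        s2 = sum(wi * (xi - g) ** 2 for wi, xi in zip(w, x))
        t0 = sum(wi * yi for wi, yi in zip(w, y))
        t1 = sum(wi * (xi - g) * yi for wi, xi, yi in zip(w, x, y))
        det = s0 * s2 - s1 * s1
        # intercept of the local fit = smoothed value at g
        out.append((s2 * t0 - s1 * t1) / det)
    return out

# Illustrative age profile of population counts
ages = list(range(18, 66))
counts = [1000 - 5 * (a - 18) for a in ages]
smoothed = local_linear_smooth(ages, counts, ages, bw=3.0)
```

A useful property for sanity-checking: local-linear regression reproduces exactly linear data regardless of the kernel or bandwidth.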
- `mort-rates.r`: interpolates and saves death rates
- `main-estim.jl`: runs the show, but set options first
  - loads populations from saved JLD files, or calls `prepare-pops.jl` to generate them anew
  - estimates arrival rates and then non-parametric objects using `estim-functions.jl` and `compute-npobj.jl`
  - can also do a parameter grid search or Monte Carlo estimation
- `prepare-pops.jl`: loads the csv files generated by the R scripts above, then converts the DataFrames to multidimensional arrays (per metro) and saves them as JLD files
- `plot-results.r`: plots model-data fit and estimated objects
- `tikz-conversion.R`: produces tikz figures from saved plot objects
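`prepare-pops.jl` itself is Julia, but its DataFrame-to-array conversion can be sketched in Python with NumPy; the (metro, sex, age) schema below is illustrative, not the project's actual layout:

```python
import numpy as np

# Long-format rows: (metro, sex, age, count) -- illustrative schema.
rows = [
    (35620, 0, 25, 1500.0),
    (35620, 1, 25, 1600.0),
    (31080, 0, 25, 1200.0),
    (31080, 1, 25, 1300.0),
]

# Enumerate the levels of each categorical dimension
metros = sorted({r[0] for r in rows})
sexes = sorted({r[1] for r in rows})
ages = sorted({r[2] for r in rows})

# One dense array per metro, indexed [sex, age]; missing cells stay 0.
pops = {m: np.zeros((len(sexes), len(ages))) for m in metros}
for metro, sex, age, count in rows:
    pops[metro][sexes.index(sex), ages.index(age)] = count
```

The per-metro dict of dense arrays mirrors the per-metro JLD files: downstream code can index by position instead of repeatedly filtering a long table.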
Run scripts in order to set up resampled datasets, run smoothing, and then estimation. Uses GNU Parallel for efficient batch processing.
- `Rscript bootstrap-resampler.r`: creates directories `data/bootstrap-samples/resamp_00` with resampled csv data
- `bash bootstrap-create-db.sh`: creates sqlite db from csv files
- `bash bootstrap-smooth.sh`: runs `smooth-data.r` for both ageonly and racedu specifications
  - took 40 hours for 100 resamples on 8 cores, low memory usage (<2GB)
- `bash bootstrap-cp-psi.sh`: copies the death rate data into the smoothed populations directories for each resample
- `bash bootstrap-estim.sh`: runs `main-estim.jl` for both ageonly and racedu specifications
  - took 100 minutes for 100 resamples on 8 cores, low memory usage (<4GB)
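The resampling step amounts to drawing rows with replacement, once per bootstrap sample. A minimal, hypothetical Python sketch of the idea (the real `bootstrap-resampler.r` is an R script that reads the ACS csv and writes one directory per resample):

```python
import csv
import io
import random

def bootstrap_resample(rows, seed):
    """Draw len(rows) rows with replacement, reproducibly per resample."""
    rng = random.Random(seed)  # per-resample seed keeps runs reproducible
    return [rng.choice(rows) for _ in rows]

# Illustrative in-memory csv; the real input is the ACS extract.
data = io.StringIO("age,perwt\n25,100\n30,120\n35,90\n")
rows = list(csv.DictReader(data))
resamp = bootstrap_resample(rows, seed=0)
```

Seeding by resample index is what makes each `resamp_*` directory independently regenerable, which matters when the batch is spread over cores with GNU Parallel.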
- 35620: 14.5m - New York-Newark-Jersey City, NY-NJ-PA
- 31080: 9.4m - Los Angeles-Long Beach-Anaheim, CA
- 16980: 6.8m - Chicago-Naperville-Elgin, IL-IN-WI
- 19100: 4.6m - Dallas-Fort Worth-Arlington, TX
- 37980: 4.4m - Philadelphia-Camden-Wilmington, PA-NJ-DE-MD
- 26420: 4.2m - Houston-The Woodlands-Sugar Land, TX
- 47900: 4.1m - Washington-Arlington-Alexandria, DC-VA-MD-WV
- 33100: 4.1m - Miami-Fort Lauderdale-West Palm Beach, FL
- 12060: 3.8m - Atlanta-Sandy Springs-Roswell, GA
- 14460: 3.5m - Boston-Cambridge-Newton, MA-NH
- 41860: 3.3m - San Francisco-Oakland-Hayward, CA
- 19820: 3.1m - Detroit-Warren-Dearborn, MI
- 38060: 3.1m - Phoenix-Mesa-Scottsdale, AZ
- 40140: 3.0m - Riverside-San Bernardino-Ontario, CA
- 42660: 2.6m - Seattle-Tacoma-Bellevue, WA
- 33460: 2.4m - Minneapolis-St. Paul-Bloomington, MN-WI
- 41740: 2.3m - San Diego-Carlsbad, CA
- 45300: 2.1m - Tampa-St. Petersburg-Clearwater, FL
- 41180: 2.0m - St. Louis, MO-IL
- 12580: 2.0m - Baltimore-Columbia-Towson, MD
30th metro is 1.4m, 40th is 0.9m.
- `estimate-rates-full.r` and `estimate-rates.r`: very poor accuracy due to noisy inference on divorce flows
- Need marriage and divorce rates for each couple-type (globally)
- Marriage rate (directly observable): SQL queries for flows and stocks to compute rates
- Divorce rate (infer from non-divorce rate and death rate)
- Weighted OLS (by stocks of couples)
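The divorce-rate inference in the bullets above can be sketched with a simple stock accounting identity; the identity used here (stock' = stock·(1 − divorce − death) + marriage inflow) is an illustrative assumption, not the exact specification in the scripts:

```python
def infer_divorce_rate(stock_t, stock_t1, marriage_inflow, death_rate):
    """Back out the divorce rate from a stock accounting identity:

        stock_t1 = stock_t * (1 - divorce - death) + marriage_inflow

    so  divorce = 1 - death - (stock_t1 - marriage_inflow) / stock_t.
    """
    return 1.0 - death_rate - (stock_t1 - marriage_inflow) / stock_t

# Example: 1000 couples, 990 next period, 50 new marriages, 1% death rate
rate = infer_divorce_rate(stock_t=1000.0, stock_t1=990.0,
                          marriage_inflow=50.0, death_rate=0.01)
```

Because the divorce rate enters only through this residual, small noise in the stocks or flows moves it one-for-one, which is why the bootstrap above is needed to gauge its precision.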