Currently we only have peopel_test.csv and no method of generating larger populations. We need a way to generate synthetic populations at scale, ideally on the fly with the generated data cached for reuse. This is required for Phase 2, and Phase 1 would also have benefited from access to such populations.
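
As a starting point for discussion, here is a minimal sketch of on-the-fly generation with a file-based cache. Everything in it is an assumption rather than an existing design: the function name `generate_population`, the `population_cache` directory, and the columns `person_id`, `age`, and `household_id` are all hypothetical placeholders, since the real schema of the test CSV is not described here.

```python
import csv
import random
from pathlib import Path


def generate_population(size, seed=0, cache_dir="population_cache"):
    """Return the path to a CSV of `size` synthetic people.

    The file is generated only if no cached copy exists for this
    (size, seed) pair. All column names below are hypothetical.
    """
    cache_path = Path(cache_dir) / f"people_{size}_seed{seed}.csv"
    if cache_path.exists():
        return cache_path  # cache hit: reuse the previously generated file

    rng = random.Random(seed)  # seeded so the same request is reproducible
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    n_households = max(1, size // 3)  # placeholder assumption: ~3 people per household

    # Rows are streamed to disk one at a time, so memory use stays flat
    # regardless of population size.
    with cache_path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["person_id", "age", "household_id"])
        for person_id in range(size):
            writer.writerow([person_id, rng.randint(0, 99), rng.randrange(n_households)])
    return cache_path
```

Calling `generate_population(1_000_000, seed=42)` twice would write the file on the first call and return the cached path on the second. Keying the cache on size and seed keeps runs reproducible. A real implementation would presumably also key on the generator version and any distribution parameters.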