* use modern pyproject package definition
* pin python to 3.8 for now
* Cleanup test runners
* Allow hard constraints on balancer.
* minor performance enhancements and bug fixes.
* repop fix
* add oceanside repop example
* Fix oceanside inputs
* bugfix summarize empty
* Major cleanup of tests and examples to share data and configs.
* Pytests gitaction
* Minor update to test gha
* cleanup gha testing
* disable linting for now.
* bugfix weighting test
* simplify test_steps
* Normalize df to hash
* debug data hash.
* debug
* test
* more debug
* test sorting
* debug
* further sort
* debug
* debugging
* more debug
* Revert "more debug"
  This reverts commit 0972171.
* Revert "debugging"
  This reverts commit b86cf01.
* debug
* more debug...
* Linux - Windows ortools bugfix.
* Cleanup tests and stabilize.
* Working refactor of activitysim pipeline into populationsim
* linting
* Possible fix for repop error.
* Cleanup unused code
* cleanup dependencies and test python versions
* Cleanup imports
* Pinned versions to work with python 3.12
* Dropped support for Python 3.13 because ortools must be <=3.12
* Cleaned up future warnings, expanded tests, and resurrected the lp_cvx option.
* iter version
* Add pre-commit
* Fixed test bug.
* Import bugfix
* Numba balancer
* Implemented Numba for significant perf improvement. Need to cleanup SimultanousListBalancer.
* Test fix. But needs organizing in sub_balance and do_balance.
* Update test_balancer.py
* cleanup uv lock
* Organize into modules
* split numba functions
* fixed import paths
* more organizing
* Added configurable optimizer timeout parameter in settings. Also further cleanup.
* Cleanup unused code.
* Added CLI option
* Bugfix CLI option
* Bugfixes
* Revert "Bugfixes"
  This reverts commit 4474f60.
* Bugfix max delta
* Hardcode constants instead of as args
* Update pyproject.toml
* Fixed issue #196
docs/application_configuration.rst
Lines changed: 24 additions & 24 deletions
@@ -121,7 +121,7 @@ PopulationSim is configured using the settings.yaml file. PopulationSim can be c
 
 :regular mode:
 
-  The regular configuration runs PopulationSim from beginning to end and produces a new synthetic population. This can run either single-process or multi-processed to save on runtime.
+  The regular configuration runs PopulationSim from beginning to end and produces a new synthetic population. This can run either single-process or multi-processed to save on runtime.
 
 :repop mode:
 
@@ -263,17 +263,17 @@ This sub-directory is populated at the end of the PopulationSim run. The table b
 Configuring Settings File
 ~~~~~~~~~~~~~~~~~~~~~~~~~
 
-PopulationSim is configured using the *configs/settings.yaml* file. The user has the flexibility to specify algorithm functionality, list geographies, invoke tracing, provide inputs specifications, select outputs, list the steps to run, and specify multiprocess settings.
+PopulationSim is configured using the *configs/settings.yaml* file. The user has the flexibility to specify algorithm functionality, list geographies, invoke tracing, provide inputs specifications, select outputs, list the steps to run, and specify multiprocess settings.
 
 .. note::
-  When running PopulationSim, multiple settings files can be specified so long as the ``inherit_settings: True`` setting is included in
+  When running PopulationSim, multiple settings files can be specified so long as the ``inherit_settings: True`` setting is included in
   subsequent files. This feature is used for the multi-processing configuration described below. To utilize this feature, once can run PopulationSim
-  with the following command: ``python run_populationsim.py -c configs_mp -c configs``. This command specifies two config folders, each with
+  with the following command: ``python run_populationsim.py -c configs_mp -c configs``. This command specifies two config folders, each with
   a settings file, and the ``configs_mp`` settings inherit from the earlier ``configs`` settings.
 
 The settings shown below are from the PopulationSim application for the CALM region as an example of how a run can be configured. The meta geography for CALM region is named as *Region*, the seed geography is *PUMA* and the two sub-seed geographies are *TRACT* and *TAZ*. The settings below are for this four geography application, but the user can configure PopulationSim for any number of geographies and use different geography names.
 
-Some of the setting are configured differently for the *repop* mode. The settings specific to the *repop* mode are described in the :ref:`settings_repop` section. The settings specific to the *multiprocessing* setup are described in the :ref:`settings_mp` section.
+Some of the setting are configured differently for the *repop* mode. The settings specific to the *repop* mode are described in the :ref:`settings_repop` section. The settings specific to the *multiprocessing* setup are described in the :ref:`settings_mp` section.
 
 **Algorithm/Software Configuration**:
 
@@ -395,11 +395,11 @@ Note that Seed-Households, Seed-Persons and Geographic CrossWalk are all require
   - tablename: households
     filename : seed_households.csv
     index_col: hh_id
-    column_map:
+    rename_columns:
       hhnum: hh_id
   - tablename: persons
     filename : seed_persons.csv
-    column_map:
+    rename_columns:
       hhnum: hh_id
       SPORDER: per_num
     # drop mixed type fields that appear to have been incorrectly generated
@@ -414,7 +414,7 @@ Note that Seed-Households, Seed-Persons and Geographic CrossWalk are all require
       - naicsp07
   - tablename: geo_cross_walk
     filename : geo_cross_walk.csv
-    column_map:
+    rename_columns:
       TRACTCE: TRACT
   - tablename: TAZ_control_data
     filename : control_totals_taz.csv
@@ -454,7 +454,7 @@ Note that Seed-Households, Seed-Persons and Geographic CrossWalk are all require
@@ -859,7 +859,7 @@ Some conventions for writing expressions:
 * Expressions must be vectorized expressions and can use most numpy and pandas expressions.
 * When editing the CSV files in Excel, use single quote ' or space at the start of a cell to get Excel to accept the expression
 
-.. _importance:
+.. _importance:
 
 What are importance weights
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -882,18 +882,18 @@ Where, :math:`z_{i}` are relaxation factors and :math:`a_{in}` are incidence val
 
 Where, :math:`u_{i}` are the penalties termed as importance factors or importance weights in PopulationSim.
 
-:math:`x_{n}` and :math:`z_{i}` are the parameters solved by the optimization while importance weights (:math:`u_{i}`) are the hyperparameters that are exposed to the user and impact the optimization externally. The objective of the relative entropy optimization is to find a set of weights that are uniform and satisfy marginal controls. The importance weights allow the user to trade-off between these objectives. High importance weights (e.g., 1E10) on all controls result in a hard constrained optimization which gives a high preference to matching marginal controls. Low importance weights (e.g., <50) results in an almost unconstrained problem. The user may also specify different importance weights for each marginal control. In this case, the controls with higher importance weights are given preference over the ones with low importance weights. Therefore, both absolute and relative value of the importance weights impacts the optimization problem and the solution.
+:math:`x_{n}` and :math:`z_{i}` are the parameters solved by the optimization while importance weights (:math:`u_{i}`) are the hyperparameters that are exposed to the user and impact the optimization externally. The objective of the relative entropy optimization is to find a set of weights that are uniform and satisfy marginal controls. The importance weights allow the user to trade-off between these objectives. High importance weights (e.g., 1E10) on all controls result in a hard constrained optimization which gives a high preference to matching marginal controls. Low importance weights (e.g., <50) results in an almost unconstrained problem. The user may also specify different importance weights for each marginal control. In this case, the controls with higher importance weights are given preference over the ones with low importance weights. Therefore, both absolute and relative value of the importance weights impacts the optimization problem and the solution.
 
-.. _setting-importance:
+.. _setting-importance:
 
 Setting importance weights
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 Given the flexibility that importance weights offer to the user, they need to be tuned to get the desired optimality in the outputs for the given seed sample and marginal controls. The quality of the outputs is defined by a uniformity measure of the weights and goodness of fit across marginal controls. Here are general guidelines on setting importance weights:
 
 * Start with a reasonable importance factor value across all controls (e.g., 1000 has typically worked well for multiple regions). This excludes the control on the total number of households which should be set to very high importance to ensure that the right number of households is generated for each zone.
-* After achieving reasonable goodness of fit across controls, the importance weights can be increased/decreased to favor one control over the other, or all importance weights can be reduced to improve the uniformity of the weights. Which controls to favor depends on the type of application and the quality of the marginal data.
-* The importance weights are generally updated in factors of 10. The user may need to run PopulationSim multiple times using various combinations of importance weights to reach the desired quality of outputs.
+* After achieving reasonable goodness of fit across controls, the importance weights can be increased/decreased to favor one control over the other, or all importance weights can be reduced to improve the uniformity of the weights. Which controls to favor depends on the type of application and the quality of the marginal data.
+* The importance weights are generally updated in factors of 10. The user may need to run PopulationSim multiple times using various combinations of importance weights to reach the desired quality of outputs.
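
The tuning guidelines in the documentation text above can be sketched as a small helper that lists candidate importance weights in factors of 10 around a starting value and keeps the total-households control at a very high weight. This is only an illustration and not part of PopulationSim: the function names, the control names ``num_hh`` and ``hh_size_1``, and the choice of 1E10 for the household-total weight are assumptions made for the example.

```python
def candidate_weights(start=1000, steps=2, factor=10):
    """List integer weights spaced by `factor`, `steps` below and above `start`."""
    low = start // factor ** steps  # smallest candidate in the sweep
    return [low * factor ** i for i in range(2 * steps + 1)]


def build_importance(controls, weight, total_hh_control="num_hh", total_hh_weight=1e10):
    """Give every control the same weight, except the household-total control.

    `total_hh_control` and `total_hh_weight` are hypothetical defaults chosen
    for this sketch; pick values that suit your own controls file.
    """
    return {
        name: (total_hh_weight if name == total_hh_control else weight)
        for name in controls
    }


# One importance mapping per candidate weight; each would correspond to a
# separate PopulationSim run whose fit and weight uniformity are then compared.
controls = ["num_hh", "hh_size_1"]  # hypothetical control names
sweeps = [build_importance(controls, w) for w in candidate_weights()]
```

Each mapping in ``sweeps`` corresponds to one trial configuration; writing the values into the controls file and running PopulationSim is left to the user.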