Potential input checking and error handling improvements
Ben Stabler edited this page Mar 24, 2021 · 22 revisions
Work-in-progress
The purpose of the following potential improvements is to make it easier to stand up and use an ActivitySim model. This list of improvements is in response to phase 6a task 7.
| Idea | Level-of-effort | Notes | Priority |
|---|---|---|---|
| Check input data consistency | Days | Plan to check all the key relationships. Check primary key table joins across input tables - HH home zone vs. TAZ/MAZ land use file, MAZtoTAP file TAPs vs. TAP skims, TAZs in the land use file vs. TAZ skims, etc. | High |
| Check for well-formed config files | Days | Check settings files, expression files, and config files at the start of a run to avoid parsing issues later on. | Low |
| Better describe silly errors | Days depending on LOE | For example, when a submodel yaml file cannot be found, add exception handling that prints more useful info, such as the name of the missing file. | Low |
| Print info about no alternative found | Hours | When no alternative is found in a choice model, log which terms in the utility turn off all alternatives since this is really helpful info in debugging. | High |
| Turn off restarting if desired | Hours | Add a global setting to optionally turn off restarting. This saves disk space (and some runtime) by saving just one version of each table instead of every state of the tables in the pipeline. This is especially useful when running lots of model scenarios in application mode. | Medium |
| Improve RAM/chunking settings | Days depending on LOE | Make it easier for new users to set reasonable chunking and processor settings, since this is tricky. Can we create better defaults or just improve the documentation? Maybe specify an amount of overall machine RAM to use instead of a chunksize? | High |
| Eliminate distance==0 problems | Days depending on LOE | Would need users to identify distance skims in the settings files, since ActivitySim doesn't know what "distance" is, and then ensure it is > 0 where "appropriate" to avoid funny simulation situations. Alternatively, the skim reader could check for the diagonal sum of skims being 0, for rows or columns summing to 0, or for the MAZtoMAZ or MAZtoTAP tables summing to 0 in certain ways. | Medium |
| Clean up command prompt logging | Hours depending on LOE | Be more intentional about what is logged to the console vs. log file since there's a lot of info currently. | Medium |
| Avoid long trace file names | Hours | Shorten trace file names (especially in estimation mode) because of Windows filename limits. | Low |
| Tidy up documentation related to settings file inheritance | Hours | Settings file inheritance is used by several examples, but it can be confusing to new users. | Low |
| Log dependent package versions | Hours | This is useful for debugging, especially since pytables (for hdf5) behaves a little differently across versions and OSes. | Low |
| Log time to solve each expression | Hours | Identify which expressions take a long time to run so that they can hopefully be rewritten in a faster form. | Low |
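
The primary key join checks in the "Check input data consistency" row could be sketched roughly as follows. This is a minimal illustration in plain Python, not ActivitySim code: the function `find_missing_keys`, the check labels, and the toy zone IDs are all hypothetical, and a real implementation would read the keys from the actual input tables and skims.

```python
def find_missing_keys(child_keys, parent_keys):
    """Return the sorted child key values that have no match in the parent keys."""
    return sorted(set(child_keys) - set(parent_keys))


# Hypothetical toy inputs standing in for real input tables.
land_use_zones = [1, 2, 3, 4]          # zones listed in the land use file
household_home_zones = [1, 2, 2, 5, 7]  # home zone of each household
skim_zones = [1, 2, 3]                 # zones present in the skims

# Each check pairs a "child" key column with the "parent" keys it must join to.
checks = {
    "households.home_zone -> land_use.zone": (household_home_zones, land_use_zones),
    "land_use.zone -> skims.zone": (land_use_zones, skim_zones),
}

# Keep only the checks that found unmatched keys.
problems = {}
for name, (child, parent) in checks.items():
    missing = find_missing_keys(child, parent)
    if missing:
        problems[name] = missing

for name, missing in problems.items():
    print(f"Inconsistent input: {name} has unmatched keys {missing}")
```

Running all checks before reporting, rather than stopping at the first failure, lets a user fix every inconsistent input in one pass.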
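
The skim reader alternative in the "Eliminate distance==0 problems" row (a zero diagonal sum, or rows or columns summing to 0) might look something like the sketch below. It assumes a square skim held as a list of lists; the function `zero_skim_report` and the toy matrix are hypothetical and are not part of ActivitySim.

```python
def zero_skim_report(skim):
    """Flag suspicious zeros in a square skim matrix given as a list of rows."""
    n = len(skim)
    return {
        # True when the diagonal (intrazonal) values sum to 0.
        "diagonal_sum_is_zero": sum(skim[i][i] for i in range(n)) == 0,
        # Indexes of origin rows whose values sum to 0.
        "zero_rows": [i for i, row in enumerate(skim) if sum(row) == 0],
        # Indexes of destination columns whose values sum to 0.
        "zero_columns": [j for j in range(n) if sum(row[j] for row in skim) == 0],
    }


# Hypothetical 3-zone distance skim: zone index 2 is disconnected.
toy_skim = [
    [0.0, 1.2, 0.0],
    [1.2, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]

report = zero_skim_report(toy_skim)
print(report)
```

A check like this only flags candidates. Whether a zero is actually a problem still depends on which skims represent distance, which is why the row above also suggests having users identify them in the settings files.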
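
For the "Log dependent package versions" row, one possible approach is to query installed package metadata at the start of a run. The sketch below uses the standard library `importlib.metadata` module (Python 3.8+); the helper names, the logger name, and the list of packages are assumptions made for illustration (`tables` is the distribution name under which pytables is published).

```python
import logging
from importlib.metadata import PackageNotFoundError, version

logger = logging.getLogger("version_check")  # hypothetical logger name


def collect_versions(package_names):
    """Map each package name to its installed version, or 'not installed'."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = version(name)
        except PackageNotFoundError:
            versions[name] = "not installed"
    return versions


def log_versions(package_names):
    """Log one line per package and return the collected versions."""
    versions = collect_versions(package_names)
    for name, ver in versions.items():
        logger.info("package %s: %s", name, ver)
    return versions


logging.basicConfig(level=logging.INFO)
# Assumed list of dependencies worth reporting; adjust as needed.
installed = log_versions(["pandas", "numpy", "tables"])
```

Recording a missing package as "not installed" instead of raising keeps the version report from becoming a new source of startup failures.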