Potential input checking and error handling improvements
Ben Stabler edited this page Mar 23, 2021 · 22 revisions
Work-in-progress
The purpose of the following potential improvements is to make it easier to stand up and use an ActivitySim model. This list of improvements is in response to phase 6a task 7.
| Idea | Level-of-effort | Notes | Priority |
|---|---|---|---|
| Check input data consistency | Days | Check all key relationships across the input tables, i.e. that primary key joins resolve: HH home zone vs. TAZ/MAZ land use file, MAZtoTAP file TAPs vs. TAP skims, TAZs in the land use file vs. TAZ skims, etc. | High |
| Check for well formed config files | Days | Check settings files, expression files, and config files at the start of a run to avoid parsing issues later on. | Low |
| Better describe silly errors | Days depending on LOE | For example, when a submodel yaml file cannot be found, print more useful information by adding exception handling. | Low |
| Print info about no alternative found | Hours | When no alternative is found in a choice model, log which terms in the utility turn off all alternatives, since this is very helpful when debugging. | High |
| Reduce pipeline file size | Weeks | Instead of checkpointing the complete table in the pipeline after each submodel, only add the additional fields and/or rows in order to reduce file size. Requires significant design work. | Low |
| From String Data to Categorical Data | Days | Switch string type data to categorical type data to improve runtime and reduce pipeline file size. | Low |
| Improve RAM/chunking settings | Days depending on LOE | Make it easier for new users to choose reasonable chunking and processor settings, since this is tricky. Can we create better defaults or just improve the documentation? | High |
| Eliminate distance==0 problems | Days depending on LOE | Users would need to identify the distance skims in the settings files, since ActivitySim doesn't know what "distance" is, and ActivitySim would then ensure distance is > 0 where "appropriate" to avoid funny simulation situations. Alternatively, the skim reader could check for a skim diagonal that sums to 0, for rows or columns that sum to 0, or for MAZtoMAZ or MAZtoTAP tables that sum to 0 in certain ways. | Medium |
| Clean up command prompt logging | Hours depending on LOE | Be more intentional about what is logged to the console vs. log file since there's a lot of info currently. | Medium |
| Avoid long trace file names | Hours | Reduce filenames (especially in estimation mode) because of Windows filename limits. | Low |
| Tidy up documentation related to settings file inheritance | Hours | Settings file inheritance is used by several examples, but it can be confusing to new users. | Low |
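
To illustrate the "Check input data consistency" idea, here is a minimal sketch of a key check using pandas. It is not existing ActivitySim code: the function name `find_orphan_keys`, the toy tables, and the column names (`home_zone_id`, `zone_id`) are assumptions chosen for the example.

```python
import pandas as pd


def find_orphan_keys(child, child_col, parent, parent_col):
    """Return the distinct values of child[child_col] with no match in parent[parent_col].

    Hypothetical helper for illustration; not part of ActivitySim.
    """
    child_keys = child[child_col].dropna().drop_duplicates()
    missing = child_keys[~child_keys.isin(parent[parent_col])]
    return missing.sort_values().reset_index(drop=True)


# Toy inputs (assumed column names): household 3 lives in a zone
# that is absent from the land use table.
households = pd.DataFrame(
    {"household_id": [1, 2, 3], "home_zone_id": [10, 11, 99]}
)
land_use = pd.DataFrame({"zone_id": [10, 11, 12]})

orphans = find_orphan_keys(households, "home_zone_id", land_use, "zone_id")
if not orphans.empty:
    # Report the problem at the start of a run instead of failing later in a join.
    print(f"{len(orphans)} home zone(s) missing from land use: {orphans.tolist()}")
```

The same helper could be pointed at the other relationships listed in the table (for example MAZtoTAP TAPs vs. the TAP skim index), provided both sides are available as columns.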
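
For the "Print info about no alternative found" idea, the sketch below shows one possible way to report which utility terms make each alternative unavailable for a single chooser. It assumes the per-term utility contributions are available as a table and that a large negative value (here `-999`) marks an alternative as unavailable. Both are assumptions for illustration, as are the term and alternative names.

```python
import pandas as pd

UNAVAILABLE = -999.0  # assumed sentinel for "this term turns the alternative off"


def terms_disabling_alternatives(term_utils, threshold=UNAVAILABLE):
    """Map each alternative to the utility terms at or below the threshold.

    term_utils: rows are utility terms, columns are alternatives, values are
    the contribution of each term for one chooser. Hypothetical helper.
    """
    blocked = term_utils.le(threshold)
    return {
        alt: blocked.index[blocked[alt].to_numpy()].tolist()
        for alt in term_utils.columns
    }


# Toy chooser: each alternative is switched off by a different term.
term_utils = pd.DataFrame(
    {"drive": [-999.0, 0.0, -1.2], "transit": [0.0, -999.0, -0.4]},
    index=["zero_autos", "no_transit_path", "distance"],
)

report = terms_disabling_alternatives(term_utils)
if all(report.values()):
    # Every alternative is blocked by at least one term: log the culprits.
    for alt, terms in report.items():
        print(f"alternative '{alt}' is turned off by: {terms}")
```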
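
The "From String Data to Categorical Data" idea can be sketched with pandas as follows. The helper name `strings_to_categoricals`, the `max_categories` cutoff, and the toy `tours` table are assumptions for illustration. Any expressions that treat these columns as plain strings would need to be checked after such a conversion.

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype


def strings_to_categoricals(df, max_categories=50):
    """Return a copy of df with low-cardinality string columns as categoricals.

    max_categories is an assumed cutoff: columns with more distinct values
    are left unchanged. Hypothetical helper, not part of ActivitySim.
    """
    out = df.copy()
    for col in out.columns:
        is_text = is_object_dtype(out[col]) or is_string_dtype(out[col])
        if is_text and out[col].nunique() <= max_categories:
            out[col] = out[col].astype("category")
    return out


# Toy table: 3,000 tours with a repeated string purpose.
tours = pd.DataFrame(
    {"tour_id": range(3000), "purpose": ["work", "school", "shop"] * 1000}
)

converted = strings_to_categoricals(tours)

# Compare in-memory size before and after the conversion.
before = tours.memory_usage(deep=True).sum()
after = converted.memory_usage(deep=True).sum()
print(f"memory before: {before} bytes, after: {after} bytes")
```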
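
The skim-reader alternative under "Eliminate distance==0 problems" could look roughly like the sketch below, which flags zero diagonal cells and all-zero rows or columns in a square zone-to-zone matrix. It uses a plain NumPy array as a stand-in for a distance skim; the function name and the toy matrix are assumptions, and the reported indices are 0-based positions rather than zone IDs.

```python
import numpy as np


def summarize_zero_distance(skim):
    """Report positions where a square skim matrix looks suspicious.

    Hypothetical helper: flags zero diagonal cells (intrazonal distance of 0)
    and rows or columns that sum to 0.
    """
    skim = np.asarray(skim, dtype=float)
    if skim.ndim != 2 or skim.shape[0] != skim.shape[1]:
        raise ValueError("expected a square zone-to-zone matrix")
    return {
        "zero_diagonal": np.flatnonzero(np.diag(skim) == 0).tolist(),
        "zero_rows": np.flatnonzero(skim.sum(axis=1) == 0).tolist(),
        "zero_cols": np.flatnonzero(skim.sum(axis=0) == 0).tolist(),
    }


# Toy 3-zone distance skim: zones 0 and 2 have zero intrazonal distance,
# and zone 2 has no distance to any destination.
dist = np.array(
    [
        [0.0, 1.2, 2.5],
        [1.2, 0.3, 0.0],
        [0.0, 0.0, 0.0],
    ]
)

issues = summarize_zero_distance(dist)
for name, positions in issues.items():
    if positions:
        print(f"{name}: {positions}")
```

Whether a zero is actually a problem depends on the skim, which is why the table notes that users would first need to identify which skims represent distance.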