
Potential input checking and error handling improvements

Ben Stabler edited this page Mar 23, 2021 · 22 revisions

Work-in-progress

Purpose

These potential improvements aim to make it easier to stand up and use an ActivitySim model. The list responds to phase 6a, task 7.

Potential Improvements

| Idea | Level-of-effort | Notes | Priority |
| --- | --- | --- | --- |
| Check input data consistency | Days | Plan to check all the key relationships. Check primary key table joins across input tables: HH home zone vs. TAZ/MAZ land use file, MAZtoTAP file TAPs vs. TAP skims, TAZs in the land use file vs. TAZ skims, etc. | High |
| Check for well formed config files | Days | Check settings files, expression files, and config files at the start of a run to avoid parsing issues later on. | Low |
| Better describe silly errors | Days depending on LOE | For example, when a submodel YAML file cannot be found, print more useful information by adding exception handling. | Low |
| Print info about no alternative found | Hours | When no alternative is found in a choice model, log which terms in the utility turn off all alternatives, since this is really helpful for debugging. | High |
| Reduce pipeline file size | Weeks | Instead of checkpointing the complete table in the pipeline after each submodel, only add the additional fields and/or rows in order to reduce file size. Requires significant design work. | Low |
| From String Data to Categorical Data | Days | Switch string type data to categorical type data to improve runtime and reduce pipeline file size. | Low |
| Improve RAM/chunking settings | Days depending on LOE | Make it easier for new users to choose reasonable chunking and processor settings, since this is tricky. Can we create better defaults or just improve the documentation? | High |
| Eliminate distance==0 problems | Days depending on LOE | Users would need to identify distance skims in the settings files, since ActivitySim doesn't know what "distance" is; ActivitySim would then ensure distance is > 0 where "appropriate" to avoid funny simulation situations. Alternatively, the skim reader could check whether the diagonal of a skim sums to 0, whether any rows or columns sum to 0, or whether the MAZtoMAZ or MAZtoTAP tables sum to 0 in certain ways. | Medium |
| Clean up command prompt logging | Hours depending on LOE | Be more intentional about what is logged to the console vs. the log file, since a lot of info is currently logged. | Medium |
| Avoid long trace file names | Hours | Shorten filenames (especially in estimation mode) because of Windows filename length limits. | Low |
| Tidy up documentation related to settings file inheritance | Hours | Settings file inheritance is used by several examples, but it can be confusing to new users. | Low |
| Log dependent package versions | Hours | This is useful for debugging. | Low |
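
As a rough illustration of the "Check input data consistency" idea, the sketch below reports foreign key values in one input table that have no match in another. It uses only pandas; the helper name `find_orphan_keys`, the toy tables, and the column names (`home_zone_id`, `zone_id`, `household_id`) are assumptions made for this example, not existing ActivitySim code or a confirmed input schema.

```python
import pandas as pd


def find_orphan_keys(child: pd.DataFrame, child_col: str,
                     parent: pd.DataFrame, parent_col: str) -> pd.Series:
    """Return the distinct child[child_col] values absent from parent[parent_col]."""
    missing = ~child[child_col].isin(parent[parent_col])
    return child.loc[missing, child_col].drop_duplicates().reset_index(drop=True)


# Toy input tables; the column names are illustrative assumptions.
land_use = pd.DataFrame({"zone_id": [1, 2, 3]})
households = pd.DataFrame({
    "household_id": [10, 11, 12, 13],
    "home_zone_id": [1, 2, 4, 4],
})

# Households whose home zone does not appear in the land use table.
orphans = find_orphan_keys(households, "home_zone_id", land_use, "zone_id")
if not orphans.empty:
    print(f"{len(orphans)} home zone(s) missing from land use: {orphans.tolist()}")
```

The same helper could in principle be pointed at the other relationships listed in the table (for example MAZtoTAP TAPs vs. TAP skim zones), provided the skim zone identifiers are first loaded into a DataFrame column.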

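For the "Eliminate distance==0 problems" idea, the skim reader checks mentioned in the notes could look something like the following. This is a minimal sketch assuming the skim has already been loaded as a square NumPy array; `check_distance_skim` is a hypothetical helper, not part of ActivitySim, and the indices it reports are zero-based array positions rather than zone IDs.

```python
from typing import List

import numpy as np


def check_distance_skim(skim: np.ndarray) -> List[str]:
    """Return human-readable descriptions of suspicious zeros in a square skim."""
    if skim.ndim != 2 or skim.shape[0] != skim.shape[1]:
        return [f"skim is not square: shape {skim.shape}"]

    problems = []
    # A diagonal that sums to 0 suggests intrazonal distances were never filled in.
    if np.trace(skim) == 0:
        problems.append("diagonal sums to 0")
    # Rows or columns that sum to 0 suggest zones with no connectivity at all.
    zero_rows = np.flatnonzero(skim.sum(axis=1) == 0)
    zero_cols = np.flatnonzero(skim.sum(axis=0) == 0)
    if zero_rows.size:
        problems.append(f"rows summing to 0: {zero_rows.tolist()}")
    if zero_cols.size:
        problems.append(f"columns summing to 0: {zero_cols.tolist()}")
    return problems


# Toy 3-zone distance skim (made-up values) with an empty diagonal and an isolated zone.
dist = np.array([[0.0, 1.5, 0.0],
                 [1.5, 0.0, 0.0],
                 [0.0, 0.0, 0.0]])
issues = check_distance_skim(dist)
for issue in issues:
    print("WARNING:", issue)
```

Returning a list of messages rather than raising immediately lets the caller decide whether a given finding should stop the run or only be logged, which matters because a zero is only a problem where distance is expected to be positive.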