To compare our symbolic execution engine with others, we need to report the same metrics they do. In the papers we've seen, researchers typically report line coverage and branch coverage. This issue breaks down into the following steps:
- First, decide which metrics to generate. This will likely be some combination of branch and line coverage, but that assumption should be double-checked against the papers we want to compare against
- Second, parse LLVM debug information to map between instructions and source code lines (see the LLVM debug info docs); a sketch follows this list
- Third, collect the metrics in the interpreter, possibly in a global data structure owned by the Executor (see the sketch below)
- Finally, write these metrics out in a common file format, e.g. gcov's, whose binary format is documented in gcov-io.h (a text-format sketch follows)
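
For the debug-info step, here is a minimal sketch assuming we go through the LLVM C++ API and the target bitcode was compiled with `-g` (otherwise no `!dbg` metadata is attached). `coverableLines` is a hypothetical helper of ours, not an existing LLVM function:

```cpp
// Sketch: map instructions to source lines via !dbg metadata.
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/DebugInfoMetadata.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Instruction.h"

#include <set>
#include <string>
#include <utility>

// Collect the (file, line) pairs that carry at least one instruction,
// i.e. the denominator for line coverage within one function.
std::set<std::pair<std::string, unsigned>>
coverableLines(const llvm::Function &F) {
  std::set<std::pair<std::string, unsigned>> Lines;
  for (const llvm::BasicBlock &BB : F)
    for (const llvm::Instruction &I : BB)
      if (const llvm::DILocation *Loc = I.getDebugLoc())
        // Line 0 means "no meaningful source location"; skip it.
        if (Loc->getLine() != 0)
          Lines.insert({Loc->getFilename().str(), Loc->getLine()});
  return Lines;
}
```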
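
For the collection step, a hedged sketch of what the Executor-owned data structure could look like; the class and method names here are our own invention:

```cpp
// Sketch: coverage counters the Executor could own.
#include <cstdint>
#include <map>
#include <string>
#include <utility>

class CoverageTracker {
public:
  // Called by the interpreter each time it steps an instruction that
  // carries a DILocation.
  void recordLine(const std::string &File, unsigned Line) {
    ++LineHits[{File, Line}];
  }

  // Called at each conditional branch; BranchId identifies the branch
  // instruction (e.g. its address) and Taken is the direction followed.
  void recordBranch(std::uintptr_t BranchId, bool Taken) {
    ++BranchHits[{BranchId, Taken}];
  }

private:
  // (filename, line) -> execution count.
  std::map<std::pair<std::string, unsigned>, std::uint64_t> LineHits;
  // (branch id, direction) -> execution count; branch coverage is then
  // the fraction of (branch, direction) pairs ever seen.
  std::map<std::pair<std::uintptr_t, bool>, std::uint64_t> BranchHits;
};
```

Keeping a single tracker on the Executor (rather than per-state counters) means the interpreter only needs a reference to one object, at the cost of merging coverage across all explored paths.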
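
For the output step, the binary .gcda/.gcno format in gcov-io.h is nontrivial, so as a first cut we could emit gcov's human-readable text layout (`count:line:source`, with `-` for non-coverable lines and `#####` for unexecuted ones). `writeGcovText` and its `LineHits` argument are hypothetical; the counts would come from the tracker sketched above:

```cpp
// Sketch: dump per-line hit counts in gcov's text layout.
#include <cstdint>
#include <fstream>
#include <iomanip>
#include <map>
#include <string>

void writeGcovText(const std::string &SourcePath,
                   const std::map<unsigned, std::uint64_t> &LineHits) {
  std::ifstream Src(SourcePath);
  std::ofstream Out(SourcePath + ".gcov");
  std::string Text;
  unsigned Line = 1;
  while (std::getline(Src, Text)) {
    auto It = LineHits.find(Line);
    if (It == LineHits.end())
      Out << std::setw(9) << "-";        // line has no instructions
    else if (It->second == 0)
      Out << std::setw(9) << "#####";    // coverable but never executed
    else
      Out << std::setw(9) << It->second; // execution count
    Out << ":" << std::setw(5) << Line << ":" << Text << "\n";
    ++Line;
  }
}
```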