Description
It should be relatively straightforward to create a tool that can determine many of the code constructs that are causing problems for a specific tool. Using the expected results full details file and YAML file from a generated test suite, plus the actual results for a particular tool (from BenchmarkScore), do something like:
- Generate a list of every code snippet used to generate that test suite (straight from the YAML file?).
- Create a bidirectional data structure, like so (see the sketch after this list):
  - Create a data structure for each code snippet that links to every test case that uses it.
  - Create a data structure for each test case that links to every code snippet used in it.
- Pass 1: Go through each True Positive detected by the tool and mark each code snippet used in it as "correctly understood" (i.e., both, or all 3, of its snippets).
- Pass 2: Go through each test case and identify any where only one snippet is left not marked as "understood", and generate lists of the sources, dataflows, and sinks to focus on.
- Sanity check: See if there are any test cases where every snippet is marked as understood, but the tool still reports a False Positive.
- Once this is working, update the tool to automatically calculate this for every actual results file in the /scorecard directory (i.e., do this for ALL tools).
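A minimal sketch of what the bidirectional structure and the two passes could look like. This is not BenchmarkUtils code: the class, method, and field names are hypothetical, the snippet and test-case names are assumed to have already been parsed out of the test suite YAML and the tool's actual results, and splitting the suspects into separate source/dataflow/sink lists is left out since it only depends on the snippet metadata in the YAML. The issue also doesn't pin down exactly which test cases Pass 2 scans, so the sketch just takes a set of "missed" test cases as a parameter.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class SnippetCoverageAnalyzer {

    // Bidirectional mapping: snippet -> test cases that use it, and
    // test case -> snippets it was generated from.
    private final Map<String, Set<String>> testsBySnippet = new HashMap<>();
    private final Map<String, Set<String>> snippetsByTest = new HashMap<>();

    // Snippets the tool appears to handle correctly (filled in by Pass 1).
    private final Set<String> understood = new HashSet<>();

    /** Record that a test case uses a snippet (called once per YAML entry). */
    public void link(String testCase, String snippet) {
        testsBySnippet.computeIfAbsent(snippet, k -> new HashSet<>()).add(testCase);
        snippetsByTest.computeIfAbsent(testCase, k -> new HashSet<>()).add(snippet);
    }

    /** Pass 1: every snippet used by a test case the tool scored as a
     *  True Positive is marked as "correctly understood". */
    public void markUnderstood(Set<String> truePositiveTests) {
        for (String tc : truePositiveTests) {
            understood.addAll(snippetsByTest.getOrDefault(tc, Set.of()));
        }
    }

    /** Pass 2: among the test cases the tool got wrong, return every snippet
     *  that is the only "not understood" snippet in some test case. */
    public Set<String> suspectSnippets(Set<String> missedTests) {
        Set<String> suspects = new TreeSet<>();
        for (String tc : missedTests) {
            List<String> notUnderstood = new ArrayList<>();
            for (String snippet : snippetsByTest.getOrDefault(tc, Set.of())) {
                if (!understood.contains(snippet)) {
                    notUnderstood.add(snippet);
                }
            }
            if (notUnderstood.size() == 1) {
                suspects.add(notUnderstood.get(0));
            }
        }
        return suspects;
    }

    /** Sanity check: test cases whose snippets are all marked as understood,
     *  yet the tool still scored them incorrectly. */
    public Set<String> anomalies(Set<String> missedTests) {
        Set<String> odd = new TreeSet<>();
        for (String tc : missedTests) {
            Set<String> used = snippetsByTest.getOrDefault(tc, Set.of());
            if (!used.isEmpty() && understood.containsAll(used)) {
                odd.add(tc);
            }
        }
        return odd;
    }
}
```

Running it per tool would then just be: call link() for every snippet/test-case pair from the YAML, markUnderstood() with the tool's TPs, and suspectSnippets()/anomalies() with the test cases it got wrong.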
I was thinking this might require multiple analysis phases, but I think that's it?
Stage 2:
- Do something similar to detect False Positive problem areas, but by analyzing the findings a tool reports as TPs that are actually FPs.
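One possible reading of "something similar" (not specified in the issue) is to reuse the analyzer sketched above with the opposite sets: seed Pass 1 with the test cases the tool scored as True Negatives, then run Pass 2 and the sanity check over its False Positives. A hypothetical helper for that:

```java
import java.util.Set;

public class FalsePositiveAnalysis {

    /** Stage 2 sketch: mark snippets as understood from the True Negatives,
     *  then look for the constructs that keep showing up in False Positives. */
    public static void report(SnippetCoverageAnalyzer analyzer,
                              Set<String> trueNegativeTests,
                              Set<String> falsePositiveTests) {
        analyzer.markUnderstood(trueNegativeTests);
        System.out.println("FP-prone snippets: " + analyzer.suspectSnippets(falsePositiveTests));
        System.out.println("Unexplained FPs:   " + analyzer.anomalies(falsePositiveTests));
    }
}
```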