Commit 62baf02

Add clarification to how the Overall Score is calculated for each tool.
1 parent ae0e333 commit 62baf02

6 files changed: 25 additions and 11 deletions

scorecard/Benchmark_v1.2beta_Scorecard_for_FBwFindSecBugs.html

Lines changed: 3 additions & 2 deletions
@@ -125,8 +125,9 @@ <h2>Detailed Results</h2>
 <tr ><td>Weak Hash Algorithm</td><td>89</td><td>40</td><td>107</td><td>0</td><td>236</td><td>68.99%</td><td>0.00%</td><td>68.99%</td></tr>
 <tr class="success"><td>Weak Random Number</td><td>218</td><td>0</td><td>275</td><td>0</td><td>493</td><td>100.00%</td><td>0.00%</td><td>100.00%</td></tr>
 <tr class="danger"><td>XPath Injection</td><td>15</td><td>0</td><td>0</td><td>20</td><td>35</td><td>100.00%</td><td>100.00%</td><td>0.00%</td></tr>
-<th>Totals</th><th>1028</th><th>387</th><th>791</th><th>534</th><th>2740</th><th>77.67%</th><th>45.21%</th><th>32.46%</th></tr>
-</table>
+<th>Totals*</th><th>1028</th><th>387</th><th>791</th><th>534</th><th>2740</th><th/><th/><th/></tr>
+<th>Overall Results*</th><th/><th/><th/><th/><th/><th>77.67%</th><th>45.21%</th><th>32.46%</th></tr>
+</table><p>*-The Overall Results are averages across all the vulnerability categories. You can't compute these averages by simply calculating the TPR and FPR rates using the values in the Totals row. If you did that, categories with larger number of tests would carry more weight than categories with less tests. The proper calculation of the Overall Results is to add up all the TPR, FPR, and Score values, and then divide by the number of vulnerability categories, which is how they are calculated.<p/>
 <p>


scorecard/Benchmark_v1.2beta_Scorecard_for_FindBugs.html

Lines changed: 3 additions & 2 deletions
@@ -125,8 +125,9 @@ <h2>Detailed Results</h2>
 <tr class="danger"><td>Weak Hash Algorithm</td><td>0</td><td>129</td><td>107</td><td>0</td><td>236</td><td>0.00%</td><td>0.00%</td><td>0.00%</td></tr>
 <tr class="danger"><td>Weak Random Number</td><td>0</td><td>218</td><td>275</td><td>0</td><td>493</td><td>0.00%</td><td>0.00%</td><td>0.00%</td></tr>
 <tr class="danger"><td>XPath Injection</td><td>0</td><td>15</td><td>20</td><td>0</td><td>35</td><td>0.00%</td><td>0.00%</td><td>0.00%</td></tr>
-<th>Totals</th><th>153</th><th>1262</th><th>1190</th><th>135</th><th>2740</th><th>5.26%</th><th>5.46%</th><th>-0.19%</th></tr>
-</table>
+<th>Totals*</th><th>153</th><th>1262</th><th>1190</th><th>135</th><th>2740</th><th/><th/><th/></tr>
+<th>Overall Results*</th><th/><th/><th/><th/><th/><th>5.26%</th><th>5.46%</th><th>-0.19%</th></tr>
+</table><p>*-The Overall Results are averages across all the vulnerability categories. You can't compute these averages by simply calculating the TPR and FPR rates using the values in the Totals row. If you did that, categories with larger number of tests would carry more weight than categories with less tests. The proper calculation of the Overall Results is to add up all the TPR, FPR, and Score values, and then divide by the number of vulnerability categories, which is how they are calculated.<p/>
 <p>


scorecard/Benchmark_v1.2beta_Scorecard_for_OWASP_ZAP.html

Lines changed: 3 additions & 2 deletions
@@ -125,8 +125,9 @@ <h2>Detailed Results</h2>
 <tr class="danger"><td>Weak Hash Algorithm</td><td>0</td><td>129</td><td>107</td><td>0</td><td>236</td><td>0.00%</td><td>0.00%</td><td>0.00%</td></tr>
 <tr class="danger"><td>Weak Random Number</td><td>0</td><td>218</td><td>275</td><td>0</td><td>493</td><td>0.00%</td><td>0.00%</td><td>0.00%</td></tr>
 <tr class="danger"><td>XPath Injection</td><td>0</td><td>15</td><td>20</td><td>0</td><td>35</td><td>0.00%</td><td>0.00%</td><td>0.00%</td></tr>
-<th>Totals</th><th>245</th><th>1170</th><th>1324</th><th>1</th><th>2740</th><th>18.03%</th><th>0.04%</th><th>17.99%</th></tr>
-</table>
+<th>Totals*</th><th>245</th><th>1170</th><th>1324</th><th>1</th><th>2740</th><th/><th/><th/></tr>
+<th>Overall Results*</th><th/><th/><th/><th/><th/><th>18.03%</th><th>0.04%</th><th>17.99%</th></tr>
+</table><p>*-The Overall Results are averages across all the vulnerability categories. You can't compute these averages by simply calculating the TPR and FPR rates using the values in the Totals row. If you did that, categories with larger number of tests would carry more weight than categories with less tests. The proper calculation of the Overall Results is to add up all the TPR, FPR, and Score values, and then divide by the number of vulnerability categories, which is how they are calculated.<p/>
 <p>


scorecard/Benchmark_v1.2beta_Scorecard_for_PMD.html

Lines changed: 3 additions & 2 deletions
@@ -125,8 +125,9 @@ <h2>Detailed Results</h2>
 <tr class="danger"><td>Weak Hash Algorithm</td><td>0</td><td>129</td><td>107</td><td>0</td><td>236</td><td>0.00%</td><td>0.00%</td><td>0.00%</td></tr>
 <tr class="danger"><td>Weak Random Number</td><td>0</td><td>218</td><td>275</td><td>0</td><td>493</td><td>0.00%</td><td>0.00%</td><td>0.00%</td></tr>
 <tr class="danger"><td>XPath Injection</td><td>0</td><td>15</td><td>20</td><td>0</td><td>35</td><td>0.00%</td><td>0.00%</td><td>0.00%</td></tr>
-<th>Totals</th><th>0</th><th>1415</th><th>1325</th><th>0</th><th>2740</th><th>0.00%</th><th>0.00%</th><th>0.00%</th></tr>
-</table>
+<th>Totals*</th><th>0</th><th>1415</th><th>1325</th><th>0</th><th>2740</th><th/><th/><th/></tr>
+<th>Overall Results*</th><th/><th/><th/><th/><th/><th>0.00%</th><th>0.00%</th><th>0.00%</th></tr>
+</table><p>*-The Overall Results are averages across all the vulnerability categories. You can't compute these averages by simply calculating the TPR and FPR rates using the values in the Totals row. If you did that, categories with larger number of tests would carry more weight than categories with less tests. The proper calculation of the Overall Results is to add up all the TPR, FPR, and Score values, and then divide by the number of vulnerability categories, which is how they are calculated.<p/>
 <p>


scorecard/Benchmark_v1.2beta_Scorecard_for_SonarQube_Java_Plugin.html

Lines changed: 3 additions & 2 deletions
@@ -125,8 +125,9 @@ <h2>Detailed Results</h2>
 <tr ><td>Weak Hash Algorithm</td><td>89</td><td>40</td><td>107</td><td>0</td><td>236</td><td>68.99%</td><td>0.00%</td><td>68.99%</td></tr>
 <tr class="success"><td>Weak Random Number</td><td>218</td><td>0</td><td>275</td><td>0</td><td>493</td><td>100.00%</td><td>0.00%</td><td>100.00%</td></tr>
 <tr class="danger"><td>XPath Injection</td><td>0</td><td>15</td><td>20</td><td>0</td><td>35</td><td>0.00%</td><td>0.00%</td><td>0.00%</td></tr>
-<th>Totals</th><th>607</th><th>808</th><th>1184</th><th>141</th><th>2740</th><th>50.36%</th><th>17.02%</th><th>33.34%</th></tr>
-</table>
+<th>Totals*</th><th>607</th><th>808</th><th>1184</th><th>141</th><th>2740</th><th/><th/><th/></tr>
+<th>Overall Results*</th><th/><th/><th/><th/><th/><th>50.36%</th><th>17.02%</th><th>33.34%</th></tr>
+</table><p>*-The Overall Results are averages across all the vulnerability categories. You can't compute these averages by simply calculating the TPR and FPR rates using the values in the Totals row. If you did that, categories with larger number of tests would carry more weight than categories with less tests. The proper calculation of the Overall Results is to add up all the TPR, FPR, and Score values, and then divide by the number of vulnerability categories, which is how they are calculated.<p/>
 <p>


src/main/java/org/owasp/benchmark/score/report/Report.java

Lines changed: 10 additions & 1 deletion
@@ -203,13 +203,16 @@ else if (r.truePositiveRate > .7 && r.falsePositiveRate < .3)
 if (!Double.isNaN(r.score))
 totalScore += r.score;
 }
-sb.append("<th>" + "Totals" + "</th>");
+sb.append("<th>Totals*</th>");
 sb.append("<th>" + totals.tp + "</th>");
 sb.append("<th>" + totals.fn + "</th>");
 sb.append("<th>" + totals.tn + "</th>");
 sb.append("<th>" + totals.fp + "</th>");
 int total = totals.tp + totals.fn + totals.tn + totals.fp;
 sb.append("<th>" + total + "</th>");
+sb.append("<th/><th/><th/></tr>\n");
+
+sb.append("<th>Overall Results*</th><th/><th/><th/><th/><th/>");
 double tpr = (totalTPR / scores.size());
 sb.append("<th>" + new DecimalFormat("#0.00%").format(tpr) + "</th>");
 double fpr = (totalFPR / scores.size());
@@ -218,6 +221,12 @@ else if (r.truePositiveRate > .7 && r.falsePositiveRate < .3)
 sb.append("<th>" + new DecimalFormat("#0.00%").format(score) + "</th>");
 sb.append("</tr>\n");
 sb.append("</table>");
+sb.append("<p>*-The Overall Results are averages across all the vulnerability categories. "
++ " You can't compute these averages by simply calculating the TPR and FPR rates using "
++ " the values in the Totals row. If you did that, categories with larger number of tests would carry "
++ " more weight than categories with less tests. The proper calculation of the Overall Results is to"
++ " add up all the TPR, FPR, and Score values, "
++ " and then divide by the number of vulnerability categories, which is how they are calculated.<p/>");
 
 return sb.toString();
 }
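
The footnote added in this commit distinguishes two ways of summarizing the table: averaging the per-category rates (what the Overall Results row reports) versus computing one rate from the summed counts in the Totals row. The following is a minimal sketch of that difference, not code from Report.java. The class OverallResultsSketch, the holder type CategoryCounts, and all method names are hypothetical; treating an empty denominator as a rate of 0 is also an assumption. The three sample rows are the ones visible in the FBwFindSecBugs hunk above, so the results will not match that scorecard's full Overall Results row, which covers every category.

```java
import java.text.DecimalFormat;
import java.util.Arrays;
import java.util.List;

public class OverallResultsSketch {

    /** Hypothetical holder for one category's counts; not a class from the Benchmark code. */
    static final class CategoryCounts {
        final String name;
        final int tp, fn, tn, fp;

        CategoryCounts(String name, int tp, int fn, int tn, int fp) {
            this.name = name;
            this.tp = tp;
            this.fn = fn;
            this.tn = tn;
            this.fp = fp;
        }

        // TPR = TP / (TP + FN). Assumption: an empty denominator counts as 0.
        double tpr() {
            return (tp + fn) == 0 ? 0.0 : (double) tp / (tp + fn);
        }

        // FPR = FP / (FP + TN). Same assumption for an empty denominator.
        double fpr() {
            return (fp + tn) == 0 ? 0.0 : (double) fp / (fp + tn);
        }
    }

    /** Three rows copied from the FBwFindSecBugs scorecard hunk (column order TP, FN, TN, FP). */
    static List<CategoryCounts> sampleRows() {
        return Arrays.asList(
                new CategoryCounts("Weak Hash Algorithm", 89, 40, 107, 0),
                new CategoryCounts("Weak Random Number", 218, 0, 275, 0),
                new CategoryCounts("XPath Injection", 15, 0, 0, 20));
    }

    /** Average of per-category TPRs: every category carries equal weight. */
    static double averagedTpr(List<CategoryCounts> rows) {
        double sum = 0.0;
        for (CategoryCounts c : rows) {
            sum += c.tpr();
        }
        return sum / rows.size();
    }

    /** Average of per-category FPRs: every category carries equal weight. */
    static double averagedFpr(List<CategoryCounts> rows) {
        double sum = 0.0;
        for (CategoryCounts c : rows) {
            sum += c.fpr();
        }
        return sum / rows.size();
    }

    /** TPR computed from summed counts: larger categories carry more weight. */
    static double pooledTpr(List<CategoryCounts> rows) {
        int tp = 0, fn = 0;
        for (CategoryCounts c : rows) {
            tp += c.tp;
            fn += c.fn;
        }
        return (tp + fn) == 0 ? 0.0 : (double) tp / (tp + fn);
    }

    /** FPR computed from summed counts: larger categories carry more weight. */
    static double pooledFpr(List<CategoryCounts> rows) {
        int fp = 0, tn = 0;
        for (CategoryCounts c : rows) {
            fp += c.fp;
            tn += c.tn;
        }
        return (fp + tn) == 0 ? 0.0 : (double) fp / (fp + tn);
    }

    public static void main(String[] args) {
        List<CategoryCounts> rows = sampleRows();
        DecimalFormat pct = new DecimalFormat("#0.00%");
        System.out.println("Averaged TPR: " + pct.format(averagedTpr(rows)));
        System.out.println("Pooled TPR:   " + pct.format(pooledTpr(rows)));
        System.out.println("Averaged FPR: " + pct.format(averagedFpr(rows)));
        System.out.println("Pooled FPR:   " + pct.format(pooledFpr(rows)));
    }
}
```

For these three rows the averaged FPR is about 33.33%, while the pooled FPR is about 4.98%: XPath Injection has only 35 tests, so its 100% false positive rate almost disappears once the counts are summed. The TPR values differ less (about 89.66% averaged versus 88.95% pooled).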
