Description
Assessment results on discrepancies in the SBOM ecosystem, with some suggestions
Background
As SBOMs can be widely used in software supply chain management, the capabilities and issues of the SBOM ecosystem can influence their adoption by users, so an accurate assessment of the current state of SBOMs is important. To this end, we have conducted a series of assessments of key characteristics in SBOM applications to reveal the potential discrepancies that hinder usage.
Questions
We asked 3 questions:
1. Compliance: Do SBOM tools generate outputs that adhere to user requirements and standards?
2. Consistency: Do SBOM tools maintain consistency in transforming the produced SBOM?
3. Accuracy: How accurately do the SBOMs produced by tools reflect the target software?
We assess these questions on 9970 SBOM documents generated by 6 SBOM tools (sbom-tool, ort, syft, gh-sbom, cdxgen and scancode), in both SPDX and CycloneDX, from 1162 GitHub repositories.
Results
This table shows average results across all 6 tools; all results are at the package level. Note that the results for information about the software itself are quite poor: for instance, 89.59% of repositories contain licenses, while only a minority are identified.
| Attr. | pkg_name | version | author | purl | license | copyright |
|---|---|---|---|---|---|---|
| Compliance | 79.61% | 74.99% | 17.84% | 67.53% | 32.34% | 14.17% |
| Consistency | 18.44% | 22.24% | 0.11% | 24.99% | 2.12% | - |
| Accuracy | 25.81% | 10.66% | 4.94% | - | 10.66% | - |
The findings indicate that while SBOM tools fully (100%) support the mandatory requirements of the standards, their support for use cases is 49.37%, and consistency within these supported use cases averages 17.63%. Accuracy assessments reveal significant discrepancies, with accuracy rates of 8.62%, 25.81%, and 12.3% across three defined layers, underscoring substantial room for improvement within the SBOM ecosystem.
Suggestions
- In component sections, some tools record the package `name` together with its information source (pip, maven, npm, etc.), while others do not. For `version`, tools vary in how they record the value, for example whether they add a 'V' before the `version` string. This causes problems when using SBOMs from different SBOM tools. We suggest requiring tools to specify the pattern they use to record information wherever the standard gives no explicit specification.
- The meaning of `NOASSERTION`, `NONE` and `None` can be confusing in specific data fields. For instance, `version` can naturally be empty when developers did not record it in the software; tools turn such empty values into an empty string or one of the three forms, which can lead to inconsistency in later exchange. We suggest providing specific markers for these naturally empty data fields.
- For hashes, we found that different tools using the same hash algorithm on the same single file produce different checksums in SPDX, and there are not even consistent checksums across the software and packages. In CycloneDX, `hashes` does not even specify the object on which the hash is computed. We suggest requiring tools to explicitly describe how they create checksums, e.g. salt value or other preprocessing.
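
To illustrate the first two suggestions, here is a rough Python sketch of the normalization a consumer might currently need before comparing SBOMs from different tools. The `pip:`-style prefix format, the set of empty markers, and all function names are assumptions made for illustration; they are not defined by SPDX or CycloneDX, nor taken from any of the assessed tools.

```python
from typing import Optional

# Assumed ecosystem prefixes that some tools prepend to package names.
# The "pip:" style is hypothetical; real tools may use other patterns.
ECOSYSTEM_PREFIXES = ("pip:", "maven:", "npm:")

# Forms observed for "no value"; all are treated as empty in this sketch.
EMPTY_MARKERS = {"", "NOASSERTION", "NONE", "None"}


def normalize_empty(value: Optional[str]) -> Optional[str]:
    """Map None, empty strings, and the three marker forms to None."""
    if value is None or value.strip() in EMPTY_MARKERS:
        return None
    return value.strip()


def normalize_name(name: Optional[str]) -> Optional[str]:
    """Drop an assumed ecosystem prefix such as 'pip:' from a package name."""
    cleaned = normalize_empty(name)
    if cleaned is None:
        return None
    for prefix in ECOSYSTEM_PREFIXES:
        if cleaned.lower().startswith(prefix):
            return cleaned[len(prefix):]
    return cleaned


def normalize_version(version: Optional[str]) -> Optional[str]:
    """Strip a leading 'v' or 'V' only when it is followed by a digit."""
    cleaned = normalize_empty(version)
    if cleaned is None:
        return None
    if len(cleaned) > 1 and cleaned[0] in "vV" and cleaned[1].isdigit():
        return cleaned[1:]
    return cleaned
```

With these helpers, `normalize_name("pip:requests")` and `normalize_name("requests")` compare equal, as do `normalize_version("V1.2.3")` and `normalize_version("1.2.3")`. Collapsing `NOASSERTION` and `NONE` into a single empty value discards the distinction SPDX draws between them, which is one reason an explicit marker defined by the standard would be preferable to this kind of consumer-side guessing.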
We hope our findings can help improve the SBOM ecosystem. Any questions or discussion are welcome.