Describe the bug
I ran the load_sbom pipeline using this input:
https://www.python.org/ftp/python/3.13.9/Python-3.13.9.tgz.spdx.json
The SBOM has a lot of problems including:
- every file is listed in relationships to embedded packages, which creates an SBOM of 93,000+ lines. Not sure we can do much about that.
- the lack of package type and PURLs for some generic packages, which we then report with an "unknown" type, as in "pkg:unknown/[email protected]". We could do better here, especially when there are download URLs: we could create a proper "generic" PURL instead.
- pip's vendored dependencies are reported, but the SBOM does not say that they are patched, and the download URLs may be misleading. I would be surprised if Python 3.13's pip 25.2 depended on a msgpack binary wheel built for macOS.
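
To illustrate the "generic" PURL idea from the list above, here is a minimal sketch, not ScanCode.io's actual code. It assumes the package is an SPDX 2.x JSON package dict with the standard `name`, `versionInfo` and `downloadLocation` fields. The helper name `generic_purl_from_spdx_package` is hypothetical, and keeping `/` and `:` unescaped in the `download_url` qualifier is an assumption based on the examples in the PURL specification.

```python
from urllib.parse import quote

# SPDX 2.x placeholder values that mean "no usable download location".
NO_LOCATION = {"", "NOASSERTION", "NONE"}


def generic_purl_from_spdx_package(package):
    """Hypothetical helper: build a pkg:generic PURL from an SPDX 2.x package dict."""
    name = package.get("name")
    if not name:
        # A PURL requires a name; give up rather than invent one.
        return None

    purl = "pkg:generic/" + quote(name, safe="")

    version = package.get("versionInfo")
    if version:
        purl += "@" + quote(version, safe="")

    download_url = package.get("downloadLocation") or ""
    if download_url not in NO_LOCATION:
        # Assumption: leave "/" and ":" readable, as in the PURL spec examples.
        purl += "?download_url=" + quote(download_url, safe="/:")

    return purl


# Made-up package data, for illustration only.
example = {
    "name": "example-lib",
    "versionInfo": "1.2.3",
    "downloadLocation": "https://example.org/downloads/example-lib-1.2.3.tar.gz",
}
print(generic_purl_from_spdx_package(example))
```

A fallback like this would only apply when the SBOM gives no package type or PURL of its own. Whether the download URL can be trusted is a separate question, as the pip vendoring case shows.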
Overall, there are issues in the SPDX itself, but there are also issues with what we do with it.
For reference, attached are the SPDX, its export to CycloneDX from SCIO and also its re-export to SPDX from SCIO:
Python-3.13.9.tgz.spdx.json
scancodeio_sbom-round-trip_results-2025-10-21-12-59-02.cdx.json
scancodeio_sbom-round-trip_results-2025-10-21-13-44-58.spdx.json
System configuration
- Which version of ScanCode.io are you running? 35.4.0
- Are you running the app using Docker? yes
- On which OS? linux
- What inputs are you using? see above
- Which pipeline are you running? load_sbom