Conversation
f68d737 to
0665eaf
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #11517 +/- ##
============================================
+ Coverage 57.97% 58.04% +0.06%
Complexity 1730 1730
============================================
Files 349 351 +2
Lines 12993 13068 +75
Branches 1263 1292 +29
============================================
+ Hits 7533 7585 +52
- Misses 4995 5000 +5
- Partials 465 483 +18
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
23c2d34 to
10c38bb
Compare
...-managers/spdx/src/funTest/assets/projects/synthetic/spdx-project-cyclic-expected-output.yml
Outdated
Show resolved
Hide resolved
plugins/package-managers/spdx/src/main/kotlin/SpdxDocumentFile.kt
Outdated
Show resolved
Hide resolved
| private fun SpdxPackage.toIdentifier(type: String) = | ||
| Identifier( | ||
| type = projectType, | ||
| type = type, |
There was a problem hiding this comment.
Why would one only derive the type, but not any other component from the purl?
There was a problem hiding this comment.
I'm not trying to imply that we shouldn't derive other properties from the PURL, too. I was simply focusing on the type for now as I'm working on the topic of type mappings. But I agree that a follow-up to also derive other properties makes sense.
There was a problem hiding this comment.
I've implemented the (optional) mapping of the whole id from a PURL now to this PR. Please have another look.
|
|
||
| val purl = locateExternalReference(SpdxExternalReference.Type.Purl) | ||
| val artifact = getRemoteArtifact() | ||
| val purl = purlReference |
There was a problem hiding this comment.
commit: I'm uncertain about how much this change derails the uniqueness of IDs and find it hard to oversee potentially problematic areas. What comes to my mind is:
- No more possible to tell based on ID, from which package manager the package comes from.
- Data in SBOM for a package may differ from e.g. the package manager package denoted by the same ID. Do we want curations crafted for one being applied to the other.
- Would handling of duplicates be sufficient? In particular, how about multiple analyzed SBOMs mentioning the same purl with conflicting information? (Such cases IIUC cannot be solved with deriving the ID from the purl)
There was a problem hiding this comment.
No more possible to tell based on ID, from which package manager the package comes from.
I'm not sure I can follow the argumentation here. Are you saying that previously a Maven package described in SPDX had an ORT if of SpdxDocumentFile:maven-group:maven-artifact:maven-version, so you could tell from this id alone that this Maven package was "managed" by SPDX?
If so, I believe that's something you should not rely on anyway, as for example Maven packages manager by Gradle would also have a "Maven" id, not a "Gradle" id. So generally, you must not judge from the package type to the package manager. Instead, the package manager is reflected in the project type. So if you'd wanted to know if a Maven package was originally declared via SPDX (instead of e.g. Gradle), when you need to look at the type of the project that contains the package.
Data in SBOM for a package may differ from e.g. the package manager package denoted by the same ID.
My ultimate / ideal goal would be to have unique ids for the same packages, no matter via which package manager or build system they're being referred to. So a Maven package should always be of type "Maven", no matter whether it's being used via Maven (the build system), Gradle, or SPDX / CycloneDX.
This actually is a step toward being able to use PURLs (which also uniquely describe a package)) as primary identifiers for packages, if we ever manage to do that.
Do we want curations crafted for one being applied to the other.
Always the unique / canonical id for a package should be used in curations.
how about multiple analyzed SBOMs mentioning the same purl with conflicting information?
What kind of conflicting information exactly, as part of the PURL? Could you give a concrete example?
In general, if a user uses the same PURL for actually different packages, I'd consider this to be a user error. ("garbage in ➡️ garbage out")
There was a problem hiding this comment.
I'm not sure I can follow the argumentation here.
I did not intend to make an argumentation, in the form of something should be done in a certain way.
My intention was, to share some thought for a discussion about whether this approach works or if we can find some pitfalls. Also whether it is desireable to derive the IDs from the purls in case of the SBOM package manager. Maybe it is obvious to you and everybody else, to me it isn't, at least not 100%.
What kind of conflicting information exactly, as part of the PURL?
Any kind of metadata, say properties of an SPDX package.
This allows for some simplifications. Signed-off-by: Sebastian Schuberth <sebastian@doubleopen.org>
10c38bb to
b84683e
Compare
Split extension functions to separate files and move top-level functions to below the class. This way "TooManyFunctions" is not exceeded anymore. This requires to parameterize `toIdentifier()` as it does not have access to `projectType` anymore, but that change is anyway required by follow-up changes. Signed-off-by: Sebastian Schuberth <sebastian@doubleopen.org>
b84683e to
59c1f7a
Compare
Add an option to deduce the ORT id of a project / package from a PURL, if present. Enabling this allows to maintain e.g. package identifiers during round-trip conversions. The feature is disabled by default to maintain backward compatibility. Signed-off-by: Sebastian Schuberth <sebastian@doubleopen.org>
59c1f7a to
44a9561
Compare
Please have a look at the individual commit messages for the details.