Skip to content

SPDX package type fixes#11517

Open
sschuberth wants to merge 3 commits intomainfrom
spdx-package-type
Open

SPDX package type fixes#11517
sschuberth wants to merge 3 commits intomainfrom
spdx-package-type

Conversation

@sschuberth
Copy link
Member

Please have a look at the individual commit messages for the details.

@sschuberth sschuberth requested a review from a team as a code owner March 3, 2026 15:03
@sschuberth sschuberth enabled auto-merge (rebase) March 3, 2026 15:03
@codecov
Copy link

codecov bot commented Mar 3, 2026

Codecov Report

❌ Patch coverage is 69.33333% with 23 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.04%. Comparing base (6159101) to head (44a9561).

Files with missing lines Patch % Lines
...agers/spdx/src/main/kotlin/utils/SpdxExtensions.kt 69.69% 4 Missing and 16 partials ⚠️
...ers/spdx/src/main/kotlin/utils/StringExtensions.kt 66.66% 1 Missing and 2 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #11517      +/-   ##
============================================
+ Coverage     57.97%   58.04%   +0.06%     
  Complexity     1730     1730              
============================================
  Files           349      351       +2     
  Lines         12993    13068      +75     
  Branches       1263     1292      +29     
============================================
+ Hits           7533     7585      +52     
- Misses         4995     5000       +5     
- Partials        465      483      +18     
Flag Coverage Δ
funTest-external-tools 14.58% <56.00%> (+0.39%) ⬆️
funTest-no-external-tools 30.57% <0.00%> (-0.30%) ⬇️
test-ubuntu-24.04 42.38% <22.66%> (-0.17%) ⬇️
test-windows-2025 42.36% <22.66%> (-0.17%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@sschuberth sschuberth force-pushed the spdx-package-type branch 2 times, most recently from 23c2d34 to 10c38bb Compare March 3, 2026 16:37
@sschuberth sschuberth changed the title Spdx package type fixes SPDX package type fixes Mar 3, 2026
private fun SpdxPackage.toIdentifier(type: String) =
Identifier(
type = projectType,
type = type,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would one only derive the type, but not any other component from the purl?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not trying to imply that we shouldn't derive other properties from the PURL, too. I was simply focusing on the type for now as I'm working on the topic of type mappings. But I agree that a follow-up to also derive other properties makes sense.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've implemented the (optional) mapping of the whole id from a PURL now to this PR. Please have another look.


val purl = locateExternalReference(SpdxExternalReference.Type.Purl)
val artifact = getRemoteArtifact()
val purl = purlReference
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commit: I'm uncertain about how much this change derails the uniqueness of IDs and find it hard to oversee potentially problematic areas. What comes to my mind is:

  1. No more possible to tell based on ID, from which package manager the package comes from.
  2. Data in SBOM for a package may differ from e.g. the package manager package denoted by the same ID. Do we want curations crafted for one being applied to the other.
  3. Would handling of duplicates be sufficient? In particular, how about multiple analyzed SBOMs mentioning the same purl with conflicting information? (Such cases IIUC cannot be solved with deriving the ID from the purl)

Copy link
Member Author

@sschuberth sschuberth Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No more possible to tell based on ID, from which package manager the package comes from.

I'm not sure I can follow the argumentation here. Are you saying that previously a Maven package described in SPDX had an ORT if of SpdxDocumentFile:maven-group:maven-artifact:maven-version, so you could tell from this id alone that this Maven package was "managed" by SPDX?

If so, I believe that's something you should not rely on anyway, as for example Maven packages manager by Gradle would also have a "Maven" id, not a "Gradle" id. So generally, you must not judge from the package type to the package manager. Instead, the package manager is reflected in the project type. So if you'd wanted to know if a Maven package was originally declared via SPDX (instead of e.g. Gradle), when you need to look at the type of the project that contains the package.

Data in SBOM for a package may differ from e.g. the package manager package denoted by the same ID.

My ultimate / ideal goal would be to have unique ids for the same packages, no matter via which package manager or build system they're being referred to. So a Maven package should always be of type "Maven", no matter whether it's being used via Maven (the build system), Gradle, or SPDX / CycloneDX.

This actually is a step toward being able to use PURLs (which also uniquely describe a package)) as primary identifiers for packages, if we ever manage to do that.

Do we want curations crafted for one being applied to the other.

Always the unique / canonical id for a package should be used in curations.

how about multiple analyzed SBOMs mentioning the same purl with conflicting information?

What kind of conflicting information exactly, as part of the PURL? Could you give a concrete example?

In general, if a user uses the same PURL for actually different packages, I'd consider this to be a user error. ("garbage in ➡️ garbage out")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I can follow the argumentation here.

I did not intend to make an argumentation, in the form of something should be done in a certain way.
My intention was, to share some thought for a discussion about whether this approach works or if we can find some pitfalls. Also whether it is desireable to derive the IDs from the purls in case of the SBOM package manager. Maybe it is obvious to you and everybody else, to me it isn't, at least not 100%.

What kind of conflicting information exactly, as part of the PURL?

Any kind of metadata, say properties of an SPDX package.

@sschuberth sschuberth requested review from a team and fviernau March 4, 2026 13:08
This allows for some simplifications.

Signed-off-by: Sebastian Schuberth <sebastian@doubleopen.org>
Split extension functions to separate files and move top-level functions
to below the class. This way "TooManyFunctions" is not exceeded anymore.

This requires to parameterize `toIdentifier()` as it does not have
access to `projectType` anymore, but that change is anyway required by
follow-up changes.

Signed-off-by: Sebastian Schuberth <sebastian@doubleopen.org>
Add an option to deduce the ORT id of a project / package from a PURL,
if present. Enabling this allows to maintain e.g. package identifiers
during round-trip conversions. The feature is disabled by default to
maintain backward compatibility.

Signed-off-by: Sebastian Schuberth <sebastian@doubleopen.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants