AbstractPackage (was: Software Component) #1044
Signed-off-by: Alexios Zavras (zvr) <zvr+git@zvr.gr>
goneall
left a comment
I was expecting to see several properties in the Component class that would be shared with the Packages which were instances of the Component. I'll need a bit more context on how this would work in practice - I'll go back and re-look at the presentation referenced.
|
I noticed there is an |
In looking back at the presentation, most of the examples are relationships (e.g. license information), the exception being the If we do add these as properties, we should specify the precedence if both the |
|
Thanks for the comments, @goneall. I'll try to answer everything in a single comment.
Yes, this is needed, because a This is even more important in cases where some of the meta-information changes according to the version. The typical example is OpenSSL, where all versions 1.x are under the OpenSSL license, while all versions 3.x are under Apache-2.0. Let me upload a couple of diagrams I have, showing examples. They show different classes by shapes and different relationships by arrow line style.
Ah, you are correct. I was under the impression that everything was relationships; I completely forgot that we ended up with properties for the texts. And I did not see them in the
No, I don't think this would be a good idea, since the whole point of the
You are absolutely correct. The precedence rule is that every attribute of a more specific entity overwrites attribute values of a more general entity. This way, property values of a |
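The precedence rule stated here (the more specific entity wins) can be sketched in a few lines. This is only an illustration under stated assumptions: the `Node` class, its `parent` link, the `resolve` helper, and all property names and values are hypothetical and are not part of the SPDX model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for a Package or a more general entity."""
    name: str
    properties: dict = field(default_factory=dict)
    parent: Optional["Node"] = None  # the more general entity, if any


def resolve(node, prop):
    """Return the value set by the most specific entity that defines `prop`.

    Walks from the given node towards more general entities and stops at
    the first one that has a value; returns None if nobody defines it.
    """
    current = node
    while current is not None:
        if prop in current.properties:
            return current.properties[prop]
        current = current.parent
    return None


# Illustrative data only.
curl = Node("curl", {"homepage": "https://curl.se", "license": "curl"})
curl_891 = Node("curl 8.9.1", {"version": "8.9.1"}, parent=curl)
vendor_build = Node(
    "curl 8.9.1 (vendor build)",
    {"homepage": "https://example.org/curl"},
    parent=curl_891,
)

print(resolve(vendor_build, "homepage"))  # value set on the specific entity
print(resolve(vendor_build, "license"))   # inherited from the general entity
```

In this sketch a value on the specific entity always shadows the general one, and lookups fall through any number of levels until a value is found.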
|
Here are a couple of example diagrams. They show different classes by shapes and different relationships by arrow line style. They also illustrate two different approaches, one using a single The first one is about Curl: two versions, released by two suppliers (so four The diagram shows that the licensing information, for example, can be related to the abstract Looking at OpenSSL, the situation is more complicated, since the license is not the same for all versions. So, two more In this example, only the There are pros and cons for both approaches (one or two Apologies for re-using old diagrams here due to lack of time. |
|
Thanks @zvr for the diagrams and descriptions - I have no more questions and the proposal makes sense to me. The challenge will be documenting this in a way readers of the spec can easily understand. Perhaps we can add some more text about the expected relationship types in the |
|
Thanks, @goneall. And now that I think about it, we might add some more text that this is not typically expected to be in SBOMs. SBOMs are always about specific Unless your SBOM contains many "Curl" packages, for example, it is not to your advantage to complicate things.
Signed-off-by: Alexios Zavras (zvr) <zvr+git@zvr.gr>
Signed-off-by: Alexios Zavras (zvr) <zvr+git@zvr.gr>
|
From the 5 August 2025 tech call, there were 3 high level issues to be followed-up on:
|
Three possible solutions were raised during the 5 August 2025 tech call:
|
|
From tech call on 2 Sept 2025:
|
|
My only thought is that the class name Component might be better off as SoftwareComponent. Even if it is redundant to have Software, it will be easier to see on the model image and to understand.
|
@stevenc-stb I also find it redundant, but from the discussion today it seems that other areas refer to components. I'll change it -- this will not be the only case where we have redundancy in naming... |
Signed-off-by: Alexios Zavras (zvr) <github@zvr.gr>
|
@JPEWdev you are correct that, because our Relationships have one However, since our Elements (and therefore the Relationships) are immutable, think of what will happen in a typical flow:
If the links go upwards, you would need to add a new If the links go downwards, you know you have to create a new Relationship from the new curl version to the existing "curl". In both ways ((b) in the upwards arrows case or in downwards arrows) you end up with a new Relationship with a single Does this make sense? @goneall , your thoughts? |
Yes. And that is exactly our use case. We (yocto) would never actually use the "fan-out" of having multiple |
|
And also there are plenty of examples of the relationship not necessarily matching the "conceptual" direction of the arrow |
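To make the immutability argument in this exchange concrete, here is a minimal sketch. It assumes a Relationship with one source and any number of targets; the class, the field names, and the relationship type strings `hasInstance` and `instanceOf` are all hypothetical placeholders, since the actual RelationshipType name is still under discussion.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Relationship:
    """Hypothetical immutable relationship: one source, many targets."""
    source: str
    rel_type: str
    targets: tuple


# Link that starts at the abstract entity and fans out to known packages.
fan_out = Relationship("curl", "hasInstance", ("curl 8.9.0", "curl 8.9.1"))


def link_from_abstract(new_package):
    """The existing fan-out cannot be edited, so a new version needs a
    separate single-target relationship (or a full replacement)."""
    return Relationship("curl", "hasInstance", (new_package,))


def link_from_package(new_package):
    """Link that starts at the package: always one new relationship."""
    return Relationship(new_package, "instanceOf", ("curl",))


try:
    fan_out.targets = fan_out.targets + ("curl 8.10.0",)
except FrozenInstanceError:
    print("existing relationship cannot be extended in place")

print(link_from_abstract("curl 8.10.0"))
print(link_from_package("curl 8.10.0"))
```

Either direction ends up adding a relationship with a single target once a new version appears, which is the point made above; the direction that starts at the package simply makes that the only option.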
|
OK, I don't mind reversing the direction of the Relationship. As I wrote above, we already have the RelationshipType for "curl 8.9.1" — How should we name the RelationshipType name for "curl" — (if we were talking about arbitrary groupings, we might even say |
Signed-off-by: Alexios Zavras (zvr) <github@zvr.gr>
|
@zvr I don't think I have a particular concern about the idea of having "Component" as a generic concept of a collection of non-versioned / undifferentiated related Packages. That said, I guess I'm not clear about which Package-related fields or relationships would be appropriate to link to Components. Looking at licenses (whether declared or concluded) and using OpenSSL as you mentioned, for example: I'm not sure it really makes sense to have metadata saying "what is the license of the general concept of OpenSSL?" Depending on which specific version of OpenSSL you mean, it could be OpenSSL or it could be Apache-2.0. Given that, I would recommend that non-versioned, non-specific Components should not be used in connection with the declared license / concluded license values (again, assuming I'm understanding it correctly). |
|
If we want to have a "has license" relationship between SoftwareComponent and a license,
Currently the proposed SoftwareComponent is a subclass of Element. |
|
Ah, but @swinslow did you see the second diagram I put in the comment above? As you say, there is no way to associate a single license with "OpenSSL". But one can associate the As we know, license changes on version changes do not happen very frequently, so the general case (SoftwareComponent "Curl" associated with the |
|
@bact you are correct; the descriptions of these RelationshipTypes will have to be updated. |
|
@zvr - Can you update the descriptions per your above comment? |
|
@zvr - can you help me understand when we should be using component vs. package in SBOMs? Possibly the package definition should be updated to make it clear if you want to go forward with this. |
|
@kestewart the main use case is not in SBOMs; in this case, in a specific software release you have a specific But in the (graph) data that you keep, for all your software, you have many Specifically for SBOMs, @JPEWdev mentioned in a tech call that this would save a lot of space in theirs (IIRC). |
|
@JPEWdev - have your comments been addressed? do you approve? |
|
ping @JPEWdev, you good with this now? Alexios, I'd rather this go into 3.1-rc2 after we have more eyes on the differences between Package and Component. I'm still worried we'll have people getting confused. I also want to make sure that we have considered the Hardware Components (as well as the other profiles) and how they should interact. Hardware Component is a common term, and restricting "Component" to just software could become problematic.
Signed-off-by: Arthit Suriyawongkul <arthit@gmail.com>
|
During today's spdx-tech meeting @kestewart raised this PR for awareness. There was some preliminary discussion around this concept of "component" conflicting with definitions and concepts in existing standards. Here are some definitions that SPDX must contend with: From SEVOCAB standards search for 'component':
It seems to me the definitions above equate "component" to "composable unit". These units will always have versions with them, have specified interfaces to them, and are verifiable. On a quick read over the top-level comments of this PR it seems like the goal is to support extracting/refactoring subsets of the SPDX information (data, metadata, etc.) to reduce the size of the SPDX file. It is a worthwhile goal that applies to HW as well as SW. Please notice how many times a resistor (electrical component) of the same specification appears on a circuit board. ;-) AFAICT, we need to:
Perhaps this only requires using a different term? |
I disagree that they "will always have versions". As the definition that was pasted says, they are to be considered at a particular level of analysis. There are definitely use cases to analyze systems without caring about the exact version of all the software pieces (license analysis being the most common one). |
That's an interesting observation. Please consider that an analysis like that covers a range of implementations (or rather, a set of versions) that may be valid for integration. I believe each implementation will have a specific version (even if it is only identified by a SHA). This unique identification is part of the information that must be included in the SBOM. |
|
@gregshue I completely agree. The proposed |
The chain may continue further to more AbstractPackages,
as long as there are "parent" AbstractPackage and no values have been specified.

Every Package should be an instance of no more than one AbstractPackages.
Suggested change:
- Every Package should be an instance of no more than one AbstractPackages.
+ Every Package shall be an instance of no more than one AbstractPackage.
If we want this to be a requirement, use "shall".
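If the rule does become a "shall", it can be checked mechanically. A minimal sketch, assuming the instance links are available as (package, abstract package) pairs; the function name and all identifiers below are hypothetical.

```python
from collections import defaultdict


def packages_with_multiple_abstracts(instance_links):
    """Return packages linked to more than one abstract package.

    `instance_links` is an iterable of (package_id, abstract_id) pairs.
    Repeating the same pair is harmless; only distinct targets count.
    """
    targets = defaultdict(set)
    for package_id, abstract_id in instance_links:
        targets[package_id].add(abstract_id)
    return {
        package_id: sorted(abstract_ids)
        for package_id, abstract_ids in targets.items()
        if len(abstract_ids) > 1
    }


links = [
    ("openssl-3.0.1-ubuntu", "OpenSSL 3.x"),
    ("openssl-3.1.1-debian", "OpenSSL 3.x"),
    ("openssl-3.1.1-debian", "OpenSSL"),      # second link violates the rule
    ("curl-8.9.1", "curl"),
    ("curl-8.9.1", "curl"),                   # duplicate pair, not a violation
]

print(packages_with_multiple_abstracts(links))
```

An empty result means every package is an instance of at most one abstract package. Links between abstract packages themselves (the "parent" chain) would be kept out of the input in this sketch.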


This introduces the idea of "Software Components", an abstract view of pieces of software.
The existing Package class records information about specific software packages, such as "OpenSSL v3.0.1 distributed by Ubuntu" or "OpenSSL v3.1.1 distributed by Debian." However, when storing data, this approach can lead to redundancy and inefficiencies, particularly when dealing with licensing information and other metadata that is common across multiple versions and distributions of the same software. This PR introduces the concept of a Component as an abstract reference to a piece of software, distinct from a Package, which as mentioned before represents a specific version of the software distributed by a particular supplier.
By adding this distinction, there is now a way to record relationships between different parts of the software ecosystem. For example, both "OpenSSL v3.0.1 distributed by Ubuntu" and "OpenSSL v3.1.1 distributed by Debian" Packages can be linked to the abstract Component "OpenSSL". This relationship-based approach not only enhances the clarity and organization of SPDX data but also leads to significant storage savings. Common information, such as the licensing terms of a component, can be stored once and referenced across multiple packages, eliminating redundancy.
For more information and some real-world numbers on the efficiency gains, one can see a presentation in this year's FOSDEM SBOM devroom.
This PR adds a new class named Component in the Software profile and a new RelationshipType to be used for expressing these relationships. No new properties are added; the new class re-uses some properties already present.
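The storage argument can be illustrated with a rough count of statements. This is a sketch under stated assumptions, not a measurement: the triple layout, the `instanceOf` link name, the property keys, and the two extra package names are hypothetical, and the shared values are illustrative.

```python
def flat_statements(packages, shared_facts):
    """Every package repeats every shared fact."""
    return [
        (package, key, value)
        for package in packages
        for key, value in shared_facts.items()
    ]


def shared_statements(packages, component, shared_facts):
    """Shared facts are stated once on the Component; each package
    only carries one link to that Component."""
    links = [(package, "instanceOf", component) for package in packages]
    facts = [(component, key, value) for key, value in shared_facts.items()]
    return links + facts


packages = [
    "OpenSSL v3.0.1 distributed by Ubuntu",
    "OpenSSL v3.1.1 distributed by Debian",
    "OpenSSL v3.2.0 distributed by supplier A",  # hypothetical
    "OpenSSL v3.2.0 distributed by supplier B",  # hypothetical
]
shared_facts = {
    "declaredLicense": "Apache-2.0",
    "concludedLicense": "Apache-2.0",
    "homepage": "https://www.openssl.org",
}

flat = flat_statements(packages, shared_facts)
shared = shared_statements(packages, "OpenSSL 3.x", shared_facts)
print(len(flat), len(shared))
```

With n packages and k shared facts the flat layout needs n × k statements and the shared layout n + k, so the saving only appears once several packages share several facts. For a single package the shared layout is slightly larger, which is consistent with the remark in the thread that this is not typically expected inside a single SBOM.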