Commit ac85b85

rfd: support CNAs reporting affected artifacts
This RFD introduces support for CNAs to report artifacts affected by a vulnerability by adding a new "affectedArtifacts" field to the "cnaPublishedContainer". The new field is an array of objects, with each object identifying a single artifact, potentially with multiple identifiers per artifact.

Signed-off-by: Andrew Lilley Brinker <[email protected]>
1 parent 27044cd commit ac85b85

1 file changed: 366 additions and 0 deletions

# Reporting Affected Artifacts in CVE

| Field | Value |
|:-----------------|:-------|
| RFD Submitter | Andrew Lilley Brinker |
| RFD Pull Request | [RFD #0000](#) |

## Summary
[summary]: #summary

Today, CVE supports identifying affected products or packages using three
"identifier-like" constructs, with one more proposed in RFD #2, "Supporting
Package URLs in CVE". They are:

- CPE, Common Platform Enumeration
- Vendor and product names, provided as a pair
- Collection URL and package names, provided as a pair
- (If accepted) Package URLs, also called "purls"

While these coarse-grained identifiers are great for identifying affected
products or packages, they are insufficiently granular for identifying
_affected artifacts_. This makes it difficult for CNAs to report fine-grained
applicability information when they otherwise could.

For example, a CNA may know that specific binaries they build and ship to users
are affected by a vulnerability. Today, there is not a clear, structured
mechanism for reporting identifiers for these affected binaries to CVE
consumers.

This RFD proposes introducing support for reporting affected artifacts, by
adding a new optional `affectedArtifacts` field to `containers.cna`, which
would contain an array of objects specifying identifiers for artifacts affected
by a vulnerability.

## Problem Statement
[problem-statement]: #problem-statement

While CVE records today can contain substantial information about affected
products or packages, there isn't a clear and structured way to report
information about specific artifacts affected or not affected by a
vulnerability.

This deficiency means CNAs who publish artifacts (such as prebuilt binaries,
archive files such as `.zip`s or `.tar.gz`s, script files, or configuration
files) lack a means to communicate when those artifacts are known to be
vulnerable or known not to be vulnerable.

For vulnerability managers, reacting to vulnerability disclosures with
coarse-grained identifiers for affected software requires maintaining accurate
software inventories, whether through Software Bills of Material, package
manifests (such as `package.json` or `Cargo.toml`), lockfiles (such as
`package-lock.json` or `Cargo.lock`), or other means. Without some method for
tracking what software is deployed in a production system, vulnerability
managers may struggle to turn identifiers provided in a CVE record into a clear
determination of applicability, and therefore also to respond quickly to
vulnerabilities when they're disclosed. Reducing the time-to-react for
vulnerability managers is a clear equity for the CVE program.

## Proposed Solution
[proposed-solution]: #proposed-solution

The presence of artifact identifiers in CVE Records would give vulnerability
managers an additional mechanism for identifying applicable vulnerabilities.
For example, a hash of a known-vulnerable binary could be searched for on
production systems in addition to any deployed software inventories.

Artifact identifiers also have the benefit of a low false-positive rate.
Coarse-grained identifiers for products or packages may be decomposed further
with additional fields for objects in the `affected` array, such as `platforms`,
`versions`, `programFiles`, `programModules`, and more. These fields, and the
potential for ambiguity or complexity in checking many of them, mean that
applicability decisions based on coarse-grained identifiers can easily become
complex and require human intervention to assess, and may even remain uncertain
_despite_ human intervention.

By comparison, identifiers for affected artifacts, which are often based on
hashing file contents, are unlikely to produce false positives. The nature of
cryptographic hashing algorithms is that they are generally resistant to
engineering collisions, with properties such as collision resistance, preimage
resistance, and second-preimage resistance. The result of these properties is
that if a vulnerability manager finds a file in their system whose artifact
identifier matches an artifact identifier provided in a CVE Record, that
manager can act quickly with high confidence that the match is correct.

Artifact identifiers have an additional benefit, because of their low
false-positive rate and content-based construction, of being easy to automate
and check at scale.
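
As a rough illustration of why content-based identifiers are easy to compute and check at scale, the Python sketch below derives both identifier forms from a file's bytes. It assumes that an OmniBOR Artifact ID with the `gitoid:blob:sha256:` prefix is the SHA-256 of the Git blob encoding of the contents; the OmniBOR specification is authoritative and may impose rules not shown here. The function names are hypothetical and not part of this proposal.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    # Plain SHA-256 of the artifact contents, matching the `sha256` field format.
    return hashlib.sha256(data).hexdigest()


def omnibor_artifact_id(data: bytes) -> str:
    # ASSUMPTION: the gitoid hashes Git's blob encoding, i.e. the header
    # b"blob <decimal length>" followed by a NUL byte, then the raw contents.
    header = b"blob " + str(len(data)).encode("ascii") + b"\x00"
    digest = hashlib.sha256(header + data).hexdigest()
    return "gitoid:blob:sha256:" + digest


if __name__ == "__main__":
    contents = b"hello"
    print(sha256_hex(contents))
    print(omnibor_artifact_id(contents))
```

Because the two identifiers hash different byte sequences, the same file yields two different hex digests, which is why a record may carry both.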

The following is the actual proposed change for the Record Format:

### Add an `affectedArtifacts` field

Add an `affectedArtifacts` field to the `cnaPublishedContainer` object, found
at the path `containers.cna` within a CVE Record. This new field would be an
array containing `affectedArtifact` objects. The specific edits to the schema
would be as follows:

First, the introduction of the `affectedArtifacts` field within the
`cnaPublishedContainer` object:

```json
"affectedArtifacts": {
    "type": "array",
    "description": "List of affected artifacts.",
    "minItems": 1,
    "items": {"$ref": "#/definitions/affectedArtifact"}
}
```

Second, the definition of the `affectedArtifact` type within the "definitions"
portion of the schema:

```json
"affectedArtifact": {
    "type": "object",
    "description": "Provides information about a specific artifact affected by a vulnerability.",
    "allOf": [
        {
            "description": "An identifier-like field, to identify the artifact.",
            "anyOf": [
                {"required": ["omniborArtifactID", "omniborArtifactType"]},
                {"required": ["sha256"]}
            ]
        },
        {
            "description": "The status of the artifact.",
            "anyOf": [
                {"required": ["status"]}
            ]
        }
    ],
    "properties": {
        "omniborArtifactID": {
            "type": "string",
            "pattern": "^gitoid:blob:sha256:[0-9a-f]{64}$",
            "description": "The OmniBOR Artifact ID of the artifact to be matched against.",
            "examples": [
                "gitoid:blob:sha256:9f64df92367881be21e23567a31a8ce01994d98b69d28917b5c132ce32a8e6c8",
                "gitoid:blob:sha256:09c825ac02df9150e4f93d12ba1da5d1ff5846c3e62503c814aa3a300c535772",
                "gitoid:blob:sha256:230f3515d1306690815bd9c3da0d15d8b6fcf43894d17100eb44b6d329a92f61"
            ]
        },
        "omniborArtifactType": {
            "type": "string",
            "enum": ["artifact", "buildInput"],
            "description": "Specifies how consumers of the Artifact ID should search for matches. If the type is 'artifact', then the Artifact ID is identifying an artifact which should be searched for directly (for example, within a file system by matching against Artifact IDs for files). If the type is 'buildInput', then the Artifact ID is identifying a build input, and consumers should match the Artifact ID against IDs found in OmniBOR Input Manifests for their software."
        },
        "sha256": {
            "type": "string",
            "pattern": "^[a-f0-9]{64}$",
            "description": "The SHA-256 hash of the artifact.",
            "examples": [
                "68e656b251e67e8358bef8483ab0d51c6619f3e7a1a9f0e75838d41ff368f728",
                "2cc620f8a156b986806bc2757c0572d978d8cbfc4d25f0dfa7c552291bf68279",
                "97272dc1b6ac7ca84735b797b4a04233b17fd55707f9c728fc3747e3f935f02c"
            ]
        },
        "status": {
            "description": "The vulnerability status of the identified artifact.",
            "$ref": "#/definitions/status"
        },
        "version": {
            "description": "The version applicable to the identified artifact, if relevant.",
            "$ref": "#/definitions/version"
        },
        "versionType": {
            "type": "string",
            "description": "The version numbering system used for specifying the range. This defines the exact semantics of the comparison (less-than) operation on versions, which is required to understand the range itself. 'Custom' indicates that the version type is unspecified and should be avoided whenever possible. It is included primarily for use in conversion of older data files.",
            "minLength": 1,
            "maxLength": 128,
            "examples": [
                "custom",
                "git",
                "maven",
                "python",
                "rpm",
                "semver"
            ]
        },
        "platforms": {
            "description": "List of specific platforms if the vulnerability is only relevant in the context of these platforms (optional). Platforms may include execution environments, operating systems, virtualization technologies, hardware models, or computing architectures. The lack of this field implies that the other fields are applicable to all relevant platforms.",
            "type": "array",
            "minItems": 1,
            "uniqueItems": true,
            "items": {
                "type": "string",
                "examples": ["iOS", "Android", "Windows", "macOS", "x86", "ARM", "64 bit", "Big Endian", "iPad", "Chromebook", "Docker", "Model T"],
                "maxLength": 1024
            }
        }
    }
}
```

The explanations of the fields for `affectedArtifact` objects are as follows:

- `omniborArtifactID`: An OmniBOR Artifact Identifier, used to identify either
  an artifact itself, such as a binary file, or a build input used to produce
  an artifact.
- `omniborArtifactType`: The type associated with the `omniborArtifactID` field,
  either `"artifact"` or `"buildInput"`. If `"artifact"` is used, then
  `omniborArtifactID` is the Artifact ID of an artifact itself, such as a binary
  file. If `"buildInput"` is used, then it is the Artifact ID of a build input.
  This field tells CVE consumers how to use `omniborArtifactID`. For
  artifacts, they should search their systems and/or inventories for files with
  a matching Artifact ID. For build inputs, they should search their OmniBOR
  Input Manifests for IDs which match.
- `sha256`: The SHA-256 hash of the artifact in question.
- `status`: Indicates whether the identified artifact is affected, not affected,
  or has an unknown affected status.
- `version`: The version applicable to the identified artifact, if relevant.
- `versionType`: If `"version"` is used, this indicates what type of version
  is present, and should be used by CVE consumers to validate and interpret the
  `"version"` field.
- `platforms`: A list of platforms, describing the specific platform the
  identified artifact is intended for.

Additionally, the data constraints on the `affectedArtifact` object ensure that
at least one set of identifier-like fields is present per object, and that each
object always includes a `"status"` field.
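
The constraints described above can be sketched in plain Python, without a JSON Schema library. This is only an approximation of the proposed schema for illustration: the function name and error strings are hypothetical, and the permitted `status` values are not checked because they are defined elsewhere in the Record Format.

```python
import re

# Patterns taken from the proposed schema; fullmatch() supplies the anchoring.
OMNIBOR_ID = re.compile(r"gitoid:blob:sha256:[0-9a-f]{64}")
SHA256_HEX = re.compile(r"[a-f0-9]{64}")
ARTIFACT_TYPES = {"artifact", "buildInput"}


def _is_match(pattern, value):
    return isinstance(value, str) and pattern.fullmatch(value) is not None


def validation_errors(entry):
    """Return a list of problems found in one affectedArtifact object."""
    errors = []

    # At least one complete set of identifier-like fields must be present.
    has_omnibor = "omniborArtifactID" in entry and "omniborArtifactType" in entry
    if not (has_omnibor or "sha256" in entry):
        errors.append("no complete identifier")

    # A status is always required.
    if "status" not in entry:
        errors.append("missing status")

    # Any identifier that is present must be well-formed.
    if "omniborArtifactID" in entry and not _is_match(OMNIBOR_ID, entry["omniborArtifactID"]):
        errors.append("malformed omniborArtifactID")
    if "omniborArtifactType" in entry and entry["omniborArtifactType"] not in ARTIFACT_TYPES:
        errors.append("unknown omniborArtifactType")
    if "sha256" in entry and not _is_match(SHA256_HEX, entry["sha256"]):
        errors.append("malformed sha256")

    return errors
```

Note that an object carrying `omniborArtifactID` without `omniborArtifactType` (and without `sha256`) is rejected, since the two OmniBOR fields only count as an identifier when provided together.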

Note that identifiers found in the same `affectedArtifact` object should be
interpreted as synonyms, identifying the same artifact. For example, an entry
in the `affectedArtifacts` array which contains both an `omniborArtifactID`
and `sha256` value should be interpreted as identifying only _one artifact_,
for which either identifier is valid. The presence of multiple identifiers is
intended only to make matching easier for CVE consumers by providing them with
options which may be more convenient depending on what identifiers or tooling
the consumer has available in their systems to support matching.
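
One way a consumer might apply this "synonym" rule is sketched below. The function names are hypothetical, the file's identifiers are assumed to have been computed already, and `manifest_ids` stands in for whatever set of IDs the consumer has extracted from their OmniBOR Input Manifests (parsing those manifests is not shown).

```python
def entry_matches_file(entry: dict, file_sha256: str, file_artifact_id: str) -> bool:
    """True if any identifier in the entry matches the given file."""
    # Either identifier is sufficient, since both name the same artifact.
    if entry.get("sha256") == file_sha256:
        return True
    # An OmniBOR ID is compared against files only when it identifies an
    # artifact directly, not when it identifies a build input.
    return (
        entry.get("omniborArtifactType") == "artifact"
        and entry.get("omniborArtifactID") == file_artifact_id
    )


def entry_matches_build_inputs(entry: dict, manifest_ids: set) -> bool:
    """True if the entry names a build input recorded in the consumer's manifests."""
    return (
        entry.get("omniborArtifactType") == "buildInput"
        and entry.get("omniborArtifactID") in manifest_ids
    )
```

A consumer with only SHA-256 tooling can still match an entry that carries both identifiers, which is the convenience the paragraph above describes.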

### Use of this as a template for future identifiers

This proposal is intended as a template for the introduction of more
fine-grained identifier types intended for identifying artifacts in the future.
Specifically, future identifiers should be added as new fields within the
`affectedArtifact` object inside the `affectedArtifacts` array.

### Vendoring of the relevant specifications

To ensure consistency about new identifier types added, the CVE project
should "vendor," meaning maintain its own public copy of, any relevant
specifications when those specifications are not versioned upstream.

## Examples
[examples]: #examples

The following is an example `affectedArtifacts` field, identifying three
binaries, one for each of Windows, macOS, and Linux systems on x86:

```json
"affectedArtifacts": [
    {
        "omniborArtifactID": "gitoid:blob:sha256:9f64df92367881be21e23567a31a8ce01994d98b69d28917b5c132ce32a8e6c8",
        "omniborArtifactType": "artifact",
        "sha256": "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824",
        "status": "affected",
        "version": "0.18.1",
        "versionType": "semver",
        "platforms": ["macOS", "x86"]
    },
    {
        "omniborArtifactID": "gitoid:blob:sha256:4043df92367881be21e23567a31a8ce01994d98b69d28917b5c132ce32a8e6c8",
        "omniborArtifactType": "artifact",
        "sha256": "40414dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824",
        "status": "affected",
        "version": "0.18.1",
        "versionType": "semver",
        "platforms": ["Windows", "x86"]
    },
    {
        "omniborArtifactID": "gitoid:blob:sha256:ccc4df92367881be21e23567a31a8ce01994d98b69d28917b5c132ce32a8e6c8",
        "omniborArtifactType": "artifact",
        "sha256": "ddd24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824",
        "status": "affected",
        "version": "0.18.1",
        "versionType": "semver",
        "platforms": ["Linux", "x86"]
    }
]
```

## Impact Assessment
[impact-assessment]: #impact-assessment

The addition of this new field would enable CNAs to report affected artifacts,
such as known-vulnerable prebuilt binaries shipped for versions of software
affected by a vulnerability, and would be complementary to the existing ability
in the Record Format to identify affected products and packages.

For CVE consumers, the addition of this field would provide the ability to
search for the presence of known-vulnerable artifacts in their systems when
reported by CNAs.

## Compatibility and Migration
[compatibility-and-migration]: #compatibility-and-migration

This would be a minor change, as the addition of new optional fields is
considered non-breaking.

CVE consumers could, if they wanted, gain the benefit of the new field by
updating their consumption logic to recognize the field and make use of its
contents. CVE consumers would only be broken if they incorrectly assume in their
consumption logic that no new optional fields will ever be added to the
`cnaPublishedContainer` object.
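
A consumer that tolerates the field being absent might read it as in the following sketch. The helper name is hypothetical; the only assumption about record layout is the `containers.cna` path described in this RFD.

```python
def affected_artifacts(record: dict) -> list:
    """Return the affectedArtifacts entries of a CVE Record, or [] if none."""
    # Records published before the field existed simply lack the key, so
    # each lookup falls back to an empty default instead of raising.
    cna = record.get("containers", {}).get("cna", {})
    return cna.get("affectedArtifacts", [])


if __name__ == "__main__":
    old_record = {"containers": {"cna": {"affected": []}}}
    print(affected_artifacts(old_record))
```

Consumption logic written this way behaves identically for records with and without the new field, which is what makes the change non-breaking for it.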

## Success Metrics
[success-metrics]: #success-metrics

The success of this proposal will depend on the adoption of the new field,
and the degree to which the new field provides value for CVE consumers.

CNA adoption can be measured in reported CVEs. After a 6-month period from the
publication of the first version to include the new field, the QWG must assess
the prevalence of the new field in CVEs published during those 6 months. If the
new field is present in at least 5% of new CVEs, this RFD will be considered
successful and the new field will not be rolled back.
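
The adoption measurement could be computed along these lines. This is a sketch only: it reads the 5% figure as a minimum (an assumption), the function names are hypothetical, and it takes the records published in the assessment window as already-parsed dictionaries.

```python
def adoption_rate(records) -> float:
    """Fraction of records whose CNA container has a non-empty affectedArtifacts."""
    records = list(records)
    if not records:
        return 0.0
    with_field = sum(
        1
        for record in records
        if record.get("containers", {}).get("cna", {}).get("affectedArtifacts")
    )
    return with_field / len(records)


def meets_threshold(records, threshold: float = 0.05) -> bool:
    # ASSUMPTION: success means the rate is at or above the threshold.
    return adoption_rate(records) >= threshold
```

For example, 2 records with the field out of 20 published gives a rate of 0.1, which would meet the threshold.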

CVE may consider making inclusion of affected artifacts a requirement for CNA
recognition with the Enrichment Recognition List.

Measuring use by CVE consumers is a significantly larger challenge. A potential
path would be to interview vulnerability management tool vendors, since many of
these ingest and process the CVE list. Enquiring as to the role affected
artifacts play in their processes would provide a strong indication of the value
these identifiers provide. Of course, it will take vendors some time to adjust
their processes. As such, the measure might be to look for at least two vendors
using the new software identifier formats within a year of the adoption of the
new formats.

## Supporting Data or Research
[supporting-data-or-research]: #supporting-data-or-research

Demand for OmniBOR was identified specifically in the most recent CVE user
survey, with positive demand shown in Question 16, with the strongest demand
shown from self-identified data aggregators and integrators.

More generally, demand for identifying affected artifacts in CVE is unclear.
Beyond the question on future priorities, which included OmniBOR, there were no
specific questions in the survey around demand for identifying affected
artifacts.

That said, this lack of support has been identified as a gap in discussions
among the QWG, and there is interest in addressing it, whether through this
proposal or a future alternative proposal.

## Related Issues or Proposals
[related-issues-or-proposals]: #related-issues-or-proposals

None identified.

## Recommended Priority
[recommended-priority]: #recommended-priority

Medium

## Unresolved Questions
[unresolved-questions]: #unresolved-questions

There are no remaining unresolved questions.

## Future Possibilities
[future-possibilities]: #future-possibilities

More identifier types may be desirable to add in the future. Any question of
what those types may be, or what they may look like within the CVE Record
Format, is not addressed here.
