Commit 120a0ec

First Draft of the OpenVEX Spec
This commit checks the first draft of the OpenVEX specification into the repository. Markdown links and some diagrams are missing but will be completed after the initial round of reviews. Signed-off-by: Adolfo García Veytia (Puerco) <[email protected]>
1 parent bd0f3ca commit 120a0ec

OPENVEX-SPEC.md

Lines changed: 361 additions & 0 deletions
@@ -0,0 +1,361 @@
1+
# OpenVEX Specification v0.0.0
2+
3+
## Overview
4+
5+
OpenVEX is an implementation of Vulnerability Exploitability eXchange (VEX)
designed to be lightweight and embeddable while meeting all requirements of
a valid VEX implementation as defined in the [Minimum Requirements for Vulnerability
Exploitability eXchange (VEX)](http://example.com) document published on XXX
by the VEX working group coordinated by the [Cybersecurity & Infrastructure
Security Agency](https://www.cisa.gov/) (CISA).
11+
12+
## About VEX
13+
14+
Vulnerability Exploitability eXchange (VEX) is a vulnerability document designed to complement a Software Bill of Materials (SBOM). It informs users of a software product about the impact status of a vulnerability on that product.
15+
16+
Security scanners often detect and flag components in software that have
been identified as vulnerable. Often, the software is not actually affected
as the scanner signals, for one of several reasons: the vulnerable component may
have already been patched, may not be present, or may simply never be executed.
To turn off false alerts like these, a scanner may consume VEX data from the
software supplier.
21+
22+
The extreme transparency brought by SBOMs into how software is composed will
most likely increase the number of these kinds of false positives, requiring an
automated solution to keep the signal-to-noise ratio of security scans from
collapsing. Hence VEX.
26+
27+
## The VEX Impact Statement
28+
29+
VEX centers on the notion of an _impact statement_. In short, an impact statement
30+
can be summarized as the sum of a product, a vulnerability, and a status:
31+
32+
```
impact_statement = product(s)             +  vulnerability              +  status
                   │                         │                             │
                   └ The software product    └ Typically a CVE related     └ One of the impact
                     we are talking about      to one of the product's       statuses as identified
                                               components                    by the VEX working group.
```
39+
40+
The `product` is a piece of software that can be correlated to an entry in an
41+
SBOM (see [Product](#Product) below). `vulnerability` is the ID of a security
42+
vulnerability as understood by scanners and that can be looked up in a vulnerability
43+
tracking system. `status` is one of the impact status labels defined by VEX
44+
(see [Status](#Status)).
45+
46+
Another key part of VEX is time. It matters _when_ statements are made. VEX is
47+
designed to be a sequence of statements, each overriding, but also enriching
48+
the previous ones with new information. Each impact statement has a timestamp
49+
associated with it.
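
The "later statements override earlier ones" idea can be sketched in Python. This is only an illustration, not part of the specification: the `Statement` class, the `latest_status` helper, and the `pkg:example/app` identifier are hypothetical, and the sketch assumes whole-second ISO 8601 timestamps that carry a UTC offset.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Statement:
    """Hypothetical in-memory form of one impact statement."""
    product: str
    vulnerability: str
    status: str
    timestamp: str  # ISO 8601 with offset, e.g. "2023-01-08T18:02:03-06:00"


def latest_status(statements, product, vulnerability):
    """Return the status of the most recent matching statement, or None."""
    matching = [
        s for s in statements
        if s.product == product and s.vulnerability == vulnerability
    ]
    if not matching:
        return None
    # The newest statement overrides the ones issued before it.
    newest = max(matching, key=lambda s: datetime.fromisoformat(s.timestamp))
    return newest.status


history = [
    Statement("pkg:example/app", "CVE-2023-12345", "under_investigation",
              "2023-01-08T18:02:03-06:00"),
    Statement("pkg:example/app", "CVE-2023-12345", "fixed",
              "2023-01-09T09:08:42-06:00"),
]
print(latest_status(history, "pkg:example/app", "CVE-2023-12345"))
```

Keeping the older statements around, rather than replacing them, is what lets a consumer reconstruct how the impact analysis evolved.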
50+
51+
## VEX Documents
52+
53+
A VEX document is a data structure grouping one or more impact statements.
Documents also have timestamps, which may cascade down to statements (see
[Inheritance](#Inheritance)). Documents can also be versioned.
56+
57+
### A Sample Scenario
58+
59+
As an example, consider the following evolution of a hypothetical impact analysis:
60+
61+
1. A software author becomes aware of a new CVE related to their product. Immediately,
the author starts to check whether it affects the product.
2. The investigation determines the product is affected.
3. To protect their users, the author mitigates the CVE impact via a patch or
other method before the maintainers of the vulnerable component issue a patch.
66+
67+
During these three steps, users scanning the author's software will simply get
a third-party alert with no details on how the status is evolving. Most critically,
when the software is patched (in step 3), the alert becomes a false positive.
70+
71+
With VEX, the author can issue a VEX document when the CVE is published to
inform their users that it is under investigation. In step 2, when the product is
known to be affected, the author can ship a new VEX document stating that the
product is affected, possibly with some additional advice. Finally, once the
product is patched, its SBOM can be complemented with a VEX document stating that
it is no longer affected by the CVE. Scanners could consume this document and stop
alerting about the CVE as it no longer impacts the product.
78+
79+
## OpenVEX Specification
80+
81+
### Definitions
82+
83+
#### Document
84+
85+
A data structure that groups together one or more impact statements. A document
86+
MUST define a timestamp to express when its statements were known to be true.
87+
88+
#### Encapsulating Document
89+
90+
While OpenVEX defines a self-contained document format, VEX data can often be
found embedded or incorporated in other formats. Examples include in-toto
attestations and CSAF and CycloneDX documents. "Encapsulating document"
refers to these formats that can contain VEX data.
94+
95+
#### Product
96+
97+
A logical unit representing a piece of software. The concept is intentionally
98+
broad to allow for a wide variety of use cases but generally speaking, anything
99+
that can be described in a Software Bill of Materials (SBOM) can be thought of
100+
as a product.
101+
102+
#### Status
103+
104+
The known relationship a vulnerability has with a software product. The status
105+
expresses if the product is impacted by the vulnerability or not, if the authors
106+
are investigating it, or if it has already been fixed.
107+
108+
#### Vulnerability
109+
110+
A cataloged defect in a software product. Documents SHOULD use global, well-known
111+
identifying schemes. For internal identifying schemes, the only requirement
112+
for a vulnerability to be listed in a VEX document is that it needs to have an ID
113+
string to address it. Public identifiers (such as CVE IDs) are the most
114+
common case, but private internal identifiers can be used if they are
115+
understood by all participants of the supply chain where the VEX metadata is
116+
expected to flow.
117+
118+
#### Subcomponent
119+
120+
Any component possibly included in the product where the vulnerability originates.
Subcomponents SHOULD also be software identifiers and they SHOULD also be
listed in the product SBOM. Subcomponents will most often be one or more of the
product's dependencies.
124+
125+
### Document
126+
127+
A VEX document consists of two parts: The document metadata and a collection
128+
of impact statements. Some fields in the document metadata are required.
129+
130+
OpenVEX documents are serialized as JSON-LD structs. File encoding MUST be UTF-8.
131+
132+
Here is a sample of a minimal OpenVEX document:
133+
134+
```json
{
  "@context": "https://openvex.dev/schema-ld.json",
  "@id": "VEX-9fb3463de1b57",
  "author": "Wolfi J Inkinson",
  "role": "Document Creator",
  "timestamp": "2023-01-08T18:02:03.647787998-06:00",
  "version": "1",
  "statements": [
    {
      "vulnerability": "CVE-2023-12345",
      "products": [
        "pkg:apk/wolfi/[email protected]?arch=armv7",
        "pkg:apk/wolfi/[email protected]?arch=x86_64"
      ],
      "status": "fixed"
    }
  ]
}
```
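
A consumer could read a document shaped like the sample above roughly as follows. This is a minimal sketch rather than a reference parser: `iter_impacts` is a hypothetical helper, the package URLs in `RAW` are placeholders, and the sketch assumes every statement lists its own products (no inherited data).

```python
import json

# Hypothetical document bytes; identifiers and purls are placeholders.
RAW = b"""{
  "@context": "https://openvex.dev/schema-ld.json",
  "@id": "VEX-example",
  "author": "Example Author",
  "role": "Document Creator",
  "timestamp": "2023-01-08T18:02:03-06:00",
  "version": "1",
  "statements": [
    {
      "vulnerability": "CVE-2023-12345",
      "products": [
        "pkg:apk/wolfi/example@1.0?arch=armv7",
        "pkg:apk/wolfi/example@1.0?arch=x86_64"
      ],
      "status": "fixed"
    }
  ]
}"""


def iter_impacts(raw):
    """Yield (product, vulnerability, status) triples from document bytes."""
    # The file encoding is required to be UTF-8, so decode explicitly.
    doc = json.loads(raw.decode("utf-8"))
    for statement in doc["statements"]:
        for product in statement["products"]:
            yield product, statement["vulnerability"], statement["status"]


for impact in iter_impacts(RAW):
    print(impact)
```

Flattening to one triple per product mirrors the `product(s) + vulnerability + status` shape of an impact statement and is convenient for matching against scanner findings.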
155+
156+
#### Document Struct Fields
157+
158+
The following table lists the fields in the document struct.

| Field | Required | Description |
| --- | --- | --- |
| @id | | id is the identifying string for the VEX document. This should be unique per document. |
| author | | Author is the identifier for the author of the VEX statement. Ideally a common name; it may be a URI. `author` can be an individual or organization. The author identity SHOULD be cryptographically associated with the signature of the VEX statement or document or transport. |
| role | | role describes the role of the document author. |
| timestamp | | Timestamp defines the time at which the document was issued. |
| version | | Version is the document version. It must be incremented when any content within the VEX document changes, including any VEX statements included within the VEX document. |
| tooling | | Tooling expresses how the VEX document and contained VEX statements were generated. It's optional. It may specify tools or automated processes used in the document or statement generation. |
| supplier | | An optional field specifying who is providing the VEX document. |
169+
170+
### Impact Statement
171+
172+
An impact statement is an assertion made by the document's author about the impact
173+
a vulnerability has on one or more software "products". The impact statement has
174+
three key components that are valid at a point in time: `status`, a `vulnerability`,
175+
and the `product` to which these apply (see diagram above).
176+
177+
An impact statement in an OpenVEX document looks like the following snippet:
178+
179+
```
180+
TBD
181+
```
182+
183+
#### Impact Statement Fields
184+
185+
The following table lists the fields of the impact statement struct.

| Field | Required | Description |
| --- | --- | --- |
| vulnerability | ✓ | vulnerability SHOULD use existing and well known identifiers. For example: CVE, the Global Security Database (GSD), or a supplier’s vulnerability tracking system. It is expected that vulnerability identification systems are external to and maintained separately from VEX.<br>vulnerability MAY be URIs or URLs.<br>vulnerability MAY be arbitrary and MAY be created by the VEX statement `author`. |
| vuln_description | | Optional free-form text describing the vulnerability. |
| timestamp | | Timestamp is the time at which the information expressed in the Statement was known to be true. Cascades down from the document, see [Inheritance](#Inheritance). |
| products | | Product identifiers that the statement applies to. Any software identifier can be used and SHOULD be traceable to a described item in an SBOM. The use of [Package URLs](https://github.com/package-url/purl-spec) (purls) is recommended. While a product identifier is required to have a complete statement, this field is optional as it can cascade down from the encapsulating document, see [Inheritance](#Inheritance). |
| subcomponents | | Identifiers of components where the vulnerability originates. While the statement asserts about the impact on the software product, listing `subcomponents` lets scanners find identifiers to match their findings. |
| status | | A VEX statement MUST provide the status of the vulnerabilities with respect to the products and components listed in the statement. `status` MUST be one of the labels defined by VEX (see [Status](#Status)), some of which have further options and requirements. |
| status_notes | | A statement MAY convey information about how `status` was determined and MAY reference other VEX information. |
| justification | | For statements conveying a `not_affected` status, a VEX statement MUST include a status justification informing why the product is not affected by the vulnerability. Justifications are fixed labels defined by VEX. See [Status Justifications](#status-justifications) below for valid values. |
| impact_statement | | When a product is `not_affected`, the VEX document author MAY include a statement that contains a description of why the vulnerability cannot be exploited. |
| action_statement | | For a statement with `affected` status, a VEX statement MAY include a statement that SHOULD describe actions to remediate or mitigate the vulnerability. |
| action_statement_timestamp | | The timestamp when the action statement was issued. |
201+
### Status Labels
202+
203+
Status labels convey the impact of a vulnerability on the products listed
in a statement. Security tooling such as vulnerability scanners consuming OpenVEX
documents can key on the status labels to alter their behavior when a vulnerable
component is detected. Security dashboards can provide users and auditors
with contextual data about the evolution of the vulnerability impact.
208+
209+
| Label | Description |
| --- | --- |
| `not_affected` | No remediation is required regarding this vulnerability. A `not_affected` status requires the addition of a `justification` to the impact statement. |
| `affected` | Actions are recommended to remediate or address this vulnerability. |
| `fixed` | These product versions contain a fix for the vulnerability. |
| `under_investigation` | It is not yet known whether these product versions are affected by the vulnerability. Updates should be provided in further VEX documents as knowledge evolves. |
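
As a rough illustration of how a scanner might key on these labels, consider the sketch below. The policy itself is an assumption made for the example (the specification does not prescribe scanner behavior), and `should_alert` is a hypothetical helper: `not_affected` and `fixed` suppress the alert, while `affected`, `under_investigation`, and the absence of any VEX data keep it.

```python
# Status labels defined by VEX, split by an assumed scanner policy.
SUPPRESSING = {"not_affected", "fixed"}
ALERTING = {"affected", "under_investigation"}


def should_alert(status):
    """Decide whether a scanner finding should still be reported.

    `status` is the VEX status label for the finding, or None when no
    VEX statement covers it.
    """
    if status is None:
        return True  # no VEX data: report the finding as usual
    if status in SUPPRESSING:
        return False
    if status in ALERTING:
        return True
    raise ValueError(f"unknown VEX status label: {status!r}")


for label in ("not_affected", "affected", "fixed", "under_investigation", None):
    print(label, should_alert(label))
```

Rejecting unknown labels, instead of silently suppressing or alerting, keeps a typo in a document from hiding a real finding.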
215+
216+
All of the key data points are required to form a valid impact statement, but
they are not necessarily required to be defined in the statement's data struct.
The scenarios under [Data Inheritance](#Inheritance) show how they can be
supplied from elsewhere.
219+
220+
### Status Justifications
221+
222+
When assessing risk, consumers of a `not_affected` software product can learn
why the product is not affected by reading the justification label
associated with the impact statement. These labels are predefined and machine-readable
to enable automated uses such as deployment policies. The current label catalog
was defined by the VEX Working Group and published in the
[Status Justifications](https://www.cisa.gov/sites/default/files/publications/VEX_Status_Justification_Jun22.pdf) document in June 2022.
228+
229+
230+
| Label | Description |
| --- | --- |
| `component_not_present` | The product is not affected by the vulnerability because the component is not included. The status justification may be used to preemptively inform product users who are seeking to understand a vulnerability that is widespread, receiving a lot of attention, or is in similar products. |
| `vulnerable_code_not_present` | The vulnerable component is included in the artifact, but the vulnerable code is not present. Typically, this case occurs when source code is configured or built in a way that excludes the vulnerable code. |
| `vulnerable_code_not_in_execute_path` | The vulnerable code (likely in `subcomponents`) cannot be executed as it is used by the product.<br><br>Typically, this case occurs when the product includes the vulnerable `subcomponent` but does not call or use the vulnerable code. |
| `vulnerable_code_cannot_be_controlled_by_adversary` | The vulnerable code cannot be controlled by an attacker to exploit the vulnerability.<br><br>This justification could be difficult to prove conclusively. |
| `inline_mitigations_already_exist` | The product includes built-in protections or features that prevent exploitation of the vulnerability. These built-in protections cannot be subverted by the attacker and cannot be configured or disabled by the user. These mitigations completely prevent exploitation based on known attack vectors.<br><br>This justification could be difficult to prove conclusively. History is littered with examples of mitigation bypasses, typically involving minor modifications of existing exploit code. |
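
Because both catalogs are fixed, the rule that a `not_affected` status needs a justification can be checked mechanically. The following is a minimal sketch, not a complete validator: `statement_errors` is a hypothetical helper, it treats a statement as a plain dictionary, and it only checks the `status` and `justification` fields.

```python
# Labels copied from the status and justification tables.
STATUSES = {"not_affected", "affected", "fixed", "under_investigation"}
JUSTIFICATIONS = {
    "component_not_present",
    "vulnerable_code_not_present",
    "vulnerable_code_not_in_execute_path",
    "vulnerable_code_cannot_be_controlled_by_adversary",
    "inline_mitigations_already_exist",
}


def statement_errors(statement):
    """Return a list of problems found in the status-related fields."""
    errors = []
    status = statement.get("status")
    if status not in STATUSES:
        errors.append(f"invalid or missing status: {status!r}")
    if status == "not_affected":
        justification = statement.get("justification")
        if justification is None:
            errors.append("not_affected requires a justification")
        elif justification not in JUSTIFICATIONS:
            errors.append(f"unknown justification: {justification!r}")
    return errors


print(statement_errors({"status": "not_affected"}))
print(statement_errors({"status": "not_affected",
                        "justification": "component_not_present"}))
```

Returning a list of messages instead of raising on the first problem lets tooling report every issue in a statement at once.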
237+
238+
## Data Inheritance
239+
240+
VEX statements can inherit values from their document and/or, when embedded or
incorporated into another format, from their [encapsulating document](#encapsulating-document).
242+
243+
A valid impact statement needs to have four key data points which act as
244+
the grammatical parts of a sentence:
245+
246+
- One or more products. These are the direct objects of the statement.
247+
- A status. The status can be thought of as the verb.
248+
- A vulnerability. The vulnerability is the indirect object.
249+
- A timestamp. This is the time complement of the statement. A statement is useless without a timestamp as it cannot be related to others talking about the same subject.
250+
251+
In OpenVEX, timestamps and product identifiers can be defined outside the
252+
statements to avoid defining redundant info or to leverage external features.
253+
254+
__Note:__ While this specification lists these data fields as optional in the
255+
statement data struct, the data MUST be defined to have complete
256+
impact statements. A document with incomplete impact statements is not valid.
257+
258+
#### Data Economy
259+
260+
A document defining four impact statements, all issued at the same time, can be
made less verbose by inferring the statement timestamps from the date the
document was issued.
263+
264+
#### Encapsulating Format
265+
266+
VEX is designed to be encapsulated in other document formats which may have
267+
redundant features or be better at expressing the required data points. For
268+
example, an in-toto attestation can contain a VEX document in its predicate
269+
while its subject section lists the software the VEX data applies to.
270+
271+
Another example is CSAF. The format defines a sophisticated tree that
can specify complex groups and families of products. In this case, product
identification can be left blank in the VEX statement so that the statement
leverages CSAF's powerful product tree features.
275+
276+
### Inheritance Flow
277+
278+
As mentioned, data specifying a statement's product or timestamp can originate
outside the statement. As the data cascades, more specific elements can override
the data defined in more general ones. The following two rules define how the
inheritance flow works:
282+
283+
#### Timestamps
284+
285+
A timestamp in a `statement` entry overrides a timestamp defined at the
286+
document level which in turn overrides timestamps defined on the encapsulating
287+
document.
288+
289+
#### Product ID
290+
291+
A product identifier defined in a `statement` entry overrides any product
292+
identification data defined on the encapsulating document.
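
The two override rules can be sketched as a lookup from the most specific source to the most general one. This is only an illustration under stated assumptions: `effective_timestamp` and `effective_products` are hypothetical helpers, and the encapsulating document is modeled as a plain dictionary with `timestamp` and `products` keys, which real encapsulating formats do not necessarily use.

```python
def effective_timestamp(statement, document, encapsulating=None):
    """Statement timestamp overrides the document's, which overrides
    the encapsulating document's."""
    for source in (statement, document, encapsulating or {}):
        value = source.get("timestamp")
        if value:
            return value
    raise ValueError("incomplete impact statement: no timestamp found")


def effective_products(statement, encapsulating=None):
    """Products in the statement override those of the encapsulating document."""
    products = statement.get("products") or (encapsulating or {}).get("products")
    if not products:
        raise ValueError("incomplete impact statement: no product found")
    return list(products)


document = {"timestamp": "2023-01-09T09:08:42-06:00"}
pinned = {"timestamp": "2023-01-08T18:02:03-06:00",
          "products": ["pkg:example/app"]}
bare = {}
wrapper = {"products": ["pkg:example/wrapped"]}

print(effective_timestamp(pinned, document))    # statement value wins
print(effective_timestamp(bare, document))      # falls back to the document
print(effective_products(bare, wrapper))        # falls back to the wrapper
```

Raising when nothing is found reflects the requirement that a document with incomplete impact statements is not valid.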
293+
294+
### Updating Statements with Inherited Data
295+
296+
When updating a document whose statements rely on inherited data,
the integrity of the untouched statements MUST be preserved. In the following
example, the sole statement has its timestamp data derived from the document:
299+
300+
```json
{
  "@context": "https://openvex.dev/schema-ld.json",
  "@id": "VEX-9fb3463de1b57",
  "author": "Unknown Author",
  "role": "Document Creator",
  "timestamp": "2023-01-08T18:02:03-06:00",
  "version": "1",
  "statements": [
    {
      "vulnerability": "CVE-2023-12345",
      "products": [
        "pkg:apk/wolfi/[email protected]?arch=armv7"
      ],
      "status": "under_investigation"
    }
  ]
}
```
319+
320+
When adding a second statement, the document date needs to be updated. To
preserve the integrity of the original statement, the original document
timestamp is recorded explicitly on that statement. The newly added statement
can inherit the document's new date to avoid duplication:
324+
325+
```json
{
  "@context": "https://openvex.dev/schema-ld.json",
  "@id": "VEX-6ea13336fa2ffb7",
  "author": "Unknown Author",
  "role": "Document Creator",
  "timestamp": "2023-01-09T09:08:42-06:00",
  "version": "1",
  "statements": [
    {
      "timestamp": "2023-01-08T18:02:03-06:00",
      "vulnerability": "CVE-2023-12345",
      "products": [
        "pkg:apk/wolfi/[email protected]?arch=armv7"
      ],
      "status": "under_investigation"
    },
    {
      "vulnerability": "CVE-2023-12345",
      "products": [
        "pkg:apk/wolfi/[email protected]?arch=armv7"
      ],
      "status": "fixed"
    }
  ]
}
```
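
The update shown in the two documents above can be sketched as a small function. This is an illustration, not a prescribed algorithm: `add_statement` is a hypothetical helper that works on documents loaded as dictionaries, and it deliberately leaves `@id` and `version` handling out of scope.

```python
import copy


def add_statement(document, new_statement, issued_at):
    """Return an updated copy of `document` with `new_statement` appended.

    Existing statements that inherited the document timestamp get that
    timestamp written explicitly, so their meaning does not change when
    the document timestamp moves forward.
    """
    updated = copy.deepcopy(document)
    for statement in updated["statements"]:
        # Pin the inherited value; explicit timestamps are left untouched.
        statement.setdefault("timestamp", document["timestamp"])
    updated["timestamp"] = issued_at
    # The new statement may omit its timestamp and inherit `issued_at`.
    updated["statements"].append(copy.deepcopy(new_statement))
    return updated


original = {
    "timestamp": "2023-01-08T18:02:03-06:00",
    "statements": [
        {"vulnerability": "CVE-2023-12345",
         "products": ["pkg:example/app"],
         "status": "under_investigation"},
    ],
}
fixed = {"vulnerability": "CVE-2023-12345",
         "products": ["pkg:example/app"],
         "status": "fixed"}

result = add_statement(original, fixed, "2023-01-09T09:08:42-06:00")
print(result["statements"][0]["timestamp"])
print("timestamp" in result["statements"][1])
```

Working on a deep copy keeps the previously issued document intact, which is useful when both versions need to be retained.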
352+
353+
## Revisions
354+
355+
| Date | Revision |
| --- | --- |
| 2023-01-08 | First Draft of the OpenVEX Specification |
358+
359+
## Sources
360+
361+
status-doc: https://www.cisa.gov/sites/default/files/publications/VEX_Status_Justification_Jun22.pdf
