Commit 2f05e56

feat: Add script to generate OpenVEX file
This change introduces the `generate_openvex.py` script, which converts the `VEX.cyclonedx.xml` file into a compliant OpenVEX JSON document.

### Highlights

* Adds a Python script to automate VEX conversion from CycloneDX format to OpenVEX.
* Generates a fully populated OpenVEX document based on vulnerability analysis data in `VEX.cyclonedx.xml`.

### Additional Fixes

* Corrects a non-unique `serialNumber` (UUID) that was mistakenly copy-pasted from `commons-bcel`.
* Removes unintended indentation from the explanation text, ensuring valid Markdown formatting.
1 parent f5908a7 commit 2f05e56

4 files changed: 229 additions and 8 deletions.

src/conf/security/README.md

Lines changed: 14 additions & 0 deletions
@@ -40,6 +40,10 @@ An experimental [VEX](https://cyclonedx.org/capabilities/vex/) document is also
 
 👉 [`https://raw.githubusercontent.com/apache/commons-text/refs/heads/master/src/conf/security/VEX.cyclonedx.xml`](VEX.cyclonedx.xml)
 
+It is also available in [OpenVEX format](https://github.com/openvex/spec) at:
+
+👉 [`https://raw.githubusercontent.com/apache/commons-text/refs/heads/master/src/conf/security/openvex.json`](openvex.json)
+
 This document provides information about the **exploitability of known vulnerabilities** in the **dependencies** of Apache Commons Text.
 
 ### When is a dependency vulnerability exploitable?
@@ -59,3 +63,13 @@ Because Apache Commons libraries (including Text) do **not** bundle their depend
 * The `analysis` field in the VEX file uses **Markdown** formatting.
 
 For more information about CycloneDX, SBOMs, or VEX, visit [cyclonedx.org](https://cyclonedx.org/).
+
+## Contributing
+
+To add or update a VEX entry:
+
+* Edit the CycloneDX VEX document:
+  1. Increase the `version` attribute in the `<bom>` element.
+  2. Update the `timestamp` in the `<metadata>` section.
+  3. Make your changes to the vulnerability information.
+* Regenerate the `openvex.json` file by running the `generate-openvex.sh` script.
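
The first two Contributing steps (bump `version`, refresh `timestamp`) can be sketched in Python as follows. This is only an illustration: `bump_bom` and `CDX_NS` are hypothetical names that do not appear in the commit, and the sample document is a cut-down stand-in for `VEX.cyclonedx.xml`.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# CycloneDX 1.6 namespace, as declared on the <bom> element in the diff.
CDX_NS = "http://cyclonedx.org/schema/bom/1.6"


def bump_bom(root: ET.Element, now: datetime) -> None:
    """Increment the BOM version and refresh the metadata timestamp in place."""
    root.set("version", str(int(root.get("version", "1")) + 1))
    ts = root.find(f"{{{CDX_NS}}}metadata/{{{CDX_NS}}}timestamp")
    if ts is None:
        raise ValueError("Missing <timestamp> in <metadata>")
    ts.text = now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Cut-down stand-in for the real document (assumption: same element layout).
sample = (
    f'<bom xmlns="{CDX_NS}" version="2">'
    "<metadata><timestamp>2025-07-29T12:26:42Z</timestamp></metadata>"
    "</bom>"
)
root = ET.fromstring(sample)
bump_bom(root, datetime(2025, 8, 1, 9, 0, 0, tzinfo=timezone.utc))
```

Note that ElementTree's default parser discards XML comments and may rewrite namespace prefixes on output, so for the real file, which carries an explanatory comment header, editing by hand as the README describes is likely the safer route.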

src/conf/security/VEX.cyclonedx.xml

Lines changed: 13 additions & 8 deletions
@@ -19,12 +19,13 @@
   To update this document:
   1. Increment the `version` attribute in the <bom> element.
   2. Update the `timestamp` in the <metadata> section.
+  3. Regenerate the `openvex.json` file using the `generate-openvex.sh` script.
 -->
 <bom xmlns="http://cyclonedx.org/schema/bom/1.6"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://cyclonedx.org/schema/bom/1.6 https://cyclonedx.org/schema/bom-1.6.xsd"
-     serialNumber="urn:uuid:f70dec29-fc7d-41f2-8c60-97e9075e0e73"
-     version="1">
+     serialNumber="urn:uuid:9d64577b-0376-4ee7-b154-5ec26a1803f4"
+     version="2">
 
   <metadata>
     <timestamp>2025-07-29T12:26:42Z</timestamp>
@@ -51,6 +52,10 @@
   <vulnerabilities>
     <vulnerability>
       <id>CVE-2025-48924</id>
+      <source>
+        <name>NVD</name>
+        <url>https://nvd.nist.gov/vuln/detail/CVE-2025-48924</url>
+      </source>
       <references>
         <reference>
           <id>GHSA-j288-q9x7-2f5v</id>
@@ -65,14 +70,14 @@
           <response>update</response>
         </responses>
         <detail>
-          CVE-2025-48924 is exploitable in Apache Commons Text versions 1.5 and later, but only when all the following conditions are met:
+CVE-2025-48924 is exploitable in Apache Commons Text versions 1.5 and later, but only when all the following conditions are met:
 
-          * The consuming project includes a vulnerable version of Commons Text on the classpath.
-            As of version `1.14.1`, Commons Text no longer references a vulnerable version of the `commons-lang3` library in its POM file.
-          * Unvalidated or unsanitized user input is passed to the `StringSubstitutor` or `StringLookup` classes.
-          * An interpolator lookup created via `StringLookupFactory.interpolatorLookup()` is used.
+* The consuming project includes a vulnerable version of Commons Text on the classpath.
+  As of version `1.14.1`, Commons Text no longer references a vulnerable version of the `commons-lang3` library in its POM file.
+* Unvalidated or unsanitized user input is passed to the `StringSubstitutor` or `StringLookup` classes.
+* An interpolator lookup created via `StringLookupFactory.interpolatorLookup()` is used.
 
-          If these conditions are satisfied, an attacker may cause an infinite loop by submitting a specially crafted input such as `${const:...}`.
+If these conditions are satisfied, an attacker may cause an infinite loop by submitting a specially crafted input such as `${const:...}`.
         </detail>
         <firstIssued>2025-07-29T12:26:42Z</firstIssued>
         <lastUpdated>2025-07-29T12:26:42Z</lastUpdated>
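
The `serialNumber` fix above replaces a value copied from another project. One way to avoid that kind of reuse is to generate a fresh random UUID for each new document. The sketch below is illustrative only: `new_serial_number` and `is_valid_serial_number` are hypothetical helpers, not part of the commit.

```python
import uuid


def new_serial_number() -> str:
    """Return a fresh CycloneDX-style serial number (urn:uuid:<random UUID>)."""
    return f"urn:uuid:{uuid.uuid4()}"


def is_valid_serial_number(value: str) -> bool:
    """Check that a value has the urn:uuid: prefix followed by a parseable UUID."""
    prefix = "urn:uuid:"
    if not value.startswith(prefix):
        return False
    try:
        # uuid.UUID is lenient about formatting (for example, it tolerates
        # missing hyphens), so this is a sanity check, not a schema validation.
        uuid.UUID(value[len(prefix):])
    except ValueError:
        return False
    return True
```

A check like this confirms only the shape of the value. Detecting a duplicate across projects would still require comparing against the other documents.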
Lines changed: 170 additions & 0 deletions
@@ -0,0 +1,170 @@
+#!/usr/bin/env python3
+import xml.etree.ElementTree as ET
+import json
+from datetime import datetime, timezone
+
+NAMESPACES = {
+    'b': 'http://cyclonedx.org/schema/bom/1.6'
+}
+
+
+def _find_element(parent: ET.Element, tag: str) -> ET.Element | None:
+    return parent.find(tag, NAMESPACES)
+
+
+def _find_stripped_text(parent: ET.Element, tag: str) -> str | None:
+    el = _find_element(parent, tag)
+    return el.text.strip() if el is not None else None
+
+
+def _add_optional_date(parent: ET.Element, tag: str, target: dict, key: str) -> None:
+    el = _find_element(parent, tag)
+    if el is not None and el.text:
+        try:
+            dt = datetime.fromisoformat(el.text.strip()).astimezone(timezone.utc)
+            target[key] = dt.isoformat().replace('+00:00', 'Z')
+        except ValueError as e:
+            raise ValueError(f"Invalid ISO date format in <{tag}>: {el.text}") from e
+
+
+def load_cyclonedx(path: str = 'VEX.cyclonedx.xml') -> ET.Element:
+    return ET.parse(path).getroot()
+
+
+def to_openvex(root: ET.Element) -> dict:
+    serial_number = root.get('serialNumber')
+    if not serial_number:
+        raise ValueError("CycloneDX BOM must have a 'serialNumber' attribute")
+
+    version = int(root.get('version', '1'))
+
+    result = {
+        '@context': 'https://openvex.dev/ns/v0.2.0',
+        '@id': f"https://commons.apache.org/security/vex/{serial_number}",
+        'author': 'Apache Commons Security Team <[email protected]>',
+        'role': 'Security Team',
+        'version': version,
+        'tooling': (
+            "This document was automatically converted from the `VEX.cyclonedx.xml` file.\n"
+            "Do not edit this file directly, run `generate_openvex.py` to regenerate it."
+        )
+    }
+
+    _add_optional_date(root, 'b:metadata/b:timestamp', result, 'timestamp')
+
+    component = _find_element(root, 'b:metadata/b:component')
+    if component is None:
+        raise ValueError("Missing <component> in <metadata>")
+
+    product = to_openvex_product(component)
+
+    result['statements'] = [
+        to_openvex_statement(vuln, product)
+        for vuln in root.findall('.//b:vulnerability', NAMESPACES)
+    ]
+
+    return result
+
+
+def to_openvex_product(component: ET.Element) -> dict:
+    purl = _find_element(component, 'b:purl')
+    if purl is None or not purl.text:
+        raise ValueError("Component must include a non-empty <purl> element")
+
+    return {
+        '@id': purl.text,
+        'identifiers': {
+            'purl': purl.text
+        }
+    }
+
+
+def to_openvex_vulnerability(vuln: ET.Element) -> dict:
+    cdx_id = _find_stripped_text(vuln, 'b:id')
+    if not cdx_id:
+        raise ValueError("Vulnerability must have an <id>")
+
+    entry = {'name': cdx_id}
+
+    source = _find_element(vuln, 'b:source')
+    if source is not None:
+        entry['@id'] = _find_stripped_text(source, 'b:url')
+
+    entry['aliases'] = [
+        _find_stripped_text(ref, 'b:id')
+        for ref in vuln.findall('b:references/b:reference', NAMESPACES)
+    ]
+
+    return entry
+
+
+def to_openvex_statement(vuln: ET.Element, product: dict) -> dict:
+    analysis = _find_element(vuln, 'b:analysis')
+    if analysis is None:
+        raise ValueError("Missing <analysis> in vulnerability")
+
+    state = _find_stripped_text(analysis, 'b:state')
+    if not state:
+        raise ValueError("Missing <state> in vulnerability analysis")
+
+    statement = {
+        'products': [product],
+        'vulnerability': to_openvex_vulnerability(vuln),
+        'status': to_openvex_status(state)
+    }
+
+    justification = _find_stripped_text(analysis, 'b:justification')
+    if justification:
+        statement['justification'] = to_openvex_justification(justification)
+
+    detail = _find_stripped_text(analysis, 'b:detail')
+    if detail:
+        statement['status_notes'] = detail
+
+    _add_optional_date(analysis, 'b:firstIssued', statement, 'timestamp')
+    _add_optional_date(analysis, 'b:lastUpdated', statement, 'last_updated')
+
+    return statement
+
+
+def to_openvex_status(cdx_status: str) -> str:
+    mapping = {
+        "resolved": "fixed",
+        "exploitable": "affected",
+        "in_triage": "under_investigation",
+        "false_positive": "not_affected",
+        "not_affected": "not_affected"
+    }
+    status = mapping.get(cdx_status.strip().lower())
+    if not status:
+        raise ValueError(f"Unknown CycloneDX status: '{cdx_status}'")
+    return status
+
+
+def to_openvex_justification(cdx_justification: str) -> str:
+    mapping = {
+        "code_not_present": "vulnerable_code_not_present",
+        "code_not_reachable": "vulnerable_code_not_in_execute_path",
+        "requires_configuration": "vulnerable_code_cannot_be_controlled_by_adversary",
+        "requires_dependency": "component_not_present",
+        "requires_environment": "vulnerable_code_cannot_be_controlled_by_adversary",
+        "protected_by_compiler": "inline_mitigations_already_exist",
+        "protected_at_runtime": "inline_mitigations_already_exist",
+        "protected_by_mitigating_control": "inline_mitigations_already_exist"
+    }
+    result = mapping.get(cdx_justification.strip().lower())
+    if not result:
+        raise ValueError(f"Unknown CycloneDX justification: '{cdx_justification}'")
+    return result
+
+
+def main():
+    cyclonedx_root = load_cyclonedx()
+    openvex_doc = to_openvex(cyclonedx_root)
+    with open('openvex.json', 'w') as f:
+        json.dump(openvex_doc, f, indent=2)
+    print("OpenVEX document written to 'openvex.json'")
+
+
+if __name__ == "__main__":
+    main()
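
Two edge cases in the script above may be worth guarding against. First, `_find_stripped_text` calls `.strip()` on `el.text`, which is `None` for an empty element such as `<detail/>`. Second, `datetime.fromisoformat` accepts a trailing `Z` only from Python 3.11 onward, so timestamps like `2025-07-29T12:26:42Z` would be rejected on older interpreters. The following is a hedged sketch of more defensive variants. `find_stripped_text` and `parse_utc` are hypothetical names, and treating timestamps without an offset as UTC is an assumption that differs from the script, which would interpret them as local time.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from typing import Optional

NAMESPACES = {"b": "http://cyclonedx.org/schema/bom/1.6"}


def find_stripped_text(parent: ET.Element, tag: str) -> Optional[str]:
    """Return the stripped text of a child element, or None if absent or empty."""
    el = parent.find(tag, NAMESPACES)
    if el is None or el.text is None:
        return None
    return el.text.strip() or None


def parse_utc(text: str) -> str:
    """Normalize an ISO 8601 timestamp to UTC with a trailing 'Z'."""
    value = text.strip()
    # Rewrite the 'Z' suffix so that pre-3.11 interpreters can parse it.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    dt = datetime.fromisoformat(value)
    if dt.tzinfo is None:
        # Assumption: timestamps without an offset are meant as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")


# Small stand-in element to exercise the empty-element case.
vuln = ET.fromstring(
    '<vulnerability xmlns="http://cyclonedx.org/schema/bom/1.6">'
    "<id> CVE-2025-48924 </id><detail/>"
    "</vulnerability>"
)
```

With a helper like `find_stripped_text`, the `aliases` list comprehension could also skip references whose `<id>` is missing, instead of emitting `null` entries in the JSON output.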

src/conf/security/openvex.json

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+{
+  "@context": "https://openvex.dev/ns/v0.2.0",
+  "@id": "https://commons.apache.org/security/vex/urn:uuid:9d64577b-0376-4ee7-b154-5ec26a1803f4",
+  "author": "Apache Commons Security Team <[email protected]>",
+  "role": "Security Team",
+  "version": 2,
+  "tooling": "This document was automatically converted from the `VEX.cyclonedx.xml` file.\nDo not edit this file directly, run `generate_openvex.py` to regenerate it.",
+  "timestamp": "2025-07-29T12:26:42Z",
+  "statements": [
+    {
+      "products": [
+        {
+          "@id": "pkg:maven/org.apache.commons/commons-text?type=jar",
+          "identifiers": {
+            "purl": "pkg:maven/org.apache.commons/commons-text?type=jar"
+          }
+        }
+      ],
+      "vulnerability": {
+        "name": "CVE-2025-48924",
+        "@id": "https://nvd.nist.gov/vuln/detail/CVE-2025-48924",
+        "aliases": [
+          "GHSA-j288-q9x7-2f5v"
+        ]
+      },
+      "status": "affected",
+      "status_notes": "CVE-2025-48924 is exploitable in Apache Commons Text versions 1.5 and later, but only when all the following conditions are met:\n\n* The consuming project includes a vulnerable version of Commons Text on the classpath.\n As of version `1.14.1`, Commons Text no longer references a vulnerable version of the `commons-lang3` library in its POM file.\n* Unvalidated or unsanitized user input is passed to the `StringSubstitutor` or `StringLookup` classes.\n* An interpolator lookup created via `StringLookupFactory.interpolatorLookup()` is used.\n\nIf these conditions are satisfied, an attacker may cause an infinite loop by submitting a specially crafted input such as `${const:...}`.",
+      "timestamp": "2025-07-29T12:26:42Z",
+      "last_updated": "2025-07-29T12:26:42Z"
+    }
+  ]
+}
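
A generated document like the one above could be given a light sanity check before it is committed. The sketch below is not part of the commit and is not a schema validation. `check_openvex` is a hypothetical helper, and the list of required top-level fields is an assumption based on a reading of the OpenVEX v0.2.0 specification. The status and justification values are the ones the conversion script maps to.

```python
from typing import List

VALID_STATUSES = {"not_affected", "affected", "fixed", "under_investigation"}
VALID_JUSTIFICATIONS = {
    "component_not_present",
    "vulnerable_code_not_present",
    "vulnerable_code_not_in_execute_path",
    "vulnerable_code_cannot_be_controlled_by_adversary",
    "inline_mitigations_already_exist",
}
# Assumption: top-level fields treated as mandatory for this check.
REQUIRED_KEYS = ("@context", "@id", "author", "timestamp", "version", "statements")


def check_openvex(doc: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = [
        f"missing top-level field: {key}" for key in REQUIRED_KEYS if key not in doc
    ]
    for index, statement in enumerate(doc.get("statements", [])):
        status = statement.get("status")
        if status not in VALID_STATUSES:
            problems.append(f"statement {index}: unexpected status {status!r}")
        justification = statement.get("justification")
        if justification is not None and justification not in VALID_JUSTIFICATIONS:
            problems.append(
                f"statement {index}: unexpected justification {justification!r}"
            )
    return problems


# Abbreviated copy of the generated document, for illustration only.
sample = {
    "@context": "https://openvex.dev/ns/v0.2.0",
    "@id": "https://commons.apache.org/security/vex/urn:uuid:9d64577b-0376-4ee7-b154-5ec26a1803f4",
    "author": "Apache Commons Security Team",
    "version": 2,
    "timestamp": "2025-07-29T12:26:42Z",
    "statements": [{"status": "affected"}],
}
problems = check_openvex(sample)
```

In practice the input would come from `json.load` on `openvex.json` rather than from an inline dictionary.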
