Commit 218e5f4

Author: Vadim Bogulean
Add end-to-end CVE history support and harden loading/startup
- Added end-to-end cvehist support: DB table/migration, models, loader, CLI search, and /api/search/cvehist.
- Extended search filters for CVE history: CVE ID, change dates, change-event, and change-type.
- Hardened loading: upserts for CVE/CPE/history, timeouts, resume support, and cleaner EPSS sync/error handling.
- Improved startup: DB schema migration now runs automatically before the web app starts.
- Updated config/docs for the CVE history API and new load/search commands.
1 parent 4e49dc9 commit 218e5f4

15 files changed, with 706 additions and 139 deletions

Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -19,4 +19,4 @@ COPY ./src ${FCDB_HOME}

 EXPOSE 8000

-CMD ["uvicorn", "web.app:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "4"]
+CMD ["/bin/sh", "-c", "python -m web.prestart && exec uvicorn web.app:app ${FCDB_WEB_PARAMS:---host 0.0.0.0 --port 8000 --workers 4}"]
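The new `CMD` relies on the POSIX shell expansion `${FCDB_WEB_PARAMS:-...}`, which falls back to the default arguments when the variable is unset or empty. The sketch below mimics that fallback in Python purely for illustration. The helper name `uvicorn_args` is hypothetical and not part of the commit, and `shlex.split` only approximates the shell's word splitting.

```python
import shlex

# Default arguments copied from the Dockerfile CMD above.
DEFAULT_WEB_PARAMS = "--host 0.0.0.0 --port 8000 --workers 4"


def uvicorn_args(env: dict) -> list:
    """Mimic ${FCDB_WEB_PARAMS:-default}: use the default when unset OR empty."""
    params = env.get("FCDB_WEB_PARAMS") or DEFAULT_WEB_PARAMS
    return ["uvicorn", "web.app:app", *shlex.split(params)]
```

Because the `:-` form treats an empty value like an unset one, setting `FCDB_WEB_PARAMS=""` still yields the default host, port, and worker count.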

README.md

Lines changed: 14 additions & 5 deletions
@@ -147,17 +147,17 @@ Restore:
 docker compose exec -T fastcve-db sh -c 'pg_restore -U "$POSTGRES_USER" -d "$POSTGRES_DB" --clean --if-exists "/backup/fastcve_vuln_db_YYYY-MM-DD.dump"'
 ```

-- **Populate the DB for the first time (CVE/CPE/CWE/CAPEC/EPSS/KEV)**:
+- **Populate the DB for the first time (CVE/CVE history/CPE/CWE/CAPEC/EPSS/KEV)**:
 ```bash
-docker compose exec fastcve load --data cve cpe cwe capec epss kev
+docker compose exec fastcve load --data cve cvehist cpe cwe capec epss kev
 ```

-- **Update CVE/CPE incrementally (and refresh EPSS/KEV)**:
+- **Update CVE/CVE history/CPE incrementally (and refresh EPSS/KEV)**:
 ```bash
-docker compose exec fastcve load --data cve cpe epss kev
+docker compose exec fastcve load --data cve cvehist cpe epss kev
 ```

-This fetches changes since the last successful update (for CVE/CPE), with an upper limit of `fetch.max.days.period` (default 120 days) enforced by the loader.
+This fetches changes since the last successful update (for CVE/CVE history/CPE), with an upper limit of `fetch.max.days.period` (default 120 days) enforced by the loader.

 If there is a need to repopulate the DB for the CWE/CAPEC info, then `--full` and `--drop` options are available for the load command. `--full` ignores the fact the data is already present and `--drop` drops existing data before loading. When using `--data epss` in combination with `--epss-now`, the loader fetches EPSS data for the current date; otherwise it defaults to the previous day.

@@ -195,6 +195,14 @@ Additional filters are available for CVE search:
 --last-mod-start-date # retrieve only those CVEs that are last modified after the start date
 --last-mod-end-date # retrieve only those CVEs that are last modified before the end date
 ```
+
+- search for the data: **get CVE history rows for a CVE or change pattern**
+```
+docker compose exec fastcve search --search-info cvehist --cve CVE-2024-3094 --change-event 'Initial Analysis' --output json
+```
+
+Above will return the matching CVE history change records. `cvehist` supports filtering by CVE ID, change date range, `--change-event`, and `--change-type`.
+
 - search for the data: **get the valid list of CPE names for a query on part/vendor/product/version etc**.

 ```
@@ -216,6 +224,7 @@ The following endpoints are exposed through HTTP requests
 ```
 /status - DB status
 /api/search/cve - search for CVE data
+/api/search/cvehist - search for CVE history data
 /api/search/cpe - search for CPE data
 /api/search/cwe - search for CWE data
 /api/search/capec - search for CAPEC data
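For readers who want to try the new `/api/search/cvehist` endpoint, here is a minimal sketch of building a request URL. It rests on several assumptions not confirmed by this diff: that the endpoint accepts GET query parameters, that they use the same dashed names as the CLI flags and `SearchOptions` aliases (`cve`, `change-event`, `change-type`, `last-mod-start-date`), and that the service listens on `localhost:8000`. The helper name `cvehist_url` is hypothetical.

```python
from urllib.parse import urlencode


def cvehist_url(base_url: str, **filters) -> str:
    """Build a query URL for the CVE history endpoint.

    Keyword names use underscores and are converted to the dashed
    aliases (an assumption based on the CLI flags); unset filters are dropped.
    """
    query = {name.replace("_", "-"): value
             for name, value in filters.items() if value is not None}
    return f"{base_url}/api/search/cvehist?{urlencode(query)}"


url = cvehist_url(
    "http://localhost:8000",          # assumed host and port
    cve="CVE-2024-3094",
    change_event="Initial Analysis",  # treated as a regexp by the search code
)
```

The resulting URL can be passed to any HTTP client; the parameter names should be checked against the running API before relying on them.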

src/common/models/cve_history.py

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+from datetime import datetime
+from typing import List, Optional, Any
+from pydantic import BaseModel
+
+
+class CveChangeDetail(BaseModel):
+    class Config:
+        extra = 'allow'  # details are not fully stable
+
+    type: Optional[str] = None
+    oldValue: Optional[Any] = None
+    newValue: Optional[Any] = None
+
+
+
+class CveHistoryItem(BaseModel):
+    class Config:
+        extra = 'ignore'
+
+    cveId: str
+    cveChangeId: str
+    eventName: str
+    sourceIdentifier: Optional[str] = None
+    created: datetime
+    details: Optional[List[CveChangeDetail]] = None
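To make the record shape concrete, the sketch below uses a plain dict with the field names from `CveHistoryItem` and `CveChangeDetail`. Every value in `sample_record` is made up for illustration, and the helpers `change_types` and `has_change_type` are hypothetical. The regex check uses Python's `re` as a loose stand-in for the PostgreSQL `~*` operator used by the search code; the two regex dialects are not identical.

```python
import re

# Illustrative record only: field names follow the models above,
# all values are invented placeholders.
sample_record = {
    "cveId": "CVE-2024-3094",
    "cveChangeId": "00000000-0000-0000-0000-000000000000",
    "eventName": "Initial Analysis",
    "sourceIdentifier": "source@example.org",
    "created": "2024-01-01T00:00:00.000",
    "details": [
        {"type": "Reference", "oldValue": None, "newValue": "https://example.org/advisory"},
        {"type": "CVSS V3.1", "newValue": "placeholder"},
    ],
}


def change_types(record: dict) -> list:
    """Collect details[].type, tolerating a missing or null details list."""
    return [detail.get("type") for detail in (record.get("details") or [])]


def has_change_type(record: dict, pattern: str) -> bool:
    """Case-insensitive regex match against any detail type."""
    return any(re.search(pattern, t or "", re.IGNORECASE)
               for t in change_types(record))
```

Since `details` is optional in the model, both helpers treat an absent list as empty rather than raising.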

src/common/models/models.py

Lines changed: 6 additions & 2 deletions
@@ -75,6 +75,8 @@ class SearchInfoType(str, Enum):
     cpe = "cpe"
     capec = "capec"
     status = "status"
+    cvehist = "cvehist"
+


 class CveSeverityV2(str, Enum):
@@ -129,6 +131,8 @@ class SearchOptions(BaseModel):
     lastModEndDate: Optional[date] = Field(default=None, description="Last modified end date", alias="last-mod-end-date")
     pubStartDate: Optional[date] = Field(default=None, description="CVE Published start date", alias="pub-start-date")
     pubEndDate: Optional[date] = Field(default=None, description="CVE Published start date", alias="pub-end-date")
+    changeEvent: Optional[str] = Field(default=None, description="Regexp to filter CVE history by change event name (e.g. 'CVE Modified', 'Initial Analysis')", cli=('--change-event',),alias="change-event")
+    changeType: Optional[str] = Field(default=None, description="Regexp to filter CVE history by change detail type (e.g. 'Reference', 'CVSS V3.1')", cli=('--change-type',), alias="change-type")
     vulnerable: Optional[bool] = Field(default=True, description="CVE found by the CPEs that are marked as vulnerable", alias="vulnerable")
     pageSize: Optional[int] = Field(description="Number of results per page", default=100, alias="page-size", ge=10, le=3000)
     pageIdx: Optional[int] = Field(default=0, description="Starting index", alias="page-idx", ge=0)
@@ -153,8 +157,8 @@ def validate_cpe_name(cls, value):
     def validate_mandatory_fields(cls, inst):
         """Implement the root validator"""

-        # Validate input parameters in case of search-info set as CVE
-        if inst.get('searchInfo', None) in (SearchInfoType.cve, SearchInfoType.cpe):
+        # Validate input parameters in case of search-info set as CVE/CPE/CveHistory
+        if inst.get('searchInfo', None) in (SearchInfoType.cve, SearchInfoType.cpe, SearchInfoType.cvehist):
             if inst['lastModStartDate'] and inst['lastModEndDate'] and inst['lastModStartDate'] > inst['lastModEndDate']:
                 exc = ValueError('Last modified start date must be before last modified end date')
                 raise ValidationError([ErrorWrapper(exc, loc=cls.__fields__['lastModStartDate'].alias)], cls)

src/common/search.py

Lines changed: 124 additions & 1 deletion
@@ -11,15 +11,27 @@

 import re
 import json
+from functools import lru_cache
 from typing import List, Iterator
 from sqlalchemy import Boolean, cast, Numeric, select
 from sqlalchemy.sql import text, expression
 from sqlalchemy.orm import aliased
 from generic import ApplicationContext
-from db.tables import Vuln, VulnCpes, Cpe, Cwe, FetchStatus, Capec
+from db.tables import Vuln, VulnCpes, Cpe, Cwe, FetchStatus, Capec, VulnHistory
 from common.models import SearchOptions, SearchInfoType, OutputType, CPE23_REGEX_STR, metrics_mapping
+from datetime import datetime, time, timezone

 class ValidationError(Exception): ...
+class MissingSearchDataError(ValidationError): ...
+
+
+SEARCH_INFO_DATASETS = {
+    SearchInfoType.cve: dict(model=Vuln, label="CVE", load_arg="cve"),
+    SearchInfoType.cpe: dict(model=Cpe, label="CPE", load_arg="cpe"),
+    SearchInfoType.cwe: dict(model=Cwe, label="CWE", load_arg="cwe"),
+    SearchInfoType.capec: dict(model=Capec, label="CAPEC", load_arg="capec"),
+    SearchInfoType.cvehist: dict(model=VulnHistory, label="CVE history", load_arg="cvehist"),
+}

 # regex used to split the cpe 2.3 into separate pieces
 COLUMN_REGEX = re.compile(r'(?<!\\):')
@@ -30,6 +42,41 @@ def get_non_empty_opts(opts: SearchOptions) -> dict:
     return {k: v for k, v in vars(opts).items() if v is not None}


+# ------------------------------------------------------------------------------
+def get_missing_search_data_message(search_info: SearchInfoType) -> str:
+
+    dataset = SEARCH_INFO_DATASETS[search_info]
+    return (
+        f"No {dataset['label']} data is loaded in the database for search-info={search_info.value}. "
+        f"Load it first with: load --data {dataset['load_arg']}"
+    )
+
+
+# ------------------------------------------------------------------------------
+def get_search_data_cache_bucket() -> str:
+    return datetime.now(timezone.utc).strftime("%Y-%m-%d")
+
+
+# ------------------------------------------------------------------------------
+@lru_cache(maxsize=len(SEARCH_INFO_DATASETS))
+def _ensure_search_data_loaded_cached(search_info: SearchInfoType, cache_bucket: str) -> SearchInfoType:
+
+    dataset = SEARCH_INFO_DATASETS.get(search_info)
+    if not dataset:
+        return search_info
+
+    with ApplicationContext.instance().db as session:
+        row = session.execute(select(dataset['model'].id).limit(1)).first()
+        if row is None:
+            raise MissingSearchDataError(get_missing_search_data_message(search_info))
+    return search_info
+
+
+# ------------------------------------------------------------------------------
+def ensure_search_data_loaded(appctx: ApplicationContext, search_info: SearchInfoType) -> None:
+    _ensure_search_data_loaded_cached(search_info, get_search_data_cache_bucket())
+
+
 # ------------------------------------------------------------------------------
 def search_cves(appctx: ApplicationContext, opts: SearchOptions):

@@ -135,6 +182,79 @@ def search_cves(appctx: ApplicationContext, opts: SearchOptions):

     return result

+# ------------------------------------------------------------------------------
+def search_cvehist(appctx: ApplicationContext, opts: SearchOptions):
+
+    result = {}
+    vh_table = aliased(VulnHistory, name='vh_table')
+
+    with appctx.db as session:
+
+        filters = []
+
+        # filter by CVE IDs
+        if opts.cveId:
+            cve_ids = list({cve_id.upper() for cve_id in opts.cveId})
+            filters.append(vh_table.vuln_id.in_(cve_ids))
+
+        # filter by change event name (regex, case-insensitive)
+        if opts.changeEvent:
+            filters.append(vh_table.event_name.op("~*")(opts.changeEvent))
+
+        # filter by change type inside details[].type (JSONB)
+        if opts.changeType:
+            filters.append(
+                text(
+                    "EXISTS ("
+                    "SELECT 1 "
+                    "FROM jsonb_array_elements(COALESCE(vh_table.data->'details', '[]'::jsonb)) AS detail "
+                    "WHERE detail->>'type' ~* :ct"
+                    ")"
+                ).bindparams(ct=opts.changeType)
+            )
+
+        # filter by change date range
+        if opts.lastModStartDate:
+            start_dt = datetime.combine(opts.lastModStartDate, time.min)
+            filters.append(vh_table.change_date >= start_dt)
+        if opts.lastModEndDate:
+            end_dt = datetime.combine(opts.lastModEndDate, time.max)
+            filters.append(vh_table.change_date <= end_dt)
+
+        filtered_history_query = select(
+            vh_table.vuln_id.label("vuln_id"),
+            vh_table.change_date.label("change_date"),
+            vh_table.data.label("data"),
+        )
+        if filters:
+            filtered_history_query = filtered_history_query.where(*filters)
+        filtered_history = filtered_history_query.cte("filtered_history").prefix_with("MATERIALIZED")
+
+        # 1) paginate matching CVE IDs from the filtered rowset
+        cve_id_subquery = (
+            select(filtered_history.c.vuln_id)
+            .distinct()
+            .order_by(filtered_history.c.vuln_id)
+            .offset(opts.pageIdx * opts.pageSize)
+            .limit(opts.pageSize)
+            .subquery()
+        )
+
+        # 2) return the matching history rows for those paginated CVEs
+        query = (
+            select(filtered_history.c.data)
+            .where(filtered_history.c.vuln_id.in_(select(cve_id_subquery.c.vuln_id)))
+            .order_by(filtered_history.c.vuln_id, filtered_history.c.change_date)
+        )
+
+        rows = session.execute(query).all()
+
+        result = {
+            "search": get_non_empty_opts(opts),
+            "result": [row.data for row in rows],
+        }
+
+    return result

 # ------------------------------------------------------------------------------
 def get_cvss_metric_conditions(cvss_metrics: str, version:str) -> Iterator[dict]:
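The two-step query in `search_cvehist` paginates by distinct CVE ID rather than by history row, so one page can return more rows than `pageSize`. The following in-memory sketch reproduces that logic under simplifying assumptions: rows are `(vuln_id, change_date, data)` tuples that have already passed the filters, and plain Python string ordering stands in for the database collation. The function name `paginate_history` is hypothetical.

```python
def paginate_history(rows: list, page_idx: int, page_size: int) -> list:
    """Return the data of all history rows for one page of CVE IDs."""
    # Step 1: page over the distinct, sorted CVE IDs.
    cve_ids = sorted({vuln_id for vuln_id, _, _ in rows})
    start = page_idx * page_size
    page_ids = set(cve_ids[start:start + page_size])

    # Step 2: keep every row of those CVEs, ordered by ID, then change date.
    selected = sorted(
        (row for row in rows if row[0] in page_ids),
        key=lambda row: (row[0], row[1]),
    )
    return [data for _, _, data in selected]


rows = [
    ("CVE-2", 2, "b2"),
    ("CVE-1", 1, "a1"),
    ("CVE-2", 1, "b1"),
    ("CVE-3", 1, "c1"),
]
first_page = paginate_history(rows, page_idx=0, page_size=2)
second_page = paginate_history(rows, page_idx=1, page_size=2)
```

With a page size of 2, the first page covers `CVE-1` and `CVE-2` and therefore holds three rows; a CVE's history is never split across pages.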
@@ -426,13 +546,15 @@ def search_capec(appctx: ApplicationContext, opts: SearchOptions):
 def search_data(appctx, opts: SearchOptions):

     search_results = {}
+    ensure_search_data_loaded(appctx, opts.searchInfo)

     # search the data based on the input criterias
     if opts.searchInfo == SearchInfoType.status: search_results = get_fetch_status(appctx)
     elif opts.searchInfo == SearchInfoType.cve: search_results = search_cves(appctx, opts)
     elif opts.searchInfo == SearchInfoType.cpe: search_results = search_cpes(appctx, opts)
     elif opts.searchInfo == SearchInfoType.cwe: search_results = search_cwes(appctx, opts)
     elif opts.searchInfo == SearchInfoType.capec: search_results = search_capec(appctx, opts)
+    elif opts.searchInfo == SearchInfoType.cvehist: search_results = search_cvehist(appctx, opts)

     return search_results

@@ -453,6 +575,7 @@ def results_output_id(opts: SearchOptions, search_results):
         SearchInfoType.cpe: 'cpeName',
         SearchInfoType.cwe: 'ID',
         SearchInfoType.capec: 'ID',
+        SearchInfoType.cvehist: 'cveId',
     }

     key = key_names_map[opts.searchInfo]
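`search_data` now calls `ensure_search_data_loaded`, which wraps an `lru_cache`-decorated probe keyed by the dataset and the current UTC date. The sketch below isolates that pattern with a fake probe in place of the database query. The names `_ensure_loaded`, `loaded`, and `probes` are hypothetical, and `LookupError` stands in for `MissingSearchDataError`. It shows two properties of `functools.lru_cache`: a successful result is reused until the date bucket changes, and a raised exception is not cached, so a missing dataset is probed again on the next call.

```python
from functools import lru_cache

loaded = set()   # datasets that "exist" in the fake database
probes = []      # records every real probe that runs


@lru_cache(maxsize=8)
def _ensure_loaded(dataset: str, bucket: str) -> str:
    probes.append((dataset, bucket))  # stands in for the SELECT ... LIMIT 1
    if dataset not in loaded:
        raise LookupError(f"No {dataset} data is loaded")
    return dataset


def probe_count_after(dataset: str, bucket: str) -> int:
    """Call the cached check, swallow the error, and report total probes."""
    try:
        _ensure_loaded(dataset, bucket)
    except LookupError:
        pass
    return len(probes)
```

One consequence of this design is that data dropped during the day goes unnoticed until the next UTC date, because the earlier success stays cached.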

src/common/util.py

Lines changed: 21 additions & 0 deletions
@@ -6,6 +6,7 @@

 import os
 import shutil
+import time
 from alembic.config import Config
 from alembic import command

@@ -48,6 +49,26 @@ def init_db_schema():
     os.chdir(cwd)


+# ------------------------------------------------------------------------------
+def ensure_db_schema(max_attempts=60, retry_delay=1):
+
+    setup_env()
+
+    last_exc = None
+    for attempt in range(max_attempts):
+        try:
+            init_db_schema()
+            return
+        except Exception as exc:  # pragma: no cover
+            last_exc = exc
+            if attempt == max_attempts - 1:
+                break
+            time.sleep(retry_delay)
+
+    if last_exc is not None:  # pragma: no cover
+        raise last_exc
+
+
 # ------------------------------------------------------------------------------
 def setup_env():
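`ensure_db_schema` is a fixed-delay retry loop around `init_db_schema`: it retries on any exception and re-raises the last one once the attempts run out. A generic version of the same pattern is sketched below; the helper `retry` and the `flaky` test function are hypothetical names, not code from the commit.

```python
import time


def retry(action, max_attempts: int = 60, retry_delay: float = 1.0):
    """Run `action` until it succeeds or `max_attempts` is exhausted."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_exc = None
    for attempt in range(max_attempts):
        try:
            return action()
        except Exception as exc:
            last_exc = exc
            # Sleep only when another attempt will follow.
            if attempt < max_attempts - 1:
                time.sleep(retry_delay)
    raise last_exc


attempts = {"count": 0}


def flaky():
    """Fail twice, then succeed, like a database that is still starting."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("database not ready")
    return "ok"


outcome = retry(flaky, max_attempts=5, retry_delay=0)
```

With the commit's defaults of 60 attempts and a 1 second delay, startup waits roughly a minute, plus the time each failed attempt takes, before giving up.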

src/config/setenv/config.ini

Lines changed: 4 additions & 0 deletions
@@ -43,6 +43,9 @@ file.max.count = 10
 ; NIST CVE API
 url.cve = https://services.nvd.nist.gov/rest/json/cves/2.0

+; NIST CVE History API
+url.cvehist = https://services.nvd.nist.gov/rest/json/cvehistory/2.0
+
 ; NIST CPE API
 url.cpe = https://services.nvd.nist.gov/rest/json/cpes/2.0

@@ -67,6 +70,7 @@ api_key = ${NVD_API_KEY}
 ; pause between requests
 request.pause.with_key = 1 #seconds to pause between requests
 request.pause.without_key = 6 #seconds to pause between requests
+worker.timeout = 300 #seconds without worker progress before failing

 ; min time between syncs (sec)
 min.sync.time = 2 * 60 * 60 # sec
Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
+"""add vuln_history table
+
+Revision ID: 91d0157eaab5
+Revises: ecd29e77afe3
+Create Date: 2026-01-27 12:48:44.685028
+
+"""
+from alembic import op
+import sqlalchemy as sa
+from sqlalchemy.dialects import postgresql
+
+
+# revision identifiers, used by Alembic.
+revision = '91d0157eaab5'
+down_revision = 'ecd29e77afe3'
+branch_labels = None
+depends_on = None
+
+
+def upgrade():
+    op.create_table(
+        "vuln_history",
+        sa.Column("id", sa.Integer(), primary_key=True),
+        sa.Column("vuln_id", sa.String(length=20), nullable=False),
+        sa.Column("change_id", sa.String(length=64), nullable=False),
+        sa.Column("event_name", sa.String(length=64)),
+        sa.Column("source", sa.String(length=100)),
+        sa.Column("change_date", sa.DateTime(timezone=True)),
+        sa.Column("data", postgresql.JSONB(), nullable=False),
+        sa.Column(
+            "sys_creation_date",
+            sa.DateTime(timezone=True),
+            server_default=sa.text("CURRENT_TIMESTAMP"),
+            nullable=False,
+        ),
+    )
+
+    op.create_index("ix_vuln_history_vuln_id", "vuln_history", ["vuln_id"])
+    op.create_index("ix_vuln_history_change_id", "vuln_history", ["change_id"], unique=True)
+    op.create_index("ix_vuln_history_event_name", "vuln_history", ["event_name"])
+    op.create_index("ix_vuln_history_change_date", "vuln_history", ["change_date"])
+
+    op.create_index(
+        "ix_vuln_history_data_gin",
+        "vuln_history",
+        ["data"],
+        postgresql_using="gin",
+    )
+
+def downgrade():
+    op.drop_index("ix_vuln_history_data_gin", table_name="vuln_history")
+    op.drop_index("ix_vuln_history_change_date", table_name="vuln_history")
+    op.drop_index("ix_vuln_history_event_name", table_name="vuln_history")
+    op.drop_index("ix_vuln_history_change_id", table_name="vuln_history")
+    op.drop_index("ix_vuln_history_vuln_id", table_name="vuln_history")
+    op.drop_table("vuln_history")
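The commit message mentions upserts for history rows, and the unique index on `change_id` is what would let an `ON CONFLICT` upsert target that column. The loader itself is not shown in this excerpt, so the sketch below is only a guess at the shape of such a statement. It uses an in-memory SQLite database (version 3.24 or later, which supports upserts) as a stand-in for PostgreSQL, a reduced column set, and a hypothetical helper `upsert_change`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced stand-in for the vuln_history table; data is TEXT here, JSONB in the migration.
conn.execute(
    "CREATE TABLE vuln_history ("
    " id INTEGER PRIMARY KEY,"
    " vuln_id TEXT NOT NULL,"
    " change_id TEXT NOT NULL UNIQUE,"
    " event_name TEXT,"
    " data TEXT NOT NULL)"
)


def upsert_change(conn, vuln_id, change_id, event_name, data):
    """Insert a change record, or update it when change_id already exists."""
    conn.execute(
        """
        INSERT INTO vuln_history (vuln_id, change_id, event_name, data)
        VALUES (?, ?, ?, ?)
        ON CONFLICT(change_id) DO UPDATE SET
            vuln_id = excluded.vuln_id,
            event_name = excluded.event_name,
            data = excluded.data
        """,
        (vuln_id, change_id, event_name, data),
    )


# Loading the same change twice leaves a single, updated row.
upsert_change(conn, "CVE-2024-3094", "change-1", "Initial Analysis", "{}")
upsert_change(conn, "CVE-2024-3094", "change-1", "CVE Modified", "{}")
```

PostgreSQL accepts the same `INSERT ... ON CONFLICT (change_id) DO UPDATE SET ... = EXCLUDED....` form, which makes reloading an overlapping date range idempotent.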
