Description
Technically this isn't a bug but tech debt (that is, we knew in advance it might become a problem and postponed solving it), but we're now observing a degradation in user experience because of it.
We're currently creating new containers in the database even when the CVE ID already exists:
nix-security-tracker/src/shared/fetchers.py, lines 287 to 289 at 8e3c303:

```python
if record is not None:
    # TODO: Remove stale data to prevent overgrowth
    pass
```
This leads to new matches being triggered for arbitrarily old CVEs, because matches happen on container insertion. Recently upstream decided to retroactively add microsecond precision (!) to publication dates, which produced >2k redundant items.
We should instead update our existing data in such a case.
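The intended behavior could be sketched as an upsert: update the existing container in place when the CVE ID is already known, and only run matching on first insertion. This is a minimal in-memory sketch, not the tracker's actual code: the `db` dict stands in for the ORM lookup, and `trigger_matches` is a hypothetical hook for the match-on-insertion behavior described above.

```python
def upsert_container(db, cve_id, data, trigger_matches):
    """Insert or update a CVE container.

    `db` maps CVE IDs to stored container data; in the real tracker this
    would be an ORM lookup plus an UPDATE instead of a fresh INSERT.
    """
    record = db.get(cve_id)
    if record is None:
        # Genuinely new CVE: insert it and run matching once.
        db[cve_id] = data
        trigger_matches(cve_id)
        return "inserted"
    # Known CVE (e.g. re-published upstream with microsecond-precision
    # dates): refresh the stored data in place. No new container is
    # created, so no redundant matches are triggered.
    db[cve_id] = data
    return "updated"
```

With this shape, a retroactive upstream re-publication of an old CVE updates the record but produces no new match items.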
Note
We may want to consider dropping the custom data model and ingest JSON into Postgres directly instead. We can still have structured data in application code using the upstream schema with generated Pydantic models. And at the moment we're processing each CVE separately and only once anyway, so there should be no issue with querying aggregate data. Such a change would require quite a bit of rewiring though.
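The alternative could look roughly like this: store the raw upstream JSON (e.g. in a jsonb column) and parse it into structured objects in application code on demand. A minimal sketch, using a dataclass as a stand-in for a model generated from the upstream CVE schema (the real setup would presumably use generated Pydantic models, as noted above; the field names follow the CVE JSON record format):

```python
import json
from dataclasses import dataclass


@dataclass
class CveMetadata:
    """Stand-in for a model generated from the upstream CVE JSON schema."""
    cve_id: str
    date_published: str


def load_metadata(raw: str) -> CveMetadata:
    """Parse a raw CVE record (as stored verbatim in Postgres) into a
    structured object for application code."""
    doc = json.loads(raw)
    meta = doc["cveMetadata"]
    return CveMetadata(
        cve_id=meta["cveId"],
        date_published=meta["datePublished"],
    )
```

Since each CVE is processed separately, parsing one blob at a time like this would be enough; aggregate queries could still go through jsonb operators in Postgres.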