Deduplicate Backlog Items #708
Description
Motivation
A growing number of backlog items does not scale well, as it floods etcd.
Especially during spikes, e.g. when a new product component is released, the scan workers have a lot on their plate.
We see situations where the artefact-enumerator re-runs faster than the workers can process backlog items.
This leads to multiple backlog items existing for the very same scanner and artefact.
These duplicates add no value, as the workers process the first one and then (in most situations) skip the subsequent scans. Given the aforementioned poor scaling behaviour and the considerable noise this causes in the cluster (and for operators), let's consider a concept to deduplicate backlog items.
Proposals
- We should avoid putting too much load on the API server and should treat the number of requests as the most important efficiency metric. Using a list operation with a label selector results in only one request, as the labels are part of the resource metadata and filtering is done server-side.
- Use digest-based backlog item names
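To make the two proposals more concrete, here is a minimal sketch of how a deterministic, digest-based name and a matching label selector could be derived. Everything in it is an assumption for illustration only: the function names, the `bli` prefix, the label keys `service` and `artefact-digest`, the shape of the artefact dict, and the truncation to 32 hex characters are hypothetical and not taken from the existing code base.

```python
import hashlib
import json

# Hypothetical label keys, chosen for illustration only.
SERVICE_LABEL = 'service'
DIGEST_LABEL = 'artefact-digest'


def artefact_digest(service: str, artefact: dict) -> str:
    # Serialise with sorted keys so that equal inputs always yield the same
    # digest, regardless of the key order in the artefact dict.
    canonical = json.dumps(
        {'service': service, 'artefact': artefact},
        sort_keys=True,
        separators=(',', ':'),
    )
    # Truncated to 32 hex characters so it also fits into a label value
    # (Kubernetes limits label values to 63 characters).
    return hashlib.sha256(canonical.encode('utf-8')).hexdigest()[:32]


def backlog_item_name(service: str, artefact: dict, prefix: str = 'bli') -> str:
    # Assumes `service` is already a valid lowercase DNS label.
    return f'{prefix}-{service}-{artefact_digest(service, artefact)}'


def backlog_item_label_selector(service: str, artefact: dict) -> str:
    # Selector string for a single server-side filtered list request.
    digest = artefact_digest(service, artefact)
    return f'{SERVICE_LABEL}={service},{DIGEST_LABEL}={digest}'
```

With a deterministic name, a second create request for the same scanner and artefact would be rejected by the API server with a 409 Conflict, which the artefact-enumerator could treat as "already queued" without issuing an additional list request. The label selector variant would instead need one list request before each create.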