feat: CSAF remediation support in api/v3/purl/recommend and api/v2/vulnerability/analyze endpoints#2234
feat: CSAF remediation support in api/v3/purl/recommend and api/v2/vulnerability/analyze endpoints#2234Strum355 wants to merge 3 commits intoguacsec:mainfrom
Conversation
Reviewer's GuideAdds CSAF remediation ingestion and exposure across the data model and APIs, including new remediation entities, migrations, and wiring them into CSAF loading, PURL analysis (v2/v3), and OpenAPI schemas. Sequence diagram for CSAF remediation ingestion flowsequenceDiagram
actor UpstreamProvider
participant CsafLoader
participant StatusCreator
participant RemediationCreator
participant DB
UpstreamProvider->>CsafLoader: load(csaf_document)
CsafLoader->>StatusCreator: new(advisory_id, vulnerability_id)
CsafLoader->>StatusCreator: add_all(product_status.*)
CsafLoader->>StatusCreator: create(graph, connection)
activate StatusCreator
StatusCreator->>DB: insert organizations, products, product_status, purl_status
StatusCreator-->>CsafLoader: HashMap~product_id, ProductIdStatusMapping~
deactivate StatusCreator
alt vulnerability_has_remediations
CsafLoader->>RemediationCreator: new(advisory_id, vulnerability_id, product_id_mapping)
loop remediations
CsafLoader->>RemediationCreator: add(remediation)
end
RemediationCreator->>DB: insert remediation rows
RemediationCreator->>DB: insert remediation_purl_status rows
RemediationCreator->>DB: insert remediation_product_status rows
RemediationCreator-->>CsafLoader: Ok
end
CsafLoader-->>UpstreamProvider: load result
ER diagram for new remediation tables and relationshipserDiagram
remediation {
uuid id PK
uuid advisory_id
string vulnerability_id
remediation_category category
string details
string url
jsonb data
}
remediation_purl_status {
uuid remediation_id PK, FK
uuid purl_status_id PK, FK
}
remediation_product_status {
uuid remediation_id PK, FK
uuid product_status_id PK, FK
}
advisory_vulnerability {
uuid advisory_id PK, FK
string vulnerability_id PK, FK
}
purl_status {
uuid id PK
uuid base_purl_id
uuid vulnerability_id
uuid status_id
uuid version_range_id
}
product_status {
uuid id PK
uuid advisory_id
string vulnerability_id
uuid product_version_range_id
}
remediation ||--o{ remediation_purl_status : links
remediation ||--o{ remediation_product_status : links
remediation_purl_status }o--|| purl_status : targets
remediation_product_status }o--|| product_status : targets
advisory_vulnerability ||--o{ remediation : owns
purl_status }o--|| advisory_vulnerability : via_vulnerability
product_status }o--|| advisory_vulnerability : via_vulnerability
Class diagram for new remediation domain model and API exposureclassDiagram
class RemediationCategory {
<<enumeration>>
+VendorFix
+Workaround
+Mitigation
+NoFixPlanned
+NoneAvailable
+WillNotFix
}
class RemediationModel {
+Uuid id
+Uuid advisory_id
+String vulnerability_id
+RemediationCategory category
+Option~String~ details
+Option~String~ url
+serde_json::Value data
}
class RemediationPurlStatusModel {
+Uuid remediation_id
+Uuid purl_status_id
}
class RemediationProductStatusModel {
+Uuid remediation_id
+Uuid product_status_id
}
class RemediationSummary {
+Uuid id
+RemediationCategory category
+Option~String~ details
+Option~String~ url
+serde_json::Value data
+from_entities(remediations: &[remediation::Model]) RemediationSummary[]
}
class VersionedPurlStatus {
+VulnerabilityHead vulnerability
+String status
+Vec~RemediationSummary~ remediations
+from_entity(vuln: vulnerability::Model, status_model: Option~status::Model~, remediations: &[remediation::Model], tx: ConnectionTrait) Result~VersionedPurlStatus, Error~
}
class AnalysisPurlStatus {
+PurlStatus purl_status
+Vec~RemediationSummary~ remediations
}
class AnalysisDetailsV3 {
+VulnerabilityHead head
+Vec~AnalysisPurlStatus~ purl_statuses
}
class VulnerabilityStatus {
+String id
+Option~VexStatus~ status
+Option~VexJustification~ justification
+Vec~RemediationSummary~ remediations
}
class PurlStatus {
}
class ProductStatusModel {
}
class AdvisoryVulnerabilityModel {
+Uuid advisory_id
+String vulnerability_id
}
RemediationModel --> RemediationCategory
RemediationModel --> AdvisoryVulnerabilityModel : belongs_to
RemediationPurlStatusModel --> RemediationModel : many_to_one
RemediationPurlStatusModel --> PurlStatus : many_to_one
RemediationProductStatusModel --> RemediationModel : many_to_one
RemediationProductStatusModel --> ProductStatusModel : many_to_one
RemediationSummary ..> RemediationModel : from_entities
VersionedPurlStatus --> RemediationSummary : uses
AnalysisPurlStatus --> RemediationSummary : has
AnalysisDetailsV3 --> AnalysisPurlStatus : has
VulnerabilityStatus --> RemediationSummary : has
File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- In
RemediationCreator::generate_remediation_uuidyou base the UUID onformat!("{:?}", rem.category), which depends on theDebugrepresentation; consider using a stable string value (e.g. the serialized enum string or DB value) to avoid UUID changes if the enum’sDebugoutput ever changes. - The
RemediationCategoryActiveEnum and migration both define awill_not_fixvariant, but theFrom<&csaf::vulnerability::RemediationCategory> for RemediationCategoryimpl does not handle this case (comment says CSAF 2.1 not supported); if/when the upstream library adds it, this will silently map to nothing, so it may be worth adding a placeholder arm or explicit TODO to avoid future mismatches.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `RemediationCreator::generate_remediation_uuid` you base the UUID on `format!("{:?}", rem.category)`, which depends on the `Debug` representation; consider using a stable string value (e.g. the serialized enum string or DB value) to avoid UUID changes if the enum’s `Debug` output ever changes.
- The `RemediationCategory` ActiveEnum and migration both define a `will_not_fix` variant, but the `From<&csaf::vulnerability::RemediationCategory> for RemediationCategory` impl does not handle this case (comment says CSAF 2.1 not supported); if/when the upstream library adds it, this will silently map to nothing, so it may be worth adding a placeholder arm or explicit TODO to avoid future mismatches.
## Individual Comments
### Comment 1
<location> `modules/fundamental/src/vulnerability/service/mod.rs:477-481` </location>
<code_context>
INNER JOIN version_range ON purl_status.version_range_id = version_range.id
LEFT JOIN vulnerability ON purl_status.vulnerability_id = vulnerability.id
INNER JOIN status ON purl_status.status_id = status.id
+ LEFT JOIN remediation_purl_status rps ON rps.purl_status_id = purl_status.id
+ LEFT JOIN remediation ON remediation.id = rps.remediation_id
WHERE {ns_condition}
</code_context>
<issue_to_address>
**issue (bug_risk):** Remediations get associated per advisory, not per purl status, which can misrepresent purl-specific remediations.
The SQL correctly joins `remediation_purl_status` via `purl_status.id`, but the aggregation later stores remediations only in `remediations_by_advisory: HashMap<Uuid, Vec<RemediationSummary>>`, keyed by `advisory_id`. As a result, every `AnalysisPurlStatus` for an advisory gets the same remediation list, even though `remediation_purl_status` is scoped to specific `purl_status` rows.
If remediations are meant to be purl-specific, consider keying by `purl_status` (or `(advisory_id, purl_status_id)`) and attaching only the corresponding remediations to each `AnalysisPurlStatus` instead of sharing them across the advisory.
</issue_to_address>
### Comment 2
<location> `modules/ingestor/src/service/advisory/csaf/creator.rs:483` </location>
<code_context>
+ fn generate_remediation_uuid(&self, rem: &Remediation) -> Uuid {
+ let mut result = Uuid::new_v5(&REMEDIATION_NAMESPACE, self.advisory_id.as_bytes());
+ result = Uuid::new_v5(&result, self.vulnerability_id.as_bytes());
+ result = Uuid::new_v5(&result, format!("{:?}", rem.category).as_bytes());
+ result = Uuid::new_v5(&result, rem.details.as_bytes());
+ if let Some(url) = &rem.url {
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Using `Debug` formatting of the remediation category in UUID generation is brittle and may change if the upstream enum implementation changes.
Deriving the UUID from `format!("{:?}", rem.category)` ties its stability to the `Debug` implementation of `csaf::RemediationCategory`, which can change (e.g., variant rename or formatting tweak), silently changing UUIDs and breaking idempotency. Prefer a stable representation instead, such as first mapping to your own `RemediationCategory` and using its database string (or another explicitly maintained string mapping) as the `new_v5` input.
Suggested implementation:
```rust
let mut result = Uuid::new_v5(&REMEDIATION_NAMESPACE, self.advisory_id.as_bytes());
result = Uuid::new_v5(&result, self.vulnerability_id.as_bytes());
// Use a stable, non-Debug string representation of the remediation category
// so that UUIDs do not change with Debug formatting tweaks.
let category_key = rem.category.to_string();
result = Uuid::new_v5(&result, category_key.as_bytes());
result = Uuid::new_v5(&result, rem.details.as_bytes());
```
If your codebase already defines an internal, stable representation of remediation categories (e.g., your own `RemediationCategory` enum or a DB string mapping), you can further improve stability by:
1. Mapping `rem.category` to that internal representation, and
2. Using that explicit mapping as the `category_key` instead of `rem.category.to_string()`.
For example:
- Introduce a helper like `fn remediation_category_key(rem: &Remediation) -> &'static str` that returns your own stable strings.
- Replace `let category_key = rem.category.to_string();` with `let category_key = remediation_category_key(rem);`.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
dejanb
left a comment
There was a problem hiding this comment.
This is going in a good direction. Thanks @Strum355
I generally agree with sourcery comments, so let's try to address them
In general, I think we should add more tests to cover cases like this
I opened #2236 for the next step in supporting product statuses, which should provide us with more realistic test data. But even with current synthetic data we should cover multiple purls-advisories combinations to make sure that queries and APIs cover them properly.
…and other review comments
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #2234 +/- ##
==========================================
+ Coverage 69.21% 69.25% +0.03%
==========================================
Files 405 410 +5
Lines 23038 23197 +159
Branches 23038 23197 +159
==========================================
+ Hits 15946 16065 +119
- Misses 6188 6223 +35
- Partials 904 909 +5 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Summary by Sourcery
Add CSAF remediation ingestion, persistence, and exposure across APIs, enriching vulnerability and purl analysis with remediation metadata.
New Features:
Enhancements:
Tests: