-
Notifications
You must be signed in to change notification settings - Fork 502
[rapid7_insightvm] Add asset_vulnerability data stream for Cloud Detection and Response (CDR) workflow #14079
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[rapid7_insightvm] Add asset_vulnerability data stream for Cloud Detection and Response (CDR) workflow #14079
Conversation
🚀 Benchmarks reportTo see the full report comment with |
packages/rapid7_insightvm/elasticsearch/transform/latest_cdr_vulnerabilities/transform.yml
Outdated
Show resolved
Hide resolved
|
Pinging @elastic/security-service-integrations (Team:Security-Service Integrations) |
| tag: set_vulnerability_enumeration | ||
| value: CVE | ||
| # Remove cloud.* fields populated by beat. | ||
| # These fields correspond to EA rather than Tenable hosts and could be misleading. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: comment mentions Tenable, should be Rapid7 I guess
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in 9bd0b7b
| template_path: cel.yml.hbs | ||
| enabled: false | ||
| vars: | ||
| - name: interval |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no initial interval as far as I can see. Do we do full scan periodically? In that case we need to set the @timestamp of the documents to the ingestion time, like we did for AWS SecurityHub full posture data stream https://github.com/elastic/integrations/pull/13372/files#diff-2aebbf76c4f4587a0d3701264a03075f4886df078bb04d4b41e40ec329902488R592 This way we won't have the problem of old documents outside of the retention period
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For Rapid7, it will collect all the vulnerabilities present in all the assets during the first interval. After subsequent intervals, it will collect only the new and remediated vulnerabilities that have introduced since the last interval.
cc: @kcreddy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maxcold, we are going with incremental load to keep lower memory footprint as @brijesh-elastic observed agentless agent restarts in Serverless environments.
But I see your point. We need to match the transform retention with initial_interval, otherwise even if we ingest 2 years worth of data if our transform retention is much lower, all the documents are deleted inside destination indices. Is this correct understanding?
If this is the case, we can add initial_interval like with other incremental integrations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if we go with incremental I think we need to follow other integraitons approach and have initial_interval. It will at least give users the option to ingest some initial data for the CDR flows. But I'm not sure I understand how do we then have the data from 2023 in the envs if you do incremental updates only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I'm not sure I understand how do we then have the data from 2023 in the envs if you do incremental updates only?
In the absence of initial_interval, the first fetch ingests all historical data. All fetches after this will be incremental.
@brijesh-elastic is already working on adding initial_interval option, so that first fetch only has user-defined initial interval data instead of ingesting all historical data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tested cloud security UIs with the changes, everything works well as far as i can tell
packages/rapid7_insightvm/data_stream/asset/elasticsearch/ingest_pipeline/default.yml
Show resolved
Hide resolved
packages/rapid7_insightvm/data_stream/asset_vulnerability/manifest.yml
Outdated
Show resolved
Hide resolved
packages/rapid7_insightvm/data_stream/asset_vulnerability/agent/stream/cel.yml.hbs
Show resolved
Hide resolved
...ata_stream/asset_vulnerability/_dev/test/pipeline/test-asset-vulnerability.log-expected.json
Outdated
Show resolved
Hide resolved
...ata_stream/asset_vulnerability/_dev/test/pipeline/test-asset-vulnerability.log-expected.json
Outdated
Show resolved
Hide resolved
packages/rapid7_insightvm/data_stream/asset_vulnerability/agent/stream/cel.yml.hbs
Outdated
Show resolved
Hide resolved
packages/rapid7_insightvm/data_stream/asset_vulnerability/agent/stream/cel.yml.hbs
Outdated
Show resolved
Hide resolved
packages/rapid7_insightvm/data_stream/asset_vulnerability/agent/stream/cel.yml.hbs
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the behaviour is better, but I still have concerns about #14079 (comment).
packages/rapid7_insightvm/data_stream/asset_vulnerability/agent/stream/cel.yml.hbs
Outdated
Show resolved
Hide resolved
...ata_stream/asset_vulnerability/_dev/test/pipeline/test-asset-vulnerability.log-expected.json
Show resolved
Hide resolved
...ata_stream/asset_vulnerability/_dev/test/pipeline/test-asset-vulnerability.log-expected.json
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding comments I added, LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit only. LGTM for my comments 👍🏼 . Please wait for @efd6 approval.
| - remove: | ||
| field: | ||
| - organization | ||
| - division | ||
| - team | ||
| ignore_missing: true | ||
| if: ctx.organization instanceof String && ctx.division instanceof String && ctx.team instanceof String | ||
| tag: remove_agentless_tags | ||
| description: >- | ||
| Removes the fields added by Agentless as metadata, | ||
| as they can collide with ECS fields. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add a changelog entry for this change (bugfix) similar to #14172
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work. Nit only, then LGTM
| ?"asset": has(state.?cursor.last_interval_time) ? | ||
| optional.none() | ||
| : | ||
| optional.of(("last_scan_end > " + string((timestamp(interval_time) - duration(state.initial_interval)).format(time_layout.RFC3339)))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| optional.of(("last_scan_end > " + string((timestamp(interval_time) - duration(state.initial_interval)).format(time_layout.RFC3339)))), | |
| optional.of(("last_scan_end > " + (timestamp(interval_time) - duration(state.initial_interval)).format(time_layout.RFC3339))), |
timestamp.format returns a string.
💚 Build Succeeded
History
|
|
|
Package rapid7_insightvm - 2.0.0 containing this change is available at https://epr.elastic.co/package/rapid7_insightvm/2.0.0/ |


Proposed commit message
Note
To Reviewers:
Checklist
changelog.ymlfile.How to test this PR locally
Related issues