[mimecast.siem_logs] Refactor to reduce memory pressure #16022

@andrewkroh

Description

The Mimecast integration's CEL-based siem_logs data stream can cause significant memory pressure, potentially leading to Elastic Agent OOM restarts. The current implementation downloads and processes all available event batches from the Mimecast API within a single execution cycle, so memory usage spikes when a large number of events are available.

Version: Mimecast 3.2.1

Problem

The CEL program for the siem_logs data stream does the following in a single execution:

  1. Fetches a list of batch URLs from the Mimecast API (/siem/v1/batch/events/cg).
  2. Downloads the gzipped NDJSON files from all of these URLs.
  3. Processes all the events from these files.

If the API returns many URLs, or the files are large, holding all of the events in memory for processing can cause memory-pressure issues (stalls, OOM kills, unresponsiveness to heartbeats).
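To illustrate the shape of the problem, here is a minimal Python sketch of the all-in-one-execution pattern described above. It is not the integration's CEL program. The function names, the fake URLs, and the in-memory `FAKE_FILES` store are all assumptions made for illustration, and no real Mimecast API call is made.

```python
import gzip
import json


def decode_batch(blob: bytes) -> list:
    """Decompress one gzipped NDJSON batch file into a list of events."""
    text = gzip.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def run_all_at_once(list_batch_urls, download) -> list:
    """Hypothetical single execution: every batch is downloaded and decoded
    before anything is returned, so all events are held in memory together."""
    events = []
    for url in list_batch_urls():
        events.extend(decode_batch(download(url)))
    return events


def make_blob(count: int) -> bytes:
    """Build a fake gzipped NDJSON file with `count` events (illustration only)."""
    lines = "\n".join(json.dumps({"id": i}) for i in range(count))
    return gzip.compress(lines.encode("utf-8"))


# Stand-in for the batch files behind the URLs returned by the API.
FAKE_FILES = {
    "https://example.invalid/batch-1": make_blob(3),
    "https://example.invalid/batch-2": make_blob(2),
}

all_events = run_all_at_once(lambda: list(FAKE_FILES), FAKE_FILES.__getitem__)
```

In this pattern the peak number of events held at once grows with the sum of all batch sizes, rather than with the size of the largest single batch.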

Proposed Solution

To mitigate this, the CEL program should be refactored to process only one batch file per execution. This can be achieved by introducing a new work list in the cursor (e.g., cursor.blobs).

The refactored logic would be:

  1. If the cursor.blobs work list is not empty, take the first URL from the list, download and process that single batch file.
  2. If cursor.blobs is empty, call the Mimecast API to get a new list of batch URLs and populate the cursor.blobs list with them. Then, process the first URL in the list.
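The two steps above can be sketched in Python as follows. This is only an illustration of the work-list idea under stated assumptions, not the proof-of-concept CEL code: `run_once`, `list_batch_urls`, `download_events`, and the returned `want_more` flag are hypothetical names, and the cursor is modelled as a plain dict with a `blobs` key.

```python
def run_once(cursor, list_batch_urls, download_events):
    """Process at most one batch file per execution.

    cursor          -- dict persisted between executions (assumed shape)
    list_batch_urls -- hypothetical callable returning new batch URLs
    download_events -- hypothetical callable returning the events of one URL

    Returns (events, new_cursor, want_more).
    """
    blobs = list(cursor.get("blobs", []))

    # Step 2: only ask the API for more URLs when the work list is empty.
    if not blobs:
        blobs = list(list_batch_urls())

    # Nothing available: emit no events and keep an empty work list.
    if not blobs:
        return [], {**cursor, "blobs": []}, False

    # Step 1: take the first URL, process that single batch file,
    # and store the remaining URLs for later executions.
    url, remaining = blobs[0], blobs[1:]
    events = download_events(url)
    return events, {**cursor, "blobs": remaining}, bool(remaining)


# Usage with stand-in data: two batches are drained over two executions.
fake_batches = {"url-a": [{"id": 1}, {"id": 2}], "url-b": [{"id": 3}]}
cursor = {}
seen = []
while True:
    events, cursor, want_more = run_once(
        cursor, lambda: list(fake_batches), fake_batches.__getitem__
    )
    seen.append(len(events))
    if not want_more:
        break
```

Each call holds the events of one batch only, and the URL list is fetched again only after the stored work list has been drained.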

This change will significantly reduce the memory footprint of the integration by ensuring that only one batch of events is processed at a time.

Proof of Concept

A proof-of-concept implementation of this approach is available in this branch: https://github.com/andrewkroh/integrations/commits/mimecast-siem-batch-per-execution/

Additional Context

A temporary workaround of making the page_size configurable was implemented in PR elastic/integrations#15942. However, this is not an ideal long-term solution as it can lead to inefficient use of API tokens. This issue is for tracking the implementation of the more robust, long-term solution.

Labels

Integration:mimecast (Mimecast, Partner supported)
Team:Security-Service Integrations (Security Service Integrations team [elastic/security-service-integrations])
bug (Something isn't working, use only for issues)