Add streaming support for NGSIEM GetLookupFile downloads #597

@mraible

Feature Request

NGSIEM lookup files can be large (up to 200 MB). Streaming support for downloads would avoid loading the entire file into memory.

Current Behavior

The GetLookupFile method in falcon/client/ngsiem/ returns a GetLookupFileOK struct that contains only response headers (trace ID, rate limits). It has no Payload field, so the file content itself is never exposed to the caller:

type GetLookupFileOK struct {
    XCSTRACEID string
    XRateLimitLimit int64
    XRateLimitRemaining int64
    // No Payload field for file content
}

As a result, the SDK cannot be used to retrieve the contents of a lookup file. For now I make direct HTTP requests to the API endpoint instead of going through the SDK.

Requested Feature

Add streaming support for GetLookupFile, similar to how FalconPy handles this. In FalconPy, you can pass stream=True to get a streaming response:

from falconpy import NGSIEM

ngsiem = NGSIEM()
response = ngsiem.get_file(
    repository="my-repo",
    filename="my-lookup.csv",
    stream=True  # Returns a streaming Response object
)

# Stream to disk with O(1) memory usage
with open("output.csv", "wb") as f:
    for chunk in response.iter_content(chunk_size=8192):
        if chunk:
            f.write(chunk)

Reference: https://github.com/CrowdStrike/falconpy/blob/main/src/falconpy/ngsiem.py#L108

Suggested Implementation

One approach is to accept an io.Writer and stream the response body directly into it. Another is to return an io.ReadCloser and let the caller consume it:

// Option 1: Writer parameter
func (a *Client) GetLookupFileToWriter(params *GetLookupFileParams, writer io.Writer, opts ...ClientOption) (*GetLookupFileOK, error)

// Option 2: Return io.ReadCloser
func (a *Client) GetLookupFileStream(params *GetLookupFileParams, opts ...ClientOption) (io.ReadCloser, *GetLookupFileOK, error)

Use Case

I'm building a Foundry function that syncs threat intelligence IOCs to NGSIEM lookup files. These files can grow to 100-200 MB, so buffering a whole file in memory risks running out of memory. Without streaming support in the SDK, I have to bypass it and make direct HTTP calls.

Thank you for considering this enhancement!
