Render large JSON files in a memory-efficient way #7

@bentsherman

The usual pattern for rendering a JSON file is to build the equivalent data structure in Groovy code, render it to a JSON string, and write the entire string to a file. For large runs with thousands of tasks, that string could be quite large and cause Nextflow to run out of memory.

First we need to evaluate whether this is a problem in practice. Do some large runs and see how large the resulting prov reports are. If they reach the 100 MB - 1 GB range, then we should probably optimize the rendering code.
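
Before doing real runs, a rough size estimate could come from a synthetic report. The sketch below is in Python for brevity and follows the build-then-render pattern described above. The task fields (`id`, `name`, `script`, `inputs`, `outputs`) and the helpers `make_task` and `render_whole_report` are hypothetical, not the actual report schema, so the numbers only give an order of magnitude.

```python
import json


def make_task(i):
    # Hypothetical task record; the real report's fields will differ.
    return {
        "id": i,
        "name": f"PROCESS_{i}",
        "script": "echo hello",
        "inputs": [f"input_{i}.txt"],
        "outputs": [f"output_{i}.txt"],
    }


def render_whole_report(n_tasks):
    # The usual pattern: build the full structure, then render one big string.
    report = {"tasks": [make_task(i) for i in range(n_tasks)]}
    return json.dumps(report)


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        size_mb = len(render_whole_report(n).encode("utf-8")) / 1e6
        print(f"{n} tasks -> {size_mb:.1f} MB")
```

Real task records (full scripts, many input/output paths) are likely much larger than these toy ones, so actual large runs are still needed to settle the question.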

The memory-efficient approach is to stream the JSON output to the file in pieces, so that the entire report is never held in memory and memory usage does not grow with the number of tasks / outputs.
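
A minimal sketch of the piecewise approach, again in Python and with a hypothetical report shape (`{"tasks": [...]}`): the enclosing brackets are written by hand and each task is serialized and written on its own, so only one task is in memory at a time. The names `write_report_streaming` and `generate_tasks` are illustrative only. On the Groovy side, `groovy.json.StreamingJsonBuilder` writes to a `Writer` and might serve a similar purpose, though that would need to be checked against the plugin's rendering code.

```python
import io
import json


def write_report_streaming(tasks, out):
    """Write {"tasks": [...]} to `out`, one task at a time."""
    out.write('{"tasks":[')
    first = True
    for task in tasks:
        if not first:
            out.write(",")
        # Only this single task is rendered to a string at any moment.
        out.write(json.dumps(task))
        first = False
    out.write("]}")


def generate_tasks(n):
    # Hypothetical lazy task source; yields records instead of building a list.
    for i in range(n):
        yield {"id": i, "name": f"PROCESS_{i}"}


if __name__ == "__main__":
    buf = io.StringIO()  # stand-in for an open file handle
    write_report_streaming(generate_tasks(3), buf)
    print(buf.getvalue())
```

Passing a file opened for writing instead of the `StringIO` buffer keeps memory usage bounded by the largest single task rather than by the whole report.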
