Skip to content

Bounded memory sorting #32

@facundominguez

Description

@facundominguez

While writing an analysis tool, the need for sorting the events in an eventlog file arose. The problem with the existing GHC.RTS.Events.sortEvents is that it is not memory bounded, leading to heap exhaustion on large inputs.

How about we provide a memory bounded version?

I ended up writing something similar to this. It splits events per capability to different temporary files, and then merges the events back into a single list.

I could submit a PR if it looks useful to others.

This version is a bit wasteful in that it has to deserialize and serialize all the events, where for the sake of splitting the events, only the capability number and the encoded form of the event would be needed.

Another odd aspect is that I couldn't make it stream when using Data.ByteString.Builder to write the events to the temporary files. Heap would be exhausted the same.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions