OOME when there are large amounts of validation reports #42

@RickMoynihan

This occurs when, for example, running the following in the pmd-rdf-validations repo:

$ clojure -M:pmd-qb:validate -e https://staging.gss-data.org.uk/sparql

I think one reason this may occur is the following function:

(defn run-test-cases [test-cases query-variables endpoint]
  (map #(run-test-case % query-variables endpoint) test-cases))

I suspect that Clojure's chunking of lazy seqs into batches of 32 items might cause a space leak here: if a test case returns a large lazy sequence of test-case errors, that sequence may not be released. Out of curiosity I tried hacking around this by putting a (take 1000) over the results inside the with-open, which eagerly loads the data into a vector. However, the leak still seemed to occur, so I suspect there may be other issues here.
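
To illustrate the chunking behaviour mentioned above, here is a minimal REPL sketch. It is not taken from the pmd-rdf-validations code; the `realized` counter and `results` seq are hypothetical names used only for the demonstration. It shows only that asking for one element of a mapped, chunked seq realizes a whole chunk; it does not by itself confirm that chunking is the cause of the OOME.

```clojure
;; Hypothetical counter, used only to observe how many elements get realized.
(def realized (atom 0))

;; `range` yields a chunked seq, so `map` processes it one chunk at a time.
(def results
  (map (fn [x] (swap! realized inc) x)
       (range 100)))

;; Ask for a single element...
(first results)

;; ...and a whole chunk has been realized, not just one item.
@realized
;; => 32
```

If each realized item in `run-test-cases` were itself a large seq of test-case errors, up to 32 of them could be reachable at once under this behaviour.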

It may be worth rewriting this code to be eager using transducers, or to put a configurable limit over the amount of test failures to report on for any given test case.
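
A rough sketch of what that could look like, combining both ideas. This is an assumption-laden illustration rather than the repo's actual code: the `run-test-case` below is a hypothetical stand-in that just fabricates failure maps, and the `max-failures` parameter and the `:failure-count` key are invented for the example.

```clojure
(defn run-test-case
  "Hypothetical stand-in for the real function: returns a lazy seq of
  failure maps for one test case. The real one would query the endpoint."
  [test-case query-variables endpoint]
  (map (fn [i] {:test-case (:name test-case) :failure i})
       (range (:failure-count test-case))))

(defn run-test-cases
  "Eagerly runs every test case, keeping at most `max-failures`
  failures per test case (`max-failures` is an assumed new parameter)."
  [test-cases query-variables endpoint max-failures]
  (into []
        (map (fn [test-case]
               ;; The `take` transducer stops the reduction early, so each
               ;; test case contributes a bounded, fully realized vector.
               (into []
                     (take max-failures)
                     (run-test-case test-case query-variables endpoint))))
        test-cases))

;; Example call with made-up test cases and a placeholder endpoint.
(def example-report
  (run-test-cases [{:name "a" :failure-count 5}
                   {:name "b" :failure-count 0}]
                  {}
                  "http://localhost/sparql"
                  2))
```

Here each test case yields a vector capped at `max-failures` entries, so the report no longer holds on to arbitrarily long lazy sequences. Whether this resolves the OOME would still need to be checked, given that the `(take 1000)` experiment did not.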
