
Notes from getting things working at FNAL #58

@lgray

Getting to a workable solution required fixing up #51 and #52, making TBranches much lazier objects, and moving around when TFiles are opened, among other changes.
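For anyone skimming, here is a minimal sketch of what "lazier" means in this context, written in Python for brevity. It is not laurelin's code: `LazyBranch`, `fake_loader`, and the `basket_entries` property are hypothetical names, and the only assumption is that the expensive part (opening the file and reading basket metadata) can be deferred until first access and then cached.

```python
from typing import Callable, List, Optional


class LazyBranch:
    """Hypothetical stand-in for a branch whose metadata loads on first use."""

    def __init__(self, name: str, loader: Callable[[], List[int]]):
        self.name = name
        self._loader = loader  # deferred, possibly expensive, file read
        self._basket_entries: Optional[List[int]] = None

    @property
    def basket_entries(self) -> List[int]:
        # Load once, on first access, then reuse the cached result.
        if self._basket_entries is None:
            self._basket_entries = self._loader()
        return self._basket_entries


# Record every call so we can see when the "file" is actually touched.
loader_calls: List[str] = []


def fake_loader() -> List[int]:
    # Stand-in for opening a file and reading basket boundaries.
    loader_calls.append("read")
    return [0, 100, 200]


# Constructing the branch does no I/O; the loader runs only on first access.
branch = LazyBranch("pt", fake_loader)
```

Constructing many such objects up front stays cheap, which is the property that matters when only a few branches end up being read.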

All the code edits (not so many in the end) are here, based on laurelin 0.3.0:
https://github.com/spark-root/laurelin/compare/master...lgray:topic_scaleout_and_laziness?expand=1

I think there's still more to gain, but this yielded a nice 2x improvement in processing a 24GB flat ntuple in our analysis, and it reduces "thread-joins" in the higher-level Spark processing workflow. Performance still needs to be compared against the Vandy cluster on root to establish some sort of baseline. I'm pretty sure more than 2x is possible.

Please note this (my) code is horrible and not at all well optimized, but is meant as an attempt to get things in the right places/shapes.

There's one more exception I need to follow up: somehow it's finding non-monotonic basket entries, but I thought I got that threaded through OK. Update: this last exception has been fixed. The code wasn't properly dealing with the empty baskets you see in some files.
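To illustrate the kind of check involved, here is a rough Python sketch. It is not the actual fix. It assumes an empty basket shows up as a repeated value in the list of basket starting entries, so the boundaries are non-decreasing rather than strictly increasing. The function name `basket_ranges` and the input layout (first entry of each basket, followed by the total entry count) are assumptions for the example.

```python
from typing import List, Tuple


def basket_ranges(basket_entry: List[int]) -> List[Tuple[int, int]]:
    """Return (start, stop) entry ranges for the non-empty baskets.

    basket_entry is assumed to hold the first entry of each basket plus a
    final total, so basket i spans [basket_entry[i], basket_entry[i + 1]).
    """
    ranges = []
    for start, stop in zip(basket_entry, basket_entry[1:]):
        if stop < start:
            # Genuinely out of order: this is a real error.
            raise ValueError(f"basket entries decrease: {start} -> {stop}")
        if stop == start:
            # Empty basket: equal boundaries are legal, just skip it.
            continue
        ranges.append((start, stop))
    return ranges
```

With this layout, `basket_ranges([0, 100, 100, 250])` skips the empty middle basket instead of treating the repeated `100` as a monotonicity violation.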

Tomorrow or Monday I'll post some notes on laurelin master not working. That one seems to be truncated arrays or something.
