Skip to content

Multithreaded builds have less memory efficient node ordering #241

@BoyBaykiller

Description

@BoyBaykiller

When building a top-down BVH in a depth first manner the left of the left child comes after the right child:

                   /     \
           left-> 42      43 <-right
                 / \      / \ 
left of left-> 44   45   321 322

This means whenever we traverse down left (half of the time) we just have to read the next nodes in the array which may be in some cache already.

However the current multithreaded builds don't have this nice ordering because an atomic counter is used:

if (threadedBuild) n = atomicNewNodePtr->fetch_add( 2 ); else n = newNodePtr, newNodePtr += 2;

So we may end up with something like this instead:

                   /     \
           left-> 42      43 <-right
                 / \      / \ 
left of left-> 612 613  266 267

Depth first vs "atomic order" makes a small difference in my GPU PathTracer: 187fps vs 182fps.

The fix is to remove atomic counter and compute deterministic node offsets assuming 1PPL (primitive per leaf):

leftOfLeft = right + 1
leftOfRight = right + 2 * leftTriCount - 1

Since that produces gaps in case 1PPL is not true a simple/fast in-place compaction step should be added afterwards.

Overall this is not a big deal and using atomic counter is simple, so might as well keep it but I just wanted to bring this up. Maybe for other people.

On a side note, there is a potential optimization opportunity I thought of: Because traversing down the left path is more memory friendly, the child with the larger surface area (more likely to be hit) should always be placed left.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions