-
Notifications
You must be signed in to change notification settings - Fork 68
Description
When building a top-down BVH in a depth first manner the left of the left child comes after the right child:
/ \
left-> 42 43 <-right
/ \ / \
left of left-> 44 45 321 322
This means whenever we traverse down left (half of the time) we just have to read the next nodes in the array which may be in some cache already.
However the current multithreaded builds don't have this nice ordering because an atomic counter is used:
Line 2534 in 4b5b649
| if (threadedBuild) n = atomicNewNodePtr->fetch_add( 2 ); else n = newNodePtr, newNodePtr += 2; |
So we may end up with something like this instead:
/ \
left-> 42 43 <-right
/ \ / \
left of left-> 612 613 266 267
Depth first vs "atomic order" makes a small difference in my GPU PathTracer: 187fps vs 182fps.
The fix is to remove atomic counter and compute deterministic node offsets assuming 1PPL (primitive per leaf):
leftOfLeft = right + 1
leftOfRight = right + 2 * leftTriCount - 1Since that produces gaps in case 1PPL is not true a simple/fast in-place compaction step should be added afterwards.
Overall this is not a big deal and using atomic counter is simple, so might as well keep it but I just wanted to bring this up. Maybe for other people.
On a side note, there is a potential optimization opportunity I thought of: Because traversing down the left path is more memory friendly, the child with the larger surface area (more likely to be hit) should always be placed left.