Merged
19 changes: 19 additions & 0 deletions release-content/release-notes/meshlet-bvh-culling.md
@@ -0,0 +1,19 @@
---
title: Meshlet BVH culling
authors: ["@SparkyPotato", "@atlv24"]
pull_requests: [19318]
---

(TODO: Embed example screenshot here)

Bevy's virtual geometry has been greatly optimized with BVH-based culling, leading to almost true scene-complexity invariance on the GPU.
Contributor

Suggested change
- Bevy's virtual geometry has been greatly optimized with BVH-based culling, leading to almost true scene-complexity invariance on the GPU.
+ Bevy's virtual geometry has been greatly optimized with BVH-based culling, making the cost of rendering nearly independent of scene geometry. (Then write something here about 120k vs 1 million instances).


This gets rid of the previous cluster limit that limited the world to 2^24 clusters (about 4 billion triangles).
Contributor

Suggested change
- This gets rid of the previous cluster limit that limited the world to 2^24 clusters (about 4 billion triangles).
+ This also gets rid of the previous cluster limit that limited the world to 2^24 clusters (about 4 billion triangles).

There are now *no* hardcoded limits to scene size, only unique instance limits due to VRAM usage (since streaming is not yet implemented),
and total instance limits due to the current architecture requiring all instances to be uploaded to the GPU every frame.
Contributor

Suggested change
- There are now *no* hardcoded limits to scene size, only unique instance limits due to VRAM usage (since streaming is not yet implemented),
- and total instance limits due to the current architecture requiring all instances to be uploaded to the GPU every frame.
+ There are now *no* hardcoded limits to scene size. In practice you will only be limited by asset VRAM usage (since streaming is not yet implemented),
+ and total instance count due to the current code requiring all instances to be re-uploaded to the GPU every frame.


The screenshot above has 130,000 dragons in the scene, each with about 870,000 triangles, leading to over *115 billion* total triangles in the scene.
However, this still runs at 60 fps on an RTX 4070 at 1440p, with most of the time being due to the instance upload CPU bottleneck mentioned above (taking 14 ms of CPU time).
Contributor

I don't think this sentence would mean much to people not very familiar with the rendering code.


Speaking of GPU cost, the scene above renders in about 3.5 ms on the 4070, with ~3.1 ms being spent on the geometry render and ~0.4 ms on the material evaluation.
Contributor

Suggested change
- Speaking of GPU cost, the scene above renders in about 3.5 ms on the 4070, with ~3.1 ms being spent on the geometry render and ~0.4 ms on the material evaluation.
+ Speaking of GPU cost, the scene above renders in about 3.5 ms on a 4070, with ~3.1 ms being spent on the geometry render and ~0.4 ms on the material evaluation.

Contributor Author

I was referring to the 4070 + 1440p combo mentioned above

Increasing the instance count to over 1 million (almost *900 billion triangles*!), the total increases to about 4.5 ms, with ~4.1 ms spent on the geometry render and the material evaluation remaining constant at ~0.4 ms.
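
For reference, the figures quoted above can be sanity-checked with a few lines of Rust. This is only a sketch and not part of the PR: the helper names `total_triangles` and `growth` are made up for illustration, and the inputs are the rounded values from the text (with "over 1 million" taken as exactly 1,000,000), so the totals are approximate. With those rounded inputs the smaller scene comes out at roughly 113 billion triangles, slightly under the quoted "over 115 billion", which suggests the actual counts are a little higher than the rounded ones.

```rust
// Back-of-the-envelope check of the figures quoted in the release note.
// All inputs are the rounded values from the text, so results are approximate.

/// Total triangles in a scene made of identical instances.
fn total_triangles(instances: u64, triangles_per_instance: u64) -> u64 {
    instances * triangles_per_instance
}

/// Ratio `after / before`; used for both instance counts and GPU times.
fn growth(before: f64, after: f64) -> f64 {
    after / before
}

fn main() {
    let tris_per_dragon: u64 = 870_000; // "about 870,000" per the text

    let small = total_triangles(130_000, tris_per_dragon);
    // Assumption: "over 1 million" instances treated as exactly 1,000,000.
    let large = total_triangles(1_000_000, tris_per_dragon);

    println!("small scene: ~{} billion triangles", small / 1_000_000_000);
    println!("large scene: ~{} billion triangles", large / 1_000_000_000);

    // Compare how much the workload grew with how much the GPU time grew.
    println!("instances grew {:.1}x", growth(130_000.0, 1_000_000.0));
    println!("GPU time grew {:.2}x", growth(3.5, 4.5));
}
```

The comparison this makes visible: the instance count grows by roughly 7.7x while the reported GPU time grows by under 1.3x.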