From fcf2d02e661bb17ee2dd2d16bad1587e96c23eea Mon Sep 17 00:00:00 2001
From: SparkyPotato
Date: Tue, 12 Aug 2025 00:54:33 +0100
Subject: [PATCH 1/8] release notes

---
 .../release-notes/meshlet-bvh-culling.md | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
 create mode 100644 release-content/release-notes/meshlet-bvh-culling.md

diff --git a/release-content/release-notes/meshlet-bvh-culling.md b/release-content/release-notes/meshlet-bvh-culling.md
new file mode 100644
index 0000000000000..d5c323004d402
--- /dev/null
+++ b/release-content/release-notes/meshlet-bvh-culling.md
@@ -0,0 +1,17 @@
+---
+title: Meshlet BVH culling
+authors: ["@SparkyPotato", "@atlv24"]
+pull_requests: [19318]
+---
+
+Bevy's virtual geometry has been greatly optimized with BVH-based culling, leading to almost true scene-complexity invariance on the GPU.
+
+This gets rid of the previous cluster limit that limited the world to 2^24 clusters (about 4 billion triangles).
+There are now *no* hardcoded limits to scene size, only unique instance limits due to VRAM usage (since streaming is not yet implemented),
+and total instance limits due the current architecture requiring all instances to be uploaded to the GPU every frame.
+
+The screenshot above has 130,000 dragons in the scene, each with about 870,000 triangles, leading to over *115 billion* total triangles in the scene.
+However, this still runs at 60 fps on an RTX 4070 at 1440p, with most of the time being due to the instance upload CPU bottleneck mentioned above (taking 14 ms of CPU time).
+
+Speaking of GPU cost, the scene above renders in about 3.5 ms on the 4070, with ~3.1 ms being spent on the geometry render and ~0.4 ms on the material evaluation.
+Increasing the instance count to over 1 million (almost *900 billion triangles*!), the total increase to about 4.5 ms, with ~4.1 ms on geometry render and material evaluation remaining constant at ~0.4 ms.
From 8f3e2a56b4a3e466a515763c09d0237eb6c8e759 Mon Sep 17 00:00:00 2001
From: SparkyPotato
Date: Tue, 12 Aug 2025 01:03:35 +0100
Subject: [PATCH 2/8] fix

---
 release-content/release-notes/meshlet-bvh-culling.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/release-content/release-notes/meshlet-bvh-culling.md b/release-content/release-notes/meshlet-bvh-culling.md
index d5c323004d402..a9a91427d457e 100644
--- a/release-content/release-notes/meshlet-bvh-culling.md
+++ b/release-content/release-notes/meshlet-bvh-culling.md
@@ -4,10 +4,12 @@ authors: ["@SparkyPotato", "@atlv24"]
 pull_requests: [19318]
 ---

+(TODO: Embed example screenshot here)
+
 Bevy's virtual geometry has been greatly optimized with BVH-based culling, leading to almost true scene-complexity invariance on the GPU.

-This gets rid of the previous cluster limit that limited the world to 2^24 clusters (about 4 billion triangles).
-There are now *no* hardcoded limits to scene size, only unique instance limits due to VRAM usage (since streaming is not yet implemented),
+This gets rid of the previous cluster limit that limited the world to 2^24 clusters (about 4 billion triangles).
+There are now *no* hardcoded limits to scene size, only unique instance limits due to VRAM usage (since streaming is not yet implemented),
 and total instance limits due the current architecture requiring all instances to be uploaded to the GPU every frame.

 The screenshot above has 130,000 dragons in the scene, each with about 870,000 triangles, leading to over *115 billion* total triangles in the scene.
From a379b428038b44398534df1a0f9cc2e1d97558f6 Mon Sep 17 00:00:00 2001
From: Shaye Garg <64652557+SparkyPotato@users.noreply.github.com>
Date: Tue, 12 Aug 2025 01:08:53 +0100
Subject: [PATCH 3/8] fix

Co-authored-by: atlv
---
 release-content/release-notes/meshlet-bvh-culling.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/release-content/release-notes/meshlet-bvh-culling.md b/release-content/release-notes/meshlet-bvh-culling.md
index a9a91427d457e..dec36d8109cd0 100644
--- a/release-content/release-notes/meshlet-bvh-culling.md
+++ b/release-content/release-notes/meshlet-bvh-culling.md
@@ -16,4 +16,4 @@ The screenshot above has 130,000 dragons in the scene, each with about 870,000 t
 However, this still runs at 60 fps on an RTX 4070 at 1440p, with most of the time being due to the instance upload CPU bottleneck mentioned above (taking 14 ms of CPU time).

 Speaking of GPU cost, the scene above renders in about 3.5 ms on the 4070, with ~3.1 ms being spent on the geometry render and ~0.4 ms on the material evaluation.
-Increasing the instance count to over 1 million (almost *900 billion triangles*!), the total increase to about 4.5 ms, with ~4.1 ms on geometry render and material evaluation remaining constant at ~0.4 ms.
+After increasing the instance count to over 1 million (almost *900 billion triangles*!), the GPU time increases to about 4.5 ms, with ~4.1 ms on geometry render and material evaluation remaining constant at ~0.4 ms.
From ca90f06ebf0d74203dae06c326bb53e3e97a496b Mon Sep 17 00:00:00 2001
From: Shaye Garg <64652557+SparkyPotato@users.noreply.github.com>
Date: Tue, 12 Aug 2025 01:10:49 +0100
Subject: [PATCH 4/8] rename

Co-authored-by: JMS55 <47158642+JMS55@users.noreply.github.com>
---
 release-content/release-notes/meshlet-bvh-culling.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/release-content/release-notes/meshlet-bvh-culling.md b/release-content/release-notes/meshlet-bvh-culling.md
index dec36d8109cd0..5f6418362aa56 100644
--- a/release-content/release-notes/meshlet-bvh-culling.md
+++ b/release-content/release-notes/meshlet-bvh-culling.md
@@ -1,5 +1,5 @@
 ---
-title: Meshlet BVH culling
+title: Virtual Geometry BVH culling
 authors: ["@SparkyPotato", "@atlv24"]
 pull_requests: [19318]
 ---
From af037c0e14b5c753a17744c7ec997c7a61ec51bc Mon Sep 17 00:00:00 2001
From: SparkyPotato
Date: Tue, 12 Aug 2025 21:59:27 +0100
Subject: [PATCH 5/8] add comparison

---
 .../release-notes/meshlet-bvh-culling.md | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/release-content/release-notes/meshlet-bvh-culling.md b/release-content/release-notes/meshlet-bvh-culling.md
index a9a91427d457e..7b700e942fe0c 100644
--- a/release-content/release-notes/meshlet-bvh-culling.md
+++ b/release-content/release-notes/meshlet-bvh-culling.md
@@ -6,14 +6,18 @@ pull_requests: [19318]

 (TODO: Embed example screenshot here)

-Bevy's virtual geometry has been greatly optimized with BVH-based culling, leading to almost true scene-complexity invariance on the GPU.
+Bevy's virtual geometry has been greatly optimized with BVH-based culling, making the cost of rendering nearly independent of scene geometry.
+Comparing the sample scene show above with 130k dragon instances to one with over 1 million instances, total GPU rendering time only increases by 30%.

-This gets rid of the previous cluster limit that limited the world to 2^24 clusters (about 4 billion triangles).
-There are now *no* hardcoded limits to scene size, only unique instance limits due to VRAM usage (since streaming is not yet implemented),
-and total instance limits due the current architecture requiring all instances to be uploaded to the GPU every frame.
+This also gets rid of the previous cluster limit that limited the world to 2^24 clusters (about 4 billion triangles).
+There are now *no* hardcoded limits to scene size. In practice you will only be limited by asset VRAM usage (since streaming is not yet implemented),
+and total instance count due the current code requiring all instances to be re-uploaded to the GPU every frame.

 The screenshot above has 130,000 dragons in the scene, each with about 870,000 triangles, leading to over *115 billion* total triangles in the scene.
-However, this still runs at 60 fps on an RTX 4070 at 1440p, with most of the time being due to the instance upload CPU bottleneck mentioned above (taking 14 ms of CPU time).
+However, this still runs at 60 fps on an RTX 4070 at 1440p, but only due to the instance re-upload mentioned above.

-Speaking of GPU cost, the scene above renders in about 3.5 ms on the 4070, with ~3.1 ms being spent on the geometry render and ~0.4 ms on the material evaluation.
-Increasing the instance count to over 1 million (almost *900 billion triangles*!), the total increase to about 4.5 ms, with ~4.1 ms on geometry render and material evaluation remaining constant at ~0.4 ms.
+Speaking of concrete GPU cost, the scene above renders in about 3.5 ms on the 4070, with \~3.1 ms being spent on the geometry render and \~0.4 ms on the material evaluation.
+Increasing the instance count to over 1 million (almost *900 billion triangles*!), the total increase to about 4.5 ms, with \~4.1 ms on geometry render and material evaluation remaining constant at ~0.4 ms.
+This is a 30% increase in GPU time for an almost 8x increase in scene complexity.
+
+Comparing GPU times to 0.16 on a much smaller scene with 1,300 instances, previously the full render took 2.2 ms, whereas now it is 1.3 ms.
From f485290d54a22fb430dd3775cc2e319499485c56 Mon Sep 17 00:00:00 2001
From: SparkyPotato
Date: Tue, 12 Aug 2025 22:12:45 +0100
Subject: [PATCH 6/8] fix

---
 release-content/release-notes/meshlet-bvh-culling.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/release-content/release-notes/meshlet-bvh-culling.md b/release-content/release-notes/meshlet-bvh-culling.md
index a6127bf47c726..ba2f650ddb3c3 100644
--- a/release-content/release-notes/meshlet-bvh-culling.md
+++ b/release-content/release-notes/meshlet-bvh-culling.md
@@ -6,7 +6,7 @@ pull_requests: [19318]

 (TODO: Embed example screenshot here)

-Bevy's virtual geometry has been greatly optimized with BVH-based culling, making the cost of rendering nearly independent of scene geometry.
+Bevy's virtual geometry has been greatly optimized with BVH-based culling, making the cost of rendering nearly independent of scene geometry.
 Comparing the sample scene show above with 130k dragon instances to one with over 1 million instances, total GPU rendering time only increases by 30%.

 This also gets rid of the previous cluster limit that limited the world to 2^24 clusters (about 4 billion triangles).
From ddd799f3240acbe87c858b8a4287404eaebd1b66 Mon Sep 17 00:00:00 2001
From: SparkyPotato
Date: Tue, 12 Aug 2025 23:21:03 +0100
Subject: [PATCH 7/8] remove needless framerate

---
 release-content/release-notes/meshlet-bvh-culling.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/release-content/release-notes/meshlet-bvh-culling.md b/release-content/release-notes/meshlet-bvh-culling.md
index ba2f650ddb3c3..a0306ac25c6aa 100644
--- a/release-content/release-notes/meshlet-bvh-culling.md
+++ b/release-content/release-notes/meshlet-bvh-culling.md
@@ -14,7 +14,6 @@ There are now *no* hardcoded limits to scene size. In practice you will only be
 and total instance count due the current code requiring all instances to be re-uploaded to the GPU every frame.

 The screenshot above has 130,000 dragons in the scene, each with about 870,000 triangles, leading to over *115 billion* total triangles in the scene.
-However, this still runs at 60 fps on an RTX 4070 at 1440p, but only due to the instance re-upload mentioned above.

 Speaking of concrete GPU cost, the scene above renders in about 3.5 ms on the 4070, with \~3.1 ms being spent on the geometry render and \~0.4 ms on the material evaluation.
 After increasing the instance count to over 1 million (almost *900 billion triangles*!), the total increase to about 4.5 ms, with \~4.1 ms on geometry render and material evaluation remaining constant at ~0.4 ms.
From 1dbaef2c8988e082dfe3bdb378f8871b6f07fa19 Mon Sep 17 00:00:00 2001
From: Alice Cecile
Date: Thu, 14 Aug 2025 13:51:03 -0400
Subject: [PATCH 8/8] Copyedits

Co-authored-by: atlv
---
 release-content/release-notes/meshlet-bvh-culling.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/release-content/release-notes/meshlet-bvh-culling.md b/release-content/release-notes/meshlet-bvh-culling.md
index a0306ac25c6aa..7a3cc6adbca76 100644
--- a/release-content/release-notes/meshlet-bvh-culling.md
+++ b/release-content/release-notes/meshlet-bvh-culling.md
@@ -7,7 +7,7 @@ pull_requests: [19318]
 (TODO: Embed example screenshot here)

 Bevy's virtual geometry has been greatly optimized with BVH-based culling, making the cost of rendering nearly independent of scene geometry.
-Comparing the sample scene show above with 130k dragon instances to one with over 1 million instances, total GPU rendering time only increases by 30%.
+Comparing the sample scene shown above with 130k dragon instances to one with over 1 million instances, total GPU rendering time only increases by 30%.

 This also gets rid of the previous cluster limit that limited the world to 2^24 clusters (about 4 billion triangles).
 There are now *no* hardcoded limits to scene size. In practice you will only be limited by asset VRAM usage (since streaming is not yet implemented),
@@ -16,7 +16,7 @@ and total instance count due the current code requiring all instances to be re-u
 The screenshot above has 130,000 dragons in the scene, each with about 870,000 triangles, leading to over *115 billion* total triangles in the scene.

 Speaking of concrete GPU cost, the scene above renders in about 3.5 ms on the 4070, with \~3.1 ms being spent on the geometry render and \~0.4 ms on the material evaluation.
-After increasing the instance count to over 1 million (almost *900 billion triangles*!), the total increase to about 4.5 ms, with \~4.1 ms on geometry render and material evaluation remaining constant at ~0.4 ms.
+After increasing the instance count to over 1 million (almost *900 billion triangles*!), the total increases to about 4.5 ms, with \~4.1 ms on geometry render and material evaluation remaining constant at ~0.4 ms.
 This is a 30% increase in GPU time for an almost 8x increase in scene complexity.

 Comparing GPU times to 0.16 on a much smaller scene with 1,300 instances, previously the full render took 2.2 ms, whereas now it is 1.3 ms.
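
Reviewer aside: the headline figures in the final release note can be sanity-checked with a little arithmetic. The sketch below is illustrative only. It uses the rounded numbers quoted in the note (130,000 and 1,000,000 instances, about 870,000 triangles per dragon, and the quoted GPU times), and the function names are hypothetical helpers, not part of Bevy.

```python
# Sanity check of the figures quoted in the release note.
# All inputs are the rounded values from the note; helper names are hypothetical.

def total_triangles(instances: int, triangles_per_instance: int) -> int:
    """Total triangle count for a scene of identical instances."""
    return instances * triangles_per_instance


def relative_change(before: float, after: float) -> float:
    """Fractional change from `before` to `after` (0.3 means +30%)."""
    return (after - before) / before


small_scene = total_triangles(130_000, 870_000)
large_scene = total_triangles(1_000_000, 870_000)

# GPU time for the 130k-instance scene vs. the 1M-instance scene (ms).
gpu_growth = relative_change(3.5, 4.5)
# How much larger the big scene is, measured in triangles.
scene_growth = large_scene / small_scene
# 1,300-instance scene: 2.2 ms on 0.16 vs. 1.3 ms now.
change_vs_0_16 = relative_change(2.2, 1.3)

print(f"small scene: {small_scene / 1e9:.1f} billion triangles")
print(f"large scene: {large_scene / 1e9:.1f} billion triangles")
print(f"GPU time growth: {gpu_growth:.1%}")
print(f"scene growth: {scene_growth:.2f}x")
print(f"change vs 0.16: {change_vs_0_16:.1%}")
```

With these rounded inputs the small scene comes out at about 113 billion triangles, slightly under the quoted "over 115 billion", presumably because the exact instance and triangle counts are a little higher than the rounded ones. The other figures line up with the note: about 870 billion triangles for the large scene, roughly a 29% increase in GPU time for a 7.7x larger scene, and roughly 41% less GPU time than 0.16 on the small 1,300-instance scene.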