WebGPU Support#713

Draft
gkjohnson wants to merge 72 commits into main from webgpu-pathtracer

Conversation

@gkjohnson (Owner) commented Feb 4, 2026

cc @TheBlek

I've branched from #705 and changed things around quite a bit: I addressed the storage buffer limitations by using storage textures, and organized the kernels into dedicated classes, so canvas resizing etc. all works. I have also separated the "MegaKernel" from the "PathTracerCore" so it will be easier to follow the differences and dependencies between the implementations.

Next I'm going to look into some of the ideas around a ray queue we'd discussed previously. Then we can try some timing to see how things pan out.

[image attachment]

Relatedly, this write up will be interesting for a wave front path tracer:

https://developer.blender.org/docs/features/cycles/kernel_scheduling/

TODO

  • Fix creating new kernels every loop, causing GC issues
  • Improve flashing (ensure at least one full pass is complete)
  • Adjust queue sizes based on needs
  • Add "PathTracerBackend"
Plans
  • Add variance detection
  • Add "completion" detection
  • Add scene bvh + geometry utility
  • Design WebGPUPathtracer API
  • Add exports to package.json
  • Add "debug" views (sample count, completion visualization, etc)

TheBlek and others added 30 commits October 16, 2025 20:07
@TheBlek mentioned this pull request Feb 5, 2026
@gkjohnson (Owner, Author) commented:

@TheBlek - I'm going to call this "done" as a first pass for now. There are some workarounds for three.js issues, marked with TODOs, but it's working fairly well. One of the features I like most is how scalable it is - we can reduce the number of rays processed per frame based on framerate, and the page can remain responsive since the whole 7+ bounce path doesn't need to finish in a single pass. Curious to hear your thoughts.
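As an illustration of that scalability, the per-frame ray budget could be adjusted from the measured frame time. This is a hypothetical sketch, not code from the branch; the function name, target, and clamp constants are all assumptions:

```javascript
// Scale the number of rays dispatched next frame toward a target
// frame time, clamped so the budget never collapses or explodes.
// All names and constants here are illustrative.
function updateRayBudget( budget, frameTimeMs, targetMs = 16, min = 1024, max = 1 << 20 ) {

	const scaled = Math.round( budget * ( targetMs / frameTimeMs ) );
	return Math.min( max, Math.max( min, scaled ) );

}
```

A slow frame (32 ms) would halve a 10,000-ray budget to 5,000, while a fast frame (8 ms) would double it, keeping the page responsive without stalling convergence.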

The overall approach works like so:

  1. Iterate over all pixels in a tiled format and push rays to trace onto a ring-buffer work queue. We only iterate over a tile if there's enough space in the queue to add rays for all pixels in the tile (even though in practice we may skip some). Pixels whose rays have been added to the queue are marked as "active" to avoid adding multiple rays for the same pixel. We also issue a compute call for every tile but use indirect dispatch buffers to "cancel" unneeded generation once the queue has become full.

  2. Trace rays in the work queue against the BVH. If there is no hit then accumulate the color in the final target buffer, increment the sample count, and mark the pixel as "inactive". If it does hit, add it to the "hitQueue". Then increment the ray queue ring buffer's head pointer.

  3. Process the hits. If we have reached the maximum bounce count then terminate the ray, mark the pixel as inactive, and increment the sample count. Otherwise, add a scatter ray back to the ray queue. Then go back to step 1 to "top up" the queue with rays for any inactive pixels and start again.
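The ring-buffer bookkeeping in steps 1-3 can be modeled on the CPU like so. This is a simplified sketch with hypothetical names (`RayQueue`, `tryPushTile`, etc.) to show the invariants - it is not the actual WGSL kernel logic:

```javascript
// CPU-side model of the ring-buffer ray queue. A tile is only
// enqueued if every one of its pixels could fit, even though
// already-active pixels are skipped, matching step 1 above.
class RayQueue {

	constructor( capacity ) {

		this.capacity = capacity;
		this.head = 0;	// next ray to trace (step 2 advances this)
		this.tail = 0;	// next free slot for generated rays
		this.count = 0;

	}

	freeSpace() {

		return this.capacity - this.count;

	}

	tryPushTile( pixels, active ) {

		// require room for the whole tile before touching it
		if ( this.freeSpace() < pixels.length ) return false;

		for ( const p of pixels ) {

			// one in-flight ray per pixel: skip already-active pixels
			if ( active.has( p ) ) continue;
			active.add( p );
			this.tail = ( this.tail + 1 ) % this.capacity;
			this.count ++;

		}

		return true;

	}

	pop() {

		if ( this.count === 0 ) return false;
		this.head = ( this.head + 1 ) % this.capacity;
		this.count --;
		return true;

	}

}
```

On the GPU the same "whole tile must fit" check is what drives the indirect dispatch cancellation: generation work for a tile that cannot fit is zeroed out rather than branched around.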

--

A few things that need to be considered or added to aid with performance at some point:

  • Add support for a maximum sample count to prevent adding and processing rays for pixels that have already "finished".

  • We'll want some method for detecting that, at a minimum, X samples across the image have finished so that we can determine when the result is ready to show and avoid displaying a partially-finished render. Probably a simple compute pass that checks all pixels and writes a flag to a storage buffer we can read back, indicating whether any pixel has not yet passed the threshold.

  • Adding some kind of "convergence detection" using a minimum sample count and tracking variance of the samples. This will let pixels be marked as "completed" early on if it converges early (diffuse surfaces, unlit surfaces, background, etc) so we can skip rays for these cases and focus on pixels that need more rays and samples to converge.

  • Related to the above point: we'll eventually reach a state where only a few hundred pixels or fewer are left to process, at which point it would be best to dispatch multiple rays per pixel. That means handling the race condition of multiple rays writing to the same pixel, probably with a dedicated kernel that resolves the competing writes.
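The "convergence detection" idea above could be built on an online variance estimate such as Welford's algorithm. A minimal sketch, assuming per-pixel luminance samples; `minSamples` and `varianceThreshold` are made-up illustrative values, not tuned numbers:

```javascript
// Welford's online mean/variance per pixel. A pixel is considered
// "converged" once it has a minimum number of samples and its
// sample variance falls below a threshold, so further rays for it
// can be skipped. Names and thresholds are assumptions.
function makePixelStats() {

	return { n: 0, mean: 0, m2: 0 };

}

function addSample( s, luminance ) {

	s.n ++;
	const delta = luminance - s.mean;
	s.mean += delta / s.n;
	s.m2 += delta * ( luminance - s.mean );

}

function isConverged( s, minSamples = 16, varianceThreshold = 1e-4 ) {

	if ( s.n < minSamples ) return false;
	const variance = s.m2 / ( s.n - 1 );
	return variance < varianceThreshold;

}
```

A flat background pixel would converge almost immediately under this test, while a noisy caustic would keep accumulating samples - exactly the "focus rays where they're needed" behavior described above.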

--

I'll wait to see where you're going before putting too much more work into this path tracing logic specifically. I may look at some of the other points I mentioned in #705 (comment) when I have time.
