Commit f23a74a

Initial clustered lights implementation (tiled clustering only) (#16866)
# Remaining issues

- Causes issues with the default physical material falloff. I was considering preventing this in the shader but decided to leave it, so that users who want physical falloff can still use it as long as they adjust the light ranges accordingly (and the default light range is certainly large enough for physical falloff).
- There's no depth clipping (to prevent lights disappearing when too close to or too far from the camera). This also means lights behind the camera will be marked as "contributing" to pixels in front of the camera (their light meshes get flipped). I couldn't find an easy and clean solution to only prevent clipping if the light is within range, and once depth clustering is implemented it will fix this issue (since any lights behind the camera won't end up in any depth slice).

# Implementation Details

When this work is more complete I'll move all this over to the official docs, but for now I'm just using this PR as a rough set of notes/thoughts/todos.

The implementation is mainly inspired by these CoD slides: https://advances.realtimerendering.com/s2017/2017_Sig_Improved_Culling_final.pdf

The main idea is to split the clustering into a separate X/Y "tiled" clustering AND a Z "depth" clustering, and then check whether the light is in both the tile and the depth slice.

## Tiled Clustering

This uses the rendering pipeline to draw a mesh for each light and writes bits out to a light mask texture if there are pixels in front of the depth buffer (effectively finding lights that intersect the depth buffer). The render target with the light mask texture and depth buffer has a resolution equal to the number of tiles.

### Depth pre-pass (removed)

All the scene geometry is rendered to a low-resolution render target. Currently this is repeated per `ClusteredLight`; in future these depth textures might be shared.

### Stencil Pass (removed)

Draws all the light geometry with depthFunction=GREATER and sets stencil bits to 1. This is combined with the next pass to only mark pixels that contain light geometry both behind AND in front of the depth buffer. However, this ended up causing issues due to the low-resolution aliased depth buffer, where stencil bits weren't set for the edges of meshes.

![image](https://github.com/user-attachments/assets/66310a5d-8da1-4fb1-be6b-faae7f7fa8cb)

In future we could maybe get around this by doing a post-process pass over the depth pre-pass to expand the max depth values out by one tile. For now, though, this just adds extra complexity and we probably don't really need this pass.

### Light Mask Pass

Each light mesh is drawn with a different index, and any fragments which pass the depth test (meaning the light is in front of the depth pre-pass) will set the light's bit in the light mask texture with an atomic OR. At least that's the plan on WebGPU, which isn't implemented yet. WebGL doesn't have support for storage textures or atomic operations, so we instead use a float buffer with additive blending. To prevent accuracy issues with floating-point values we ensure the mask doesn't go above the bits in the floating-point fraction, which limits us to 23 lights per `ClusteredLight` on most systems.

### Lighting pass

This light mask is then queried bit-by-bit in the lighting calculation to figure out which lights are in that "tile".

### Light Meshes

All the light meshes are just spheres. To reduce the pixels set for spotlights, they are modified in the vertex shader so that positions on the sphere that are outside the spotlight angle are set to 0,0,0. This causes the sphere to end up with more of a cone shape.

## Depth Clustering

TODO. Not currently implemented; it will be implemented in a different PR. CoD seems to do this on the CPU, which makes sense since it's only a min/max index per slice, so it can be cheaply calculated and uploaded.

## Not Implemented / Broken

- [x] PBR materials
- [x] I have code for point lights but it's untested. I feel like they likely won't work right now
- [x] moving the camera too close to or too far from the lights makes them disappear
  - this is an issue with the camera intersecting with the light geometry; on WebGPU this can be fixed by disabling back-face culling, but on WebGL this will cause more issues due to its additive blending
  - on WebGL we could test if the light mesh intersects with the camera and draw back faces instead
- [x] spot lights that point in any direction other than down
  - to make the light mesh vertex shader simpler we assume a direction of down. This will have to be changed, or (better) we just rotate the world matrix by the light direction
- [x] lights with a default range. Trying to scale the light geometry by `Number.MAX_VALUE` seems to break it
  - maybe we should clamp it?
- [x] WebGPU support
- [x] Using instancing for the light meshes
- [x] make the number of tiles configurable
  - I don't know if this should be an option to specify the number of tiles or to specify the size of each tile
- [ ] figure out what to do for `transferToNodeMaterialEffect`

I don't plan to support these kinds of lights:

- anything that requires textures, like IES textures or shadowed lights
- anything other than point/spot lights
- lights using anything other than the default falloff
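
To make the WebGL path from the "Light Mask Pass" and "Lighting pass" notes concrete, here is a minimal CPU-side sketch of the float mask arithmetic. The names `lightBit`, `blendMask`, and `hasLight` are hypothetical and are not taken from the commit; only the additive-blend idea and the 23-bit limit come from the notes. It assumes each light contributes at most once per tile, which is what drawing only one set of faces per light mesh would give.

```typescript
/** Value a light's fragment would add to the mask texel: 2^index. */
function lightBit(index: number, precisionBits: number): number {
    if (!Number.isInteger(index) || index < 0 || index >= precisionBits) {
        throw new RangeError(`light index ${index} does not fit in ${precisionBits} bits`);
    }
    return 2 ** index;
}

/**
 * Simulates additive blending into a 32-bit float render target.
 * Assumption: every index appears at most once, otherwise the sum
 * would carry into the next bit and mark the wrong light.
 */
function blendMask(indices: number[], precisionBits = 23): number {
    let mask = 0;
    for (const index of indices) {
        // Math.fround rounds to float32, as a float render target would store it.
        mask = Math.fround(mask + lightBit(index, precisionBits));
    }
    return mask;
}

/** Tests one bit with float arithmetic only, as a shader without integer ops could. */
function hasLight(mask: number, index: number): boolean {
    return Math.floor(mask / 2 ** index) % 2 === 1;
}

// Lights 0, 3 and 22 touch this tile.
const tileMask = blendMask([0, 3, 22]);
console.log(tileMask, hasLight(tileMask, 3), hasLight(tileMask, 1));
```

Keeping every index below the number of fraction bits means each partial sum stays an exactly representable integer, so the per-bit query recovers the same set of lights that were blended in.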
1 parent 2bb3d1b commit f23a74a

40 files changed: +1333 -19 lines changed

packages/dev/core/src/Buffers/storageBuffer.ts

Lines changed: 9 additions & 0 deletions
@@ -46,6 +46,15 @@ export class StorageBuffer {
         return this._buffer;
     }
 
+    /**
+     * Clears the storage buffer to zeros
+     * @param byteOffset the byte offset to start clearing (optional)
+     * @param byteLength the byte length to clear (optional)
+     */
+    public clear(byteOffset?: number, byteLength?: number): void {
+        this._engine.clearStorageBuffer(this._buffer, byteOffset, byteLength);
+    }
+
     /**
      * Updates the storage buffer
      * @param data the data used to update the storage buffer

packages/dev/core/src/Engines/engineCapabilities.ts

Lines changed: 4 additions & 0 deletions
@@ -27,6 +27,8 @@ export interface EngineCapabilities {
     maxVertexUniformVectors: number;
     /** Maximum number of uniforms per fragment shader */
     maxFragmentUniformVectors: number;
+    /** The number of bits that can be accurately represented in shader floats */
+    shaderFloatPrecision: number;
     /** Defines if standard derivatives (dx/dy) are supported */
     standardDerivatives: boolean;
     /** Defines if s3tc texture compression is supported */
@@ -80,6 +82,8 @@ export interface EngineCapabilities {
     depthTextureExtension: boolean;
     /** Defines if float color buffer are supported */
     colorBufferFloat: boolean;
+    /** Defines if float color blending is supported */
+    blendFloat: boolean;
     /** Defines if half float color buffer are supported */
     colorBufferHalfFloat?: boolean;
     /** Gets disjoint timer query extension (null if not supported) */
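
A minimal sketch of how a caller might combine the two new capability fields to decide how many lights fit in one float light mask. `ClusterCaps` and `maxLightsPerMask` are hypothetical names, and the rule that both `colorBufferFloat` and `blendFloat` are required is an assumption drawn from the PR notes about additive blending into a float buffer, not something this hunk states.

```typescript
// Hypothetical subset of EngineCapabilities, limited to the fields used here.
interface ClusterCaps {
    shaderFloatPrecision: number;
    colorBufferFloat: boolean;
    blendFloat: boolean;
}

/**
 * Returns how many light bits a single float mask channel could hold,
 * or 0 when the float-blend path is unavailable (assumed requirement).
 */
function maxLightsPerMask(caps: ClusterCaps): number {
    if (!caps.colorBufferFloat || !caps.blendFloat) {
        return 0;
    }
    // One bit per light, limited by the bits a shader float represents exactly.
    return caps.shaderFloatPrecision;
}

console.log(maxLightsPerMask({ shaderFloatPrecision: 23, colorBufferFloat: true, blendFloat: true }));
```

With the values this commit sets for `NativeEngine` and `NullEngine` (`blendFloat: false`), the sketch returns 0, so a caller would need a fallback on those engines.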

packages/dev/core/src/Engines/nativeEngine.ts

Lines changed: 2 additions & 0 deletions
@@ -286,6 +286,7 @@ export class NativeEngine extends Engine {
             maxDrawBuffers: 8,
             maxFragmentUniformVectors: 16,
             maxVertexUniformVectors: 16,
+            shaderFloatPrecision: 23, // TODO: is this correct?
             standardDerivatives: true,
             astc: null,
             pvrtc: null,
@@ -297,6 +298,7 @@ export class NativeEngine extends Engine {
             fragmentDepthSupported: false,
             highPrecisionShaderSupported: true,
             colorBufferFloat: false,
+            blendFloat: false,
             supportFloatTexturesResolve: false,
             rg11b10ufColorRenderable: false,
             textureFloat: true,

packages/dev/core/src/Engines/nullEngine.ts

Lines changed: 2 additions & 0 deletions
@@ -133,6 +133,7 @@ export class NullEngine extends Engine {
             maxVaryingVectors: 16,
             maxFragmentUniformVectors: 16,
             maxVertexUniformVectors: 16,
+            shaderFloatPrecision: 10, // Minimum precision for mediump floats WebGL 1
             standardDerivatives: false,
             astc: null,
             pvrtc: null,
@@ -144,6 +145,7 @@ export class NullEngine extends Engine {
             fragmentDepthSupported: false,
             highPrecisionShaderSupported: true,
             colorBufferFloat: false,
+            blendFloat: false,
             supportFloatTexturesResolve: false,
             rg11b10ufColorRenderable: false,
             textureFloat: false,

packages/dev/core/src/Engines/thinEngine.ts

Lines changed: 15 additions & 0 deletions
@@ -504,6 +504,7 @@ export class ThinEngine extends AbstractEngine {
             maxVaryingVectors: this._gl.getParameter(this._gl.MAX_VARYING_VECTORS),
             maxFragmentUniformVectors: this._gl.getParameter(this._gl.MAX_FRAGMENT_UNIFORM_VECTORS),
             maxVertexUniformVectors: this._gl.getParameter(this._gl.MAX_VERTEX_UNIFORM_VECTORS),
+            shaderFloatPrecision: 0,
             parallelShaderCompile: this._gl.getExtension("KHR_parallel_shader_compile") || undefined,
             standardDerivatives: this._webGLVersion > 1 || this._gl.getExtension("OES_standard_derivatives") !== null,
             maxAnisotropy: 1,
@@ -531,6 +532,7 @@ export class ThinEngine extends AbstractEngine {
             drawBuffersExtension: false,
             maxMSAASamples: 1,
             colorBufferFloat: !!(this._webGLVersion > 1 && this._gl.getExtension("EXT_color_buffer_float")),
+            blendFloat: this._gl.getExtension("EXT_float_blend") !== null,
             supportFloatTexturesResolve: false,
             rg11b10ufColorRenderable: false,
             colorBufferHalfFloat: !!(this._webGLVersion > 1 && this._gl.getExtension("EXT_color_buffer_half_float")),
@@ -732,6 +734,19 @@ export class ThinEngine extends AbstractEngine {
 
             if (vertexhighp && fragmenthighp) {
                 this._caps.highPrecisionShaderSupported = vertexhighp.precision !== 0 && fragmenthighp.precision !== 0;
+                this._caps.shaderFloatPrecision = Math.min(vertexhighp.precision, fragmenthighp.precision);
+            }
+            // This will check both the capability and the `useHighPrecisionFloats` option
+            if (!this._shouldUseHighPrecisionShader) {
+                const vertexmedp = this._gl.getShaderPrecisionFormat(this._gl.VERTEX_SHADER, this._gl.MEDIUM_FLOAT);
+                const fragmentmedp = this._gl.getShaderPrecisionFormat(this._gl.FRAGMENT_SHADER, this._gl.MEDIUM_FLOAT);
+                if (vertexmedp && fragmentmedp) {
+                    this._caps.shaderFloatPrecision = Math.min(vertexmedp.precision, fragmentmedp.precision);
+                }
+            }
+            if (this._caps.shaderFloatPrecision < 10) {
+                // WebGL spec requires mediump precision to atleast be 10
+                this._caps.shaderFloatPrecision = 10;
             }
         }
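
The precision selection in the last `thinEngine.ts` hunk can be read as a small pure function, which may make the fallback order easier to check. This is a sketch under assumptions: `PrecisionPair` and `resolveShaderFloatPrecision` are hypothetical names, and the `useHighPrecision` flag stands in for the engine's `_shouldUseHighPrecisionShader` check rather than reproducing it.

```typescript
// Mirrors the one field of WebGLShaderPrecisionFormat that the hunk reads.
interface PrecisionFormat {
    precision: number;
}

// Hypothetical pairing of the vertex and fragment query results (null if unavailable).
interface PrecisionPair {
    vertex: PrecisionFormat | null;
    fragment: PrecisionFormat | null;
}

function resolveShaderFloatPrecision(high: PrecisionPair, medium: PrecisionPair, useHighPrecision: boolean): number {
    let bits = 0; // same starting value as the capability default in the diff
    if (high.vertex && high.fragment) {
        bits = Math.min(high.vertex.precision, high.fragment.precision);
    }
    if (!useHighPrecision && medium.vertex && medium.fragment) {
        // Shaders will be compiled with mediump, so report mediump precision instead.
        bits = Math.min(medium.vertex.precision, medium.fragment.precision);
    }
    // Clamp to the 10-bit floor the diff applies for mediump floats.
    return Math.max(bits, 10);
}

const highp: PrecisionPair = { vertex: { precision: 23 }, fragment: { precision: 23 } };
const mediump: PrecisionPair = { vertex: { precision: 10 }, fragment: { precision: 10 } };
console.log(resolveShaderFloatPrecision(highp, mediump, true), resolveShaderFloatPrecision(highp, mediump, false));
```

Taking the minimum of the vertex and fragment values keeps the reported precision safe for a value that is written in one stage and read in the other.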

packages/dev/core/src/Engines/webgpuEngine.ts

Lines changed: 12 additions & 0 deletions
@@ -868,6 +868,7 @@ export class WebGPUEngine extends ThinWebGPUEngine {
             maxVaryingVectors: this._deviceLimits.maxInterStageShaderVariables,
             maxFragmentUniformVectors: Math.floor(this._deviceLimits.maxUniformBufferBindingSize / 4),
             maxVertexUniformVectors: Math.floor(this._deviceLimits.maxUniformBufferBindingSize / 4),
+            shaderFloatPrecision: 23, // WGSL always uses IEEE-754 binary32 floats (which have 23 bits of significand)
             standardDerivatives: true,
             astc: (this._deviceEnabledExtensions.indexOf(WebGPUConstants.FeatureName.TextureCompressionASTC) >= 0 ? true : undefined) as any,
             s3tc: (this._deviceEnabledExtensions.indexOf(WebGPUConstants.FeatureName.TextureCompressionBC) >= 0 ? true : undefined) as any,
@@ -880,6 +881,7 @@ export class WebGPUEngine extends ThinWebGPUEngine {
             fragmentDepthSupported: true,
             highPrecisionShaderSupported: true,
             colorBufferFloat: true,
+            blendFloat: this._deviceEnabledExtensions.indexOf(WebGPUConstants.FeatureName.Float32Blendable) >= 0,
             supportFloatTexturesResolve: false, // See https://github.com/gpuweb/gpuweb/issues/3844
             rg11b10ufColorRenderable: this._deviceEnabledExtensions.indexOf(WebGPUConstants.FeatureName.RG11B10UFloatRenderable) >= 0,
             textureFloat: true,
@@ -3939,6 +3941,16 @@ export class WebGPUEngine extends ThinWebGPUEngine {
         return this._createBuffer(data, creationFlags | Constants.BUFFER_CREATIONFLAG_STORAGE, label);
     }
 
+    /**
+     * Clears a storage buffer to zeroes
+     * @param storageBuffer the storage buffer to clear
+     * @param byteOffset the byte offset to start clearing (optional)
+     * @param byteLength the byte length to clear (optional)
+     */
+    public clearStorageBuffer(storageBuffer: DataBuffer, byteOffset?: number, byteLength?: number): void {
+        this._renderEncoder.clearBuffer(storageBuffer.underlyingResource, byteOffset, byteLength);
+    }
+
     /**
      * Updates a storage buffer
      * @param buffer the storage buffer to update
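
`clearStorageBuffer` forwards its optional arguments straight to `GPUCommandEncoder.clearBuffer`, which expects the offset and size to be multiples of 4 and the range to stay inside the buffer. The sketch below is a hypothetical pre-check a caller could run before invoking `clear`; `resolveClearRange` is not part of the commit, and it only models the defaulting and validation rules rather than touching a real GPU buffer.

```typescript
interface ClearRange {
    offset: number;
    size: number;
}

/**
 * Resolves the range clearBuffer would use for the given arguments.
 * Omitted offset defaults to 0; omitted length defaults to the rest of the buffer.
 */
function resolveClearRange(bufferSize: number, byteOffset?: number, byteLength?: number): ClearRange {
    const offset = byteOffset ?? 0;
    const size = byteLength ?? Math.max(0, bufferSize - offset);
    if (offset % 4 !== 0 || size % 4 !== 0) {
        throw new RangeError("clear offset and size must be multiples of 4 bytes");
    }
    if (offset + size > bufferSize) {
        throw new RangeError("clear range exceeds the buffer size");
    }
    return { offset, size };
}

// Clear everything after a 16-byte header in a 64-byte buffer.
console.log(resolveClearRange(64, 16));
```

Checking the range on the CPU gives a readable error at the call site instead of a WebGPU validation error reported later by the device.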
