Optimize EX31 precompiled path tracer build and startup #1

Open
AnastaZIuk wants to merge 15 commits into diff-master from unroll

AnastaZIuk commented Mar 24, 2026

Summary

  • add PATH_TRACER_BUILD_MODE with WALLTIME_OPTIMIZED as the default and SPECIALIZED as the alternate triangle-method layout
  • move EX31 to a packaged precompiled SPIR-V path instead of rebuilding the path-tracer shaders at runtime
  • keep the full EX31 polygon-method feature surface while letting the default build collapse triangle methods into one shared runtime-selected module
  • add runtime-uniform shared proxy plumbing for the wall-time-optimized layout instead of duplicating the same wrapper logic across multiple SPIR-V inputs
  • add persistent pipeline cache support with --pipeline-cache-dir, --clear-pipeline-cache, and a generated path_tracer.runtime.json that resolves a relative pipeline/cache root from the common bin directory and falls back to LocalAppData outside the CMake flow
  • persist prepared-shader validation markers next to the example cache so warm runs stay validated without re-validating the same SPIR-V blob
  • warm all known render and resolve variants in the background after the first submit so later switches stay hot without blocking the UI
  • compact the control panel and expose the effective build, runtime, and cache state directly in the UI
  • pair this with AnastaZIuk/Nabla#2
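
The cache-root lookup described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in this branch: it assumes the precedence is `--pipeline-cache-dir` first, then the relative root from `path_tracer.runtime.json`, then LocalAppData. The struct, the function name, and the `fallbackSubdir` field are hypothetical.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>

namespace fs = std::filesystem;

// Hypothetical bundle of inputs; none of these names come from the PR.
struct CacheRootInputs {
    std::optional<fs::path> cliOverride;         // value of --pipeline-cache-dir, if passed
    std::optional<fs::path> runtimeRelativeRoot; // relative root read from path_tracer.runtime.json, if present
    fs::path commonBinDir;                       // common bin directory of the CMake flow
    fs::path localAppData;                       // LocalAppData location for runs outside the CMake flow
    fs::path fallbackSubdir;                     // assumed per-example subdirectory under LocalAppData
};

// Assumed precedence: CLI override, then runtime JSON, then LocalAppData.
inline fs::path resolveCacheRoot(const CacheRootInputs& in) {
    if (in.cliOverride)
        return in.cliOverride->lexically_normal();
    if (in.runtimeRelativeRoot) {
        assert(!in.commonBinDir.empty()); // a relative root needs a base directory
        return (in.commonBinDir / *in.runtimeRelativeRoot).lexically_normal();
    }
    return (in.localAppData / in.fallbackSubdir).lexically_normal();
}
```

With a relative root of `pipeline/cache` and a bin directory of `bin`, this sketch resolves to `bin/pipeline/cache`; when no runtime JSON is found it lands under the LocalAppData path instead.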

Note on shape

A noticeable part of the current example-side proxy and permutation scaffolding exists because this branch cannot assume Devsh-Graphics-Programming/Nabla#988 is merged. If that PR lands, a large part of this glue can move out of the example and the packaged SPIR-V setup can be reduced materially.

Root cause

The base EX31 path had two separate problems.

First, EX31 started as a runtime-oriented example in eab0f70c and 2f77555ce. Shader selection and compute pipeline creation happened at runtime from the start. That runtime matrix then expanded with persistent workgroups in 153556152 and with RWMC in 3d206fd4. The current line locations in main.cpp come from later refactors, but the semantic shape predates them.

Second, once EX31 is moved to packaged SPIR-V, every startup pays for pipeline creation again unless those packaged variants share a real pipeline cache and the prepared SPIR-V path avoids revalidating the same blob on every run. The base render and resolve compute pipeline creation sites pass nullptr as the pipeline cache in main.cpp#L404-L478. That runtime creation model originates in 2f77555ce and was widened by 153556152 and 3d206fd4.
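
One way to avoid revalidating the same blob is a marker file keyed by a content hash of the SPIR-V bytes. The sketch below is only an illustration under assumptions: FNV-1a stands in for whatever hash the branch actually uses, and the function names and the `.validated` suffix are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// FNV-1a 64-bit content hash; a stand-in for the real hashing scheme.
inline std::uint64_t hashBlob(const std::vector<std::uint8_t>& blob) {
    std::uint64_t h = 14695981039346656037ull; // FNV offset basis
    for (std::uint8_t b : blob) {
        h ^= b;
        h *= 1099511628211ull; // FNV prime
    }
    return h;
}

// Marker file name derived from the blob contents (hypothetical naming).
inline fs::path markerPath(const fs::path& cacheDir, const std::vector<std::uint8_t>& blob) {
    return cacheDir / (std::to_string(hashBlob(blob)) + ".validated");
}

// True when an earlier run already validated an identical blob.
inline bool isAlreadyValidated(const fs::path& cacheDir, const std::vector<std::uint8_t>& blob) {
    return fs::exists(markerPath(cacheDir, blob));
}

// Call only after validation has succeeded; creates an empty marker file.
inline void recordValidated(const fs::path& cacheDir, const std::vector<std::uint8_t>& blob) {
    fs::create_directories(cacheDir);
    std::ofstream marker(markerPath(cacheDir, blob), std::ios::trunc);
    assert(marker.good());
}
```

A warm run would check `isAlreadyValidated` before running the validator and call `recordValidated` only after a successful pass, so a changed blob hashes to a different marker and is validated again.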

Only the triangle has three distinct polygon-method implementations. Specializing those methods into separate precompiled entrypoints adds more than thin wrappers: it multiplies the heavy triangle-side path tracing instantiations and pushes much more work into the DXC/SPIR-V backend.
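
The difference between the two layouts can be illustrated with a toy C++ analogy. This is not the shader code: the enum values, the scale factors, and both function names are placeholders. It only shows why compile-time specialization produces one full copy of the body per method, while runtime selection keeps a single body.

```cpp
#include <cassert>

// Placeholder names; the real polygon methods are not spelled out here.
enum class TriangleMethod : int { Method0 = 0, Method1 = 1, Method2 = 2 };

// Stand-in for the per-method part of the heavy path tracing body.
constexpr float methodScale(TriangleMethod m) {
    switch (m) {
        case TriangleMethod::Method0: return 1.0f;
        case TriangleMethod::Method1: return 2.0f;
        case TriangleMethod::Method2: return 4.0f;
    }
    return 0.0f; // not reached for valid enum values
}

// SPECIALIZED-style: the method is a template parameter, so the compiler
// emits one complete instantiation of the body per method.
template <TriangleMethod M>
float traceSpecialized(float throughput) {
    return throughput * methodScale(M);
}

// WALLTIME_OPTIMIZED-style: the method is a runtime value, so one shared
// body is compiled and the branch is taken at runtime.
inline float traceShared(TriangleMethod m, float throughput) {
    assert(static_cast<int>(m) >= 0 && static_cast<int>(m) <= 2);
    return throughput * methodScale(m);
}
```

Both variants compute the same result; the trade-off is compile work for the specialized form against a runtime branch for the shared form.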

Validation

Validation was run on AMD Ryzen 5 5600G with Radeon Graphics (6C/12T).

A Visual Studio Debug x64 full rebuild of the SPIR-V project completed in:

  • WALLTIME_OPTIMIZED = 12.785 s
  • SPECIALIZED = 18.314 s
  • SPECIALIZED regression: +5.529 s (+43.25% relative to WALLTIME_OPTIMIZED)
  • WALLTIME_OPTIMIZED improvement over SPECIALIZED: 30.19%

SPECIALIZED is materially slower because it multiplies the heavy triangle-side path tracing instantiations and pushes more work into the DXC/SPIR-V backend, so the default is WALLTIME_OPTIMIZED.
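
The two percentages above differ only in their baseline: the regression is measured against the faster build, the improvement against the slower one. A small sketch of that arithmetic (helper names are hypothetical):

```cpp
#include <cassert>

// How much slower the slow build is, relative to the fast build.
inline double regressionPercent(double fasterSeconds, double slowerSeconds) {
    assert(fasterSeconds > 0.0 && slowerSeconds >= fasterSeconds);
    return (slowerSeconds - fasterSeconds) / fasterSeconds * 100.0;
}

// How much time the fast build saves, relative to the slow build.
inline double improvementPercent(double fasterSeconds, double slowerSeconds) {
    assert(fasterSeconds > 0.0 && slowerSeconds >= fasterSeconds);
    return (slowerSeconds - fasterSeconds) / slowerSeconds * 100.0;
}
```

With the measured times, `regressionPercent(12.785, 18.314)` is 5.529 / 12.785, about 43.25%, and `improvementPercent(12.785, 18.314)` is 5.529 / 18.314, about 30.19%.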

Runtime validation on the final state:

  • Release cold clear: first_render_submit_ms=2383
  • Release warm cache hit: loaded_from_disk=1, first_render_submit_ms=1793
  • RelWithDebInfo cold clear: first_render_submit_ms=2245
  • RelWithDebInfo warm cache hit: loaded_from_disk=1, first_render_submit_ms=1598
  • Debug cold clear: first_render_submit_ms=11781
  • Debug warm cache hit: loaded_from_disk=1, first_render_submit_ms=2698
  • background warmup starts immediately after the first submit with queued_jobs=21 and max_parallel=11 on this 6C/12T CPU
  • validated --pipeline-cache-dir <path> and --clear-pipeline-cache
  • validated default no-arg dev runs on Release, RelWithDebInfo, and Debug; the generated path_tracer.runtime.json resolves pipeline/cache relative to the common bin directory
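
The background warmup reported above can be pictured as a bounded worker pool draining a job queue. The sketch below is an assumption-laden illustration, not the branch's implementation: it guesses that max_parallel is the hardware thread count minus one (which matches 11 on a 12-thread CPU but is not confirmed here), and all names are hypothetical.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Assumption: leave one hardware thread free for the main/UI thread.
inline std::size_t pickMaxParallel(unsigned hardwareThreads) {
    return hardwareThreads > 1 ? hardwareThreads - 1 : 1;
}

// Runs every job exactly once using at most maxParallel worker threads.
// This call blocks until all jobs finish, so a real application would
// invoke it from a background thread rather than from the UI thread.
inline void runWarmupJobs(const std::vector<std::function<void()>>& jobs, std::size_t maxParallel) {
    assert(maxParallel > 0);
    std::atomic<std::size_t> next{0}; // index of the next unclaimed job
    const std::size_t workerCount = std::min(maxParallel, jobs.size());
    std::vector<std::thread> workers;
    workers.reserve(workerCount);
    for (std::size_t w = 0; w < workerCount; ++w) {
        workers.emplace_back([&next, &jobs] {
            // Each worker claims jobs until the queue is exhausted.
            for (std::size_t i = next.fetch_add(1); i < jobs.size(); i = next.fetch_add(1))
                jobs[i]();
        });
    }
    for (std::thread& t : workers)
        t.join();
}
```

For the run above that would mean 21 queued pipeline-creation jobs shared among 11 workers, each job being one render or resolve variant.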

AnastaZIuk changed the title from "Implement EX31 walltime-optimized path tracer build" to "Optimize EX31 precompiled path tracer build and startup" on Mar 24, 2026