Optimize EX31 precompiled path tracer build and startup #1
Open
AnastaZIuk wants to merge 15 commits into diff-master from
Summary
- `PATH_TRACER_BUILD_MODE` with `WALLTIME_OPTIMIZED` as the default and `SPECIALIZED` as the alternate triangle-method layout
- `--pipeline-cache-dir`, `--clear-pipeline-cache`, and a generated `path_tracer.runtime.json` that resolves a relative `pipeline/cache` root from the common bin directory and falls back to `LocalAppData` outside the CMake flow

Note on shape
A noticeable part of the current example-side proxy and permutation scaffolding exists because this branch cannot assume Devsh-Graphics-Programming/Nabla#988 is merged. If that PR lands, a large part of this glue can move out of the example and the packaged SPIR-V setup can be reduced materially.
Root cause
The base EX31 path had two separate problems.
First, EX31 started as a runtime-oriented example in `eab0f70c` and `2f77555ce`. Shader selection and compute pipeline creation lived in runtime from the start. That runtime matrix then expanded with persistent workgroups in `153556152` and with RWMC in `3d206fd4`. The current line locations in `main.cpp` come from later refactors, but the semantic shape predates them.

Second, once EX31 is moved to packaged SPIR-V, startup repays pipeline creation unless those packaged variants share a real pipeline cache and the prepared SPIR-V path avoids revalidating the same blob every run. The base render and resolve compute pipeline creation sites pass a `nullptr` cache in `main.cpp#L404-L478`. That runtime creation model originates in `2f77555ce` and was widened by `153556152` and `3d206fd4`.

Only triangle has three distinct polygon-method implementations. Specializing those methods into separate precompiled entrypoints does not add only thin wrappers. It multiplies heavy triangle-side path tracing instantiations and pushes much more work into the DXC/SPIR-V backend.
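The description does not say how the prepared SPIR-V path avoids revalidating a blob it has already seen. One possible way to get that behaviour, shown here purely as a sketch, is to key each blob by a content hash and persist the set of hashes that already passed validation. `ValidatedBlobIndex`, its methods, and the index file format are hypothetical names introduced for illustration; they are not Nabla or EX31 APIs, and FNV-1a is simply a convenient stand-in hash.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// FNV-1a 64-bit hash over raw bytes, used as a content key for a SPIR-V blob.
inline std::uint64_t fnv1a64(const std::uint8_t* data, std::size_t size)
{
    std::uint64_t h = 0xcbf29ce484222325ull;   // FNV-1a 64-bit offset basis
    for (std::size_t i = 0; i < size; ++i) {
        h ^= data[i];
        h *= 0x100000001b3ull;                 // FNV-1a 64-bit prime
    }
    return h;
}

// Hypothetical bookkeeping (an assumption, not the PR's code): remembers which
// blobs already passed validation so later runs can skip the validator.
class ValidatedBlobIndex {
public:
    // Loads previously recorded keys from a plain text file, one key per line.
    explicit ValidatedBlobIndex(std::string indexFile) : m_file(std::move(indexFile))
    {
        std::ifstream in(m_file);
        std::uint64_t key = 0;
        while (in >> key)
            m_keys.insert(key);
    }

    // True if a blob with identical content was recorded as validated.
    bool isValidated(const std::vector<std::uint8_t>& blob) const
    {
        return m_keys.count(fnv1a64(blob.data(), blob.size())) != 0;
    }

    // Records the blob and appends its key to the index file if it is new.
    void markValidated(const std::vector<std::uint8_t>& blob)
    {
        const std::uint64_t key = fnv1a64(blob.data(), blob.size());
        if (m_keys.insert(key).second) {
            std::ofstream out(m_file, std::ios::app);
            out << key << '\n';
        }
    }

private:
    std::string m_file;
    std::unordered_set<std::uint64_t> m_keys;
};
```

A caller would run the real validator only when `isValidated` returns false and call `markValidated` after it succeeds. A 64-bit non-cryptographic hash can collide, so a production version would likely want a stronger digest or an additional size check.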
Validation
Validation was run on AMD Ryzen 5 5600G with Radeon Graphics (6C/12T).
A Visual Studio `Debug x64` full rebuild of the SPIR-V project completed in:

- `WALLTIME_OPTIMIZED` = 12.785 s
- `SPECIALIZED` = 18.314 s
- `SPECIALIZED` regression: +5.529 s, which is +43.25%
- `WALLTIME_OPTIMIZED` improvement over `SPECIALIZED`: 30.19%

`SPECIALIZED` is materially slower because it multiplies the heavy triangle-side path tracing instantiations and pushes more work into the DXC/SPIR-V backend, so the default is `WALLTIME_OPTIMIZED`.

Runtime validation on the final state:

- `Release` cold clear: `first_render_submit_ms=2383`
- `Release` warm cache hit: `loaded_from_disk=1`, `first_render_submit_ms=1793`
- `RelWithDebInfo` cold clear: `first_render_submit_ms=2245`
- `RelWithDebInfo` warm cache hit: `loaded_from_disk=1`, `first_render_submit_ms=1598`
- `Debug` cold clear: `first_render_submit_ms=11781`
- `Debug` warm cache hit: `loaded_from_disk=1`, `first_render_submit_ms=2698`
- `queued_jobs=21` and `max_parallel=11` on this 6C/12T CPU
- `--pipeline-cache-dir <path>` and `--clear-pipeline-cache`
- `Release`, `RelWithDebInfo`, and `Debug`; the generated `path_tracer.runtime.json` resolves `pipeline/cache` relative to the common bin directory
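The cache directory behaviour described in this PR (an explicit `--pipeline-cache-dir`, otherwise a relative `pipeline/cache` root under the common bin directory, otherwise a `LocalAppData` fallback, plus `--clear-pipeline-cache`) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this branch: `resolvePipelineCacheDir` and `prepareCacheDir` are hypothetical helper names, the precedence order is an assumption, and the fallback root is passed in by the caller rather than read from the environment.

```cpp
#include <filesystem>
#include <optional>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (assumption, not the PR's code): picks the cache directory.
// Assumed priority: CLI override, then the relative root resolved against the
// common bin directory, then a per-user fallback root.
inline fs::path resolvePipelineCacheDir(
    const std::optional<fs::path>& cliOverride,   // value of --pipeline-cache-dir, if given
    const std::optional<fs::path>& commonBinDir,  // known only inside the CMake flow
    const fs::path& relativeRoot,                 // e.g. "pipeline/cache" from the runtime JSON
    const fs::path& userFallbackRoot)             // e.g. the LocalAppData directory
{
    if (cliOverride && !cliOverride->empty())
        return cliOverride->lexically_normal();
    if (commonBinDir && !commonBinDir->empty())
        return (*commonBinDir / relativeRoot).lexically_normal();
    return (userFallbackRoot / relativeRoot).lexically_normal();
}

// Hypothetical helper: optionally wipes the directory (the effect assumed for
// --clear-pipeline-cache) and makes sure it exists afterwards.
inline void prepareCacheDir(const fs::path& dir, bool clear)
{
    std::error_code ec;  // errors are ignored in this sketch
    if (clear)
        fs::remove_all(dir, ec);
    fs::create_directories(dir, ec);
}
```

Keeping the resolution a pure function of its inputs makes the three cases easy to check separately from any file system side effects, which stay confined to `prepareCacheDir`.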