Narrow blanket SPIR-V legalization work in optimizer recipes #6612

Open
AnastaZIuk wants to merge 5 commits into KhronosGroup:main from Devsh-Graphics-Programming:unroll

@AnastaZIuk AnastaZIuk commented Mar 20, 2026

Summary

  • add SSARewriteMode
  • add RegisterLegalizationPasses(bool preserve_interface, bool include_loop_unroll, SSARewriteMode ssa_rewrite_mode)
  • make legalization-time SSA rewrite conditional on ssa_rewrite_mode
  • make legalization-time full loop unroll conditional on include_loop_unroll
  • keep the default performance recipe from always materializing LoopControl::Unroll
  • narrow default performance SSA rewrite to SSARewriteMode::SpecialTypes
  • replace the default performance recipe's global redundancy elimination with local redundancy elimination
  • remove blanket multidimensional-array legalization from the generic legalization tail
  • preserve legal OpImageTexelPointer image operands in LocalSingleStoreElim
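
The gating described in the list above can be sketched as a small self-contained model. This is not the SPIRV-Tools implementation: `RecipeSketch`, the pass-name strings, the free-function form, and the `None` and `All` enumerators are assumptions made for illustration. Only `SSARewriteMode::SpecialTypes` and the three parameters come from the summary.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Only SpecialTypes is named in the PR summary; None and All are assumed here.
enum class SSARewriteMode { None, SpecialTypes, All };

// Hypothetical stand-in for the optimizer: records pass names in order.
struct RecipeSketch {
  std::vector<std::string> passes;

  RecipeSketch& RegisterPass(std::string name) {
    passes.push_back(std::move(name));
    return *this;
  }

  bool Has(const std::string& name) const {
    return std::find(passes.begin(), passes.end(), name) != passes.end();
  }
};

// Same parameter list as in the summary, written as a free function over the
// stand-in recipe. The pass names are placeholders, not real pass identifiers.
RecipeSketch& RegisterLegalizationPasses(RecipeSketch& recipe,
                                         bool preserve_interface,
                                         bool include_loop_unroll,
                                         SSARewriteMode ssa_rewrite_mode) {
  recipe.RegisterPass("inline-exhaustive");

  // SSA rewrite is registered only when the mode asks for it.
  switch (ssa_rewrite_mode) {
    case SSARewriteMode::None:
      break;
    case SSARewriteMode::SpecialTypes:
      recipe.RegisterPass("ssa-rewrite:special-types");
      break;
    case SSARewriteMode::All:
      recipe.RegisterPass("ssa-rewrite:all");
      break;
  }

  // Full loop unroll is opt-in instead of unconditional.
  if (include_loop_unroll) {
    recipe.RegisterPass("loop-unroll:full");
  }

  recipe.RegisterPass(preserve_interface ? "dead-code-elim:preserve-interface"
                                         : "dead-code-elim");
  return recipe;
}
```

Under these assumptions, a narrowed recipe would call `RegisterLegalizationPasses(recipe, false, false, SSARewriteMode::SpecialTypes)` and end up without the full-unroll pass.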

Root cause

The current SPIR-V optimizer recipes still carry old blanket unroll decisions that inflate the module and then pay for expensive cleanup over that self-inflated IR.

LoopControl::Unroll as an IR hint is not the problem. The expensive part is treating that hint as a blanket request to immediately materialize full unroll in the generic optimizer path even when legality does not require it.

The same pattern existed in the generic SSA cleanup path. The hard legality constraints are narrow and concentrated around special cases such as opaque or resource-like objects, but the old recipe was still paying for broader cleanup over generic IR.

DXC can still provide targeted producer-side signals for the narrower correctness-sensitive cases in microsoft/DirectXShaderCompiler#8283.

The narrower recipe also exposed one existing image-atomic cleanup dependency. In that path local single-store elimination could rewrite through copied image values and leave OpImageTexelPointer with a non-pointer image operand. This branch now fixes that directly instead of restoring blanket cleanup.

Spec basis

The core SPIR-V specification is direct here:

Unroll: Performance hint. Strong request to unroll or unwind this loop.

DontUnroll: Performance hint. Strong request to keep this loop as a loop, without unrolling.

Spec:

Khronos guidance on offline SPIR-V transforms is also direct:

general loop unwinding or unrolling

should be avoided in off-line transforms of SPIR-V meant to be portable across devices.

Such controls should be respected by target devices.

Whitepaper:
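
The split between keeping the hint and materializing the unroll can be sketched as follows. The two bit values are the SPIR-V Loop Control mask values for Unroll and DontUnroll. `UnrollDecision`, `DecideUnroll`, and the `legality_requires_unroll` flag are hypothetical names for illustration and do not appear in this branch.

```cpp
#include <cstdint>

// Loop Control mask bits as defined by the SPIR-V specification.
constexpr std::uint32_t kLoopControlUnroll = 0x1;
constexpr std::uint32_t kLoopControlDontUnroll = 0x2;

// Hypothetical result type: whether the offline optimizer expands the loop,
// and which Loop Control mask remains on the loop when it does not.
struct UnrollDecision {
  bool materialize_full_unroll;
  std::uint32_t remaining_loop_control;
};

// Assumed policy: the hint alone never triggers an offline unroll. The loop is
// expanded only when legality needs it; otherwise the mask is passed through
// untouched so the target device can act on it.
inline UnrollDecision DecideUnroll(std::uint32_t loop_control,
                                   bool legality_requires_unroll) {
  if (legality_requires_unroll) {
    // The loop no longer exists after a full unroll, so no mask remains.
    return UnrollDecision{true, 0};
  }
  return UnrollDecision{false, loop_control};
}
```

With this policy a loop tagged `Unroll` that has no legality constraint stays a loop and keeps its hint, which matches the guidance that such controls are for target devices to respect.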

The same spec split also matters for SSA cleanup. The hard legality rules are narrow:

Image, sampler, and sampled image objects must not appear as operands to OpPhi instructions, or OpSelect instructions, or any instructions other than the image or sampler instructions specified to operate on them.

All OpSampledImage instructions must be in the same block in which their Result <id> are consumed.

Spec:
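
A minimal sketch of a predicate that singles out the types these rules talk about, using a toy type model. `Kind`, `Type`, and `ContainsOpaqueType` are hypothetical, and whether the branch's `SSARewriteMode::SpecialTypes` uses exactly this criterion is an assumption. The sketch relies on C++17 for a `std::vector` of the enclosing type.

```cpp
#include <vector>

// Toy type model; a real optimizer has many more kinds than these.
enum class Kind { Scalar, Image, Sampler, SampledImage, Array, Struct };

struct Type {
  Kind kind;
  // Element type for Array (one entry), field types for Struct, else empty.
  std::vector<Type> members;
};

// Returns true when the type is an image, sampler, or sampled image, or when
// it is an aggregate that contains one. Values of such types must not reach
// OpPhi or OpSelect, so variables holding them are the ones a narrowed SSA
// rewrite would still have to handle.
inline bool ContainsOpaqueType(const Type& type) {
  switch (type.kind) {
    case Kind::Image:
    case Kind::Sampler:
    case Kind::SampledImage:
      return true;
    case Kind::Array:
    case Kind::Struct:
      for (const Type& member : type.members) {
        if (ContainsOpaqueType(member)) return true;
      }
      return false;
    case Kind::Scalar:
      return false;
  }
  return false;
}
```

A struct with only scalar fields falls outside the predicate, which is the generic IR the old recipe was still cleaning up.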

And core SPIR-V is explicit that generic storage-class reasoning does not automatically apply to intermediate SSA values:

Intermediate values do not form a storage class, and unless stated otherwise, storage class-based restrictions are not restrictions on intermediate objects and their types.

Spec:

For the image-atomic follow-up, OpImageTexelPointer is also explicit:

Image must have a type of OpTypePointer with Type OpTypeImage.

Spec:
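
The operand rule quoted above can be expressed as a small check, together with a guard a forwarding rewrite could consult before replacing an operand. This is a sketch over a toy type model: `TypeKind`, `TypeInfo`, and both function names are hypothetical and are not the code added to LocalSingleStoreElim.

```cpp
// Toy type model with just enough detail for the OpImageTexelPointer rule.
enum class TypeKind { Image, Pointer, Other };

struct TypeInfo {
  TypeKind kind;
  TypeKind pointee;  // Only meaningful when kind == TypeKind::Pointer.
};

// The Image operand of OpImageTexelPointer must be a pointer to an image type.
inline bool IsLegalTexelPointerImageOperand(const TypeInfo& type) {
  return type.kind == TypeKind::Pointer && type.pointee == TypeKind::Image;
}

// Assumed guard: a rewrite may swap in the replacement unless the user is an
// OpImageTexelPointer and the replacement would no longer be a pointer to an
// image, for example a loaded or copied image value.
inline bool MayReplaceImageOperand(bool user_is_image_texel_pointer,
                                   const TypeInfo& replacement_type) {
  if (!user_is_image_texel_pointer) return true;
  return IsLegalTexelPointerImageOperand(replacement_type);
}
```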

Benchmark

Validation

  • a fresh full local CodeGenSPIRV run on the companion DXC branch finishes with 1438 expected passes, 2 expected failures, and 0 unexpected failures

Companion DXC PR:
microsoft/DirectXShaderCompiler#8283

CLAassistant commented Mar 20, 2026

CLA assistant check
All committers have signed the CLA.

@AnastaZIuk AnastaZIuk marked this pull request as ready for review March 20, 2026 18:18
@AnastaZIuk AnastaZIuk changed the title from "Narrow blanket SPIR-V loop unroll in optimizer recipes" to "Narrow blanket SPIR-V legalization work in optimizer recipes" Mar 20, 2026
Collaborator

@s-perron s-perron left a comment

I have responded on the corresponding DXC PR: microsoft/DirectXShaderCompiler#8283 (review).

Comment on lines 133 to 134
// Make sure uses and definitions are in the same function.
.RegisterPass(CreateInlineExhaustivePass())

What's the purpose here?

Collaborator

Inlining enables many other optimizations. We do not implement interprocedural optimizations. If you are going to copy-propagate something that is written in one function and used in another, the two functions have to be inlined.

Oh, I just hacked in an NBL_REF_ARG by expanding vk::ext_reference on regular function inout parameters so we have fewer copies, but yes, that makes sense.
