[CodeGen] Improve scf.for bufferization and make hoisting allocation work #23318

hanhanW · 2026-01-29T01:49:50Z

This revision enables `allowReturnAllocsFromLoops` in bufferization, which matches the upstream behavior. Without it, bufferization can fail with an error like:

error: Yield operand #1 is not equivalent to the corresponding iter bbArg

In this context, a `memref.alloca` can be created inside the loop, with its dynamic size queried from the iter_arg. The `ValueBoundsConstraintSet` check does not support this analysis, because the yielded value and the iter_arg can have the same type while their runtime dimension values differ. For example:

%result = scf.for ... iter_args(%iter = %init) -> (memref<?xf32>) {
  %new_buf = memref.alloca(%some_other_size) : memref<?xf32>
  scf.yield %new_buf : memref<?xf32>  // same type, different runtime size
}

This is unusual, but it is allowed. Thus, we need to handle such cases in `hoistOneStaticallyBoundAllocation`.

The revision verifies that the dimension is preserved, via:

1. The yield operand (after walking through cast/subview) is the iter_arg.
2. The yield operand traces to an alloca whose shape matches the iter_arg and whose dynamic size at `dimIndex` is `memref.dim` of the iter_arg.
3. The yield operand is a `scf.for` result whose init arg is the iter_arg and the inner loop also preserves the dimension (recursive).
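The three conditions above can be sketched with a toy model. This is only an illustration under stated assumptions, not the code in this PR: the classes (`IterArg`, `View`, `DimOf`, `Alloca`, `ForLoop`, `ForResult`) and the function `yield_preserves_dim` are hypothetical stand-ins for MLIR values and ops, and `View` simply forwards to its source to mimic "walking through cast/subview".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# A memref shape; None marks a dynamic dimension (the `?` in memref<?xf32>).
Shape = Tuple[Optional[int], ...]


@dataclass(eq=False)
class IterArg:
    """Hypothetical stand-in for a loop-carried block argument."""
    shape: Shape


@dataclass(eq=False)
class View:
    """Hypothetical stand-in for a cast/subview that the check walks through."""
    source: object


@dataclass(eq=False)
class DimOf:
    """Hypothetical stand-in for `memref.dim %source, index`."""
    source: object
    index: int


@dataclass(eq=False)
class Alloca:
    """Hypothetical stand-in for memref.alloca; maps dim index -> size value."""
    shape: Shape
    dynamic_sizes: Dict[int, object] = field(default_factory=dict)


@dataclass(eq=False)
class ForLoop:
    """Hypothetical stand-in for scf.for with parallel init/iter/yield lists."""
    init_args: List[object]
    iter_args: List[IterArg]
    yields: List[object] = field(default_factory=list)


@dataclass(eq=False)
class ForResult:
    """The `index`-th result of `loop`."""
    loop: ForLoop
    index: int


def strip_views(value):
    """Walk through cast/subview-like wrappers to the underlying value."""
    while isinstance(value, View):
        value = value.source
    return value


def yield_preserves_dim(yielded, iter_arg: IterArg, dim_index: int) -> bool:
    value = strip_views(yielded)
    # Condition 1: the yield operand is the iter_arg itself.
    if value is iter_arg:
        return True
    # Condition 2: an alloca with a matching shape whose dynamic size at
    # dim_index is memref.dim of the iter_arg.
    if isinstance(value, Alloca):
        size = value.dynamic_sizes.get(dim_index)
        return (
            value.shape == iter_arg.shape
            and isinstance(size, DimOf)
            and size.source is iter_arg
            and size.index == dim_index
        )
    # Condition 3: a result of an inner loop whose init arg is the iter_arg,
    # and whose own yield preserves the dimension (recursive).
    if isinstance(value, ForResult):
        inner = value.loop
        if inner.init_args[value.index] is not iter_arg:
            return False
        return yield_preserves_dim(
            inner.yields[value.index], inner.iter_args[value.index], dim_index
        )
    return False


# Usage: the `%some_other_size` case from the description is rejected,
# while an alloca sized by memref.dim of the iter_arg is accepted.
it = IterArg((None,))
other_size = object()  # an unrelated runtime size
assert not yield_preserves_dim(Alloca((None,), {0: other_size}), it, 0)
assert yield_preserves_dim(Alloca((None,), {0: DimOf(it, 0)}), it, 0)
```

Anything that does not match one of the three patterns is conservatively treated as not preserving the dimension in this sketch.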

Fixes #16956

ci-extra: test_torch

hanhanW · 2026-01-29T01:53:15Z

I don't expect this to impact performance. It could enable some currently failing tests, though.

amd-eochoalo · 2026-01-29T13:54:06Z

@hanhanW do you know what's up with the linux_x64_bazel tests' compilation failure?

hanhanW · 2026-01-29T18:16:17Z

> @hanhanW do you know what's up with the linux_x64_bazel tests' compilation failure?

It is just missing a dep in BUILD.bazel.

MaheshRavishankar

Hmmm, I think this is going the opposite of what the end state should be. I am not sure we want to support cases where we end up with local allocas. It almost always indicates something off in my view.

hanhanW · 2026-01-30T01:37:58Z

> Hmmm, I think this is going the opposite of what the end state should be. I am not sure we want to support cases where we end up with local allocas. It almost always indicates something off in my view.

I thought we allowed small local allocas as long as they are statically bounded, which has been the case for years?

hanhanW requested review from MaheshRavishankar, Max191 and qedawkins as code owners January 29, 2026 01:49

hanhanW mentioned this pull request Jan 29, 2026

[CPU] Support dynamic attention by tiling K1 when needed. #23304

Merged

hanhanW force-pushed the users/hanhanW/improve-bufferization-issue-16956 branch from 739f413 to 02e5678 Compare January 29, 2026 01:55

hanhanW mentioned this pull request Jan 30, 2026

Compiler crash in LLVMCPUSelectLoweringStrategy with dynamic-shape iree_linalg_ext.attention #23277

Open

MaheshRavishankar reviewed Jan 30, 2026

hanhanW requested a review from MaheshRavishankar January 30, 2026 19:06
