[TritonIntelGPU] Add Subgroup2DBlockLoadOpConversion for ttig.2d_block_load #6804

Open
whitneywhtsang wants to merge 4 commits into main from whitneywhtsang/Subgroup2DBlockLoadOpConversion

@whitneywhtsang (Contributor) commented May 1, 2026

Add LLVM lowering for ttig.2d_block_load → triton_gen.2Dblockload. The conversion mirrors DescriptorLoadOpToBlockIOConversion but reads decomposed surface parameters (width, height, pitch, offsets) directly from the op instead of unpacking a tensor descriptor struct.
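
To make the contrast in this description concrete, here is a minimal standalone C++ sketch. It is not the MLIR conversion pattern from this PR: `PackedDescriptor`, `Surface2D`, `fromDescriptor`, `fromOperands`, and `sameSurface` are hypothetical names, and the field layout and element-based units are assumptions chosen only for illustration. It shows that both paths should arrive at the same surface parameters, with the descriptor path needing an extra unpacking step.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the surface parameters named in the description.
// Units are elements here; that is an assumption of this sketch.
struct Surface2D {
  std::int64_t width;    // elements per row
  std::int64_t height;   // number of rows
  std::int64_t pitch;    // distance between consecutive rows
  std::int64_t offsetX;  // column offset of the block
  std::int64_t offsetY;  // row offset of the block
};

// Hypothetical packed descriptor: shape and strides stored as arrays.
struct PackedDescriptor {
  std::int64_t shape[2];    // {height, width}
  std::int64_t strides[2];  // {pitch, 1}
};

// Descriptor-style path: parameters are unpacked from the struct fields.
inline Surface2D fromDescriptor(const PackedDescriptor &desc,
                                std::int64_t offsetX, std::int64_t offsetY) {
  assert(desc.strides[1] == 1 && "sketch only models a unit inner stride");
  return Surface2D{desc.shape[1], desc.shape[0], desc.strides[0],
                   offsetX, offsetY};
}

// Decomposed path: each parameter arrives as its own operand, so there is
// nothing to unpack.
inline Surface2D fromOperands(std::int64_t width, std::int64_t height,
                              std::int64_t pitch, std::int64_t offsetX,
                              std::int64_t offsetY) {
  return Surface2D{width, height, pitch, offsetX, offsetY};
}

// Field-by-field comparison of two surfaces.
inline bool sameSurface(const Surface2D &a, const Surface2D &b) {
  return a.width == b.width && a.height == b.height && a.pitch == b.pitch &&
         a.offsetX == b.offsetX && a.offsetY == b.offsetY;
}
```

For example, under these assumptions `fromDescriptor({{64, 32}, {40, 1}}, 8, 4)` and `fromOperands(32, 64, 40, 8, 4)` describe the same surface.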

@whitneywhtsang force-pushed the whitneywhtsang/Subgroup2DBlockLoadOpConversion branch from 4ee80de to 7fb3426 on May 1, 2026 04:01
@whitneywhtsang force-pushed the whitneywhtsang/LowerTo2DBlockLoad_descriptor branch from 9e3cbf6 to 5a34d54 on May 1, 2026 13:44
@whitneywhtsang force-pushed the whitneywhtsang/Subgroup2DBlockLoadOpConversion branch from 7fb3426 to a2b36e6 on May 1, 2026 13:51
@whitneywhtsang force-pushed the whitneywhtsang/LowerTo2DBlockLoad_descriptor branch 2 times, most recently from 81877c2 to f75e245 on May 1, 2026 20:15
@whitneywhtsang force-pushed the whitneywhtsang/Subgroup2DBlockLoadOpConversion branch from a2b36e6 to 356731b on May 1, 2026 20:15
Base automatically changed from whitneywhtsang/LowerTo2DBlockLoad_descriptor to main on May 1, 2026 22:58
@whitneywhtsang force-pushed the whitneywhtsang/Subgroup2DBlockLoadOpConversion branch from 356731b to b904a45 on May 1, 2026 23:02
Add rank-reducing (3D descriptor → 2D result) and same-rank batch (3D
descriptor → 3D result) test cases to descriptor-load.mlir to cover the
batch index folding into tt.addptr.

Signed-off-by: Whitney Tsang <whitney.tsang@intel.com>
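
The batch index folding mentioned in this commit message can be illustrated with plain address arithmetic. The sketch below is standalone C++, not code from the test file or the pass. It assumes that "folding" means adding `batchIndex * batchStride` to the base pointer so that the remaining access is an ordinary 2D one. `View2D`, `foldBatch`, `linear3D`, and `linear2D` are hypothetical names, and the unit inner stride is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 2D view left over once the batch dimension has been folded
// into the base offset.
struct View2D {
  std::int64_t baseOffset;  // element offset added to the base pointer
  std::int64_t pitch;       // row stride in elements
  std::int64_t y;           // row index within the 2D slice
  std::int64_t x;           // column index within the 2D slice
};

// Fold the batch index into the base offset (the role the commit message
// gives to tt.addptr), leaving a 2D access.
inline View2D foldBatch(std::int64_t batchIndex, std::int64_t batchStride,
                        std::int64_t rowStride, std::int64_t y,
                        std::int64_t x) {
  assert(batchIndex >= 0 && "sketch only models non-negative batch indices");
  return View2D{batchIndex * batchStride, rowStride, y, x};
}

// Linear element offset of a 3D access with a unit inner stride.
inline std::int64_t linear3D(std::int64_t b, std::int64_t y, std::int64_t x,
                             std::int64_t batchStride,
                             std::int64_t rowStride) {
  return b * batchStride + y * rowStride + x;
}

// Linear element offset of the folded 2D access.
inline std::int64_t linear2D(const View2D &v) {
  return v.baseOffset + v.y * v.pitch + v.x;
}
```

With a batch stride of 128 and a row stride of 16, the access at batch 2, row 3, column 5 resolves to element offset 2·128 + 3·16 + 5 = 309 through either function, which is the equivalence that rank-reducing and same-rank batch tests would be expected to preserve.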
@whitneywhtsang force-pushed the whitneywhtsang/Subgroup2DBlockLoadOpConversion branch from b904a45 to 653db35 on May 1, 2026 23:29
…k_load

Add LLVM lowering for ttig.2d_block_load → triton_gen.2Dblockload. The
conversion mirrors DescriptorLoadOpToBlockIOConversion but reads
decomposed surface parameters (width, height, pitch, offsets) directly
from the op instead of unpacking a tensor descriptor struct.

Verified to produce the same triton_gen.2Dblockload output as the old
DescriptorLoadOpToBlockIOConversion for all 4 dot operand × memory
layout combinations (dot A/B × row/column major) and for NaN padding.

Signed-off-by: Whitney Tsang <whitney.tsang@intel.com>
@whitneywhtsang force-pushed the whitneywhtsang/Subgroup2DBlockLoadOpConversion branch from 653db35 to 00ac019 on May 1, 2026 23:42
Signed-off-by: Whitney Tsang <whitney.tsang@intel.com>
Descriptor loads eligible for 2D block I/O are now converted to
ttig.2d_block_load by the LowerTo2DBlockLoad TTGIR pass and lowered
by Subgroup2DBlockLoadOpConversion.

Signed-off-by: Whitney Tsang <whitney.tsang@intel.com>