How does Triton reason about whether to issue vectorized loads/stores or not? #7162

Sh0g0-1758 · 2025-06-12T05:53:53Z

Hey! I am porting some kernels from CUDA to Triton and I noticed that Triton is not emitting vectorized loads/stores. I understand that my starting base and mask are not multiples of 4, and hence Triton can't reason about the outliers, but I think that for some specific blocks a mask is not really needed. I am hoping to write a compiler pass that can give the compiler a hint about this. Can someone please help me figure out how to do that?
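To make the question concrete, here is a rough Python model of the kind of reasoning described above. It is a simplified sketch, not Triton's actual implementation. It assumes the vector width is limited by the largest power of two that divides both the starting element offset and the mask bound, capped at 16 bytes per access. The names `largest_pow2_divisor` and `vector_width` are hypothetical and exist only for illustration. For comparison, `triton.language` does expose `tl.multiple_of` and `tl.max_contiguous` as hints that a kernel author can attach to values, which may be enough in some cases without a custom pass.

```python
# Simplified model (an assumption, not Triton's real code) of how a vector
# width could be derived from alignment facts about the pointer and the mask.


def largest_pow2_divisor(n: int, cap: int) -> int:
    """Return the largest power of two <= cap that divides n.

    cap is assumed to be a power of two; n == 0 is divisible by everything.
    """
    width = cap
    while width > 1 and n % width != 0:
        width //= 2
    return width


def vector_width(base_offset: int, n_valid: int,
                 elem_bytes: int = 4, max_bytes: int = 16) -> int:
    """Elements per access for a contiguous load guarded by `offs < n_valid`.

    base_offset: element index where the block starts (hypothetical input).
    n_valid:     the mask bound, i.e. number of valid elements.
    """
    cap = max_bytes // elem_bytes  # e.g. 4 x float32 in a 128-bit access
    ptr_width = largest_pow2_divisor(base_offset, cap)
    # The mask is uniform across a group of w lanes only if the bound is
    # also a multiple of w (given an aligned start).
    mask_width = largest_pow2_divisor(n_valid, cap)
    return min(ptr_width, mask_width)


if __name__ == "__main__":
    for base, n in [(0, 1024), (0, 1023), (2, 1024), (3, 1024)]:
        print(f"base={base:4d} n={n:5d} -> width {vector_width(base, n)}")
```

In this model, an unaligned base or an odd mask bound drops the width to 1 for the whole kernel, even though only the boundary block actually needs the mask. That is the behavior the question is about.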

Replies: 0 comments
