How does Triton reason about whether to issue vectorized loads/stores or not? #7162
Unanswered
Sh0g0-1758
asked this question in
Q&A
Replies: 0 comments
Hey! I am working on porting some kernels from CUDA to Triton, and I noticed that Triton is not emitting vectorized loads/stores. I understand that my starting base and mask are not multiples of 4, and hence Triton can't reason about the outliers, but I think that for some specific blocks a mask is not really needed. I am hoping to write a compiler pass that can give the compiler a hint about this. Can someone please help me figure out how to do that?
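To make the question concrete, here is a rough sketch of the kind of reasoning I believe is involved. It is a toy model in plain Python, not Triton's actual implementation. The helper names (`largest_pow2_divisor`, `model_vector_width`) are hypothetical, and the rule they encode is an assumption on my part: the vector width is limited by the 128-bit access size, by the contiguity and alignment of the offsets, and by how many consecutive elements are guaranteed to share the same mask value. The model works in element units and uses concrete numbers, whereas the compiler can only use what it can prove statically, for example from hints such as `tl.multiple_of` and `tl.max_contiguous`.

```python
def largest_pow2_divisor(n: int, cap: int = 1 << 30) -> int:
    """Largest power of two that divides n (0 is divisible by anything, so return cap)."""
    if n == 0:
        return cap
    return min(n & -n, cap)


def model_vector_width(base_offset_elems: int,
                       block_size: int,
                       n_elements: int,
                       elem_bits: int = 32,
                       masked: bool = True) -> int:
    """Toy estimate of elements per vectorized access for a contiguous 1-D block.

    Models offsets = base_offset_elems + arange(0, block_size) and, when
    masked, mask = offsets < n_elements. Assumes block_size is a power of two.
    """
    # Assumption: the widest single global memory access is 128 bits.
    max_vec = 128 // elem_bits

    # The offsets form one contiguous run; its start is aligned to the
    # largest power of two dividing the base offset.
    contiguity = block_size
    divisibility = largest_pow2_divisor(base_offset_elems)
    vec = min(max_vec, contiguity, divisibility)

    if masked:
        # The mask is only known to be uniform over aligned groups whose
        # size divides both the block start and the bound n_elements.
        mask_alignment = min(largest_pow2_divisor(base_offset_elems),
                             largest_pow2_divisor(n_elements))
        vec = min(vec, mask_alignment)
    return vec


if __name__ == "__main__":
    # Aligned start, bound is a multiple of 4: full-width access in this model.
    print(model_vector_width(1024, 1024, 4096))
    # Same block, but the bound is odd: the mask forces scalar accesses.
    print(model_vector_width(1024, 1024, 4099))
    # Interior block with the mask dropped: full width again.
    print(model_vector_width(1024, 1024, 4099, masked=False))
```

In this model, the case I care about is the last one: an interior block that is known to be fully in bounds would vectorize if the mask could be dropped or marked as uniform, even though the overall bound is not a multiple of 4.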