@nhat-nguyen
Contributor

Previously, before the fallback was moved to the PtrDialect, the structured-to-memref pass had to convert loop iter-args of triton pointer type to unranked memref. This conversion ensured that all types coming out of triton-shared were MLIR built-in types, which allowed the CPU backend to lower the IR to LLVM correctly. In practice, however, structured ops do not need the loop iter-args, because ptr-analysis generates load/store ops that use the kernel arguments directly as their source; the conversion is therefore mostly unnecessary.

With the introduction of the fallback using the PtrDialect (triton-to-ptr), we also convert loop iter-args of triton pointer type to the PtrDialect's ptr type. Combined with the conversion to unranked memref above, this leaves us with unrealized_conversion_cast ops that convert back and forth between the two types whenever a triton program mixes structured and unstructured accesses in loops.
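The back-and-forth casting described above can be illustrated with a small toy model. This is only a sketch, not triton-shared code: `MEMREF`, `PTR`, and `casts_per_iteration` are hypothetical names, and the type strings are placeholders rather than real MLIR types. It assumes a loop-carried value of one type whose body consumes and yields another type, and lists the casts that mismatch would require.

```python
# Toy model only: the type names are placeholder strings, not real MLIR types.
MEMREF = "unranked-memref"
PTR = "ptr"


def casts_per_iteration(carried_type, body_type):
    """List the (from, to) casts needed when a loop carries a value as
    `carried_type` but the loop body consumes and produces `body_type`."""
    # The value enters the body, is used there, then is yielded back.
    chain = [carried_type, body_type, carried_type]
    return [(src, dst) for src, dst in zip(chain, chain[1:]) if src != dst]


# Hypothetical scenario: one conversion made the iter-arg a memref,
# while the other wants the ptr type inside the loop body.
mixed = casts_per_iteration(MEMREF, PTR)

# Hypothetical scenario: the iter-arg and the body agree on one type.
uniform = casts_per_iteration(PTR, PTR)

print(mixed)
print(uniform)
```

In this model, a type mismatch yields one cast in each direction, while agreeing on a single type yields none.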

To solve this issue, we:

@nhat-nguyen nhat-nguyen marked this pull request as ready for review June 18, 2025 18:57
@nhat-nguyen nhat-nguyen merged commit e1ca305 into main Jun 19, 2025
3 checks passed
@nhat-nguyen nhat-nguyen deleted the nhat/cleanup branch June 19, 2025 19:16