Commit ee2edf3
[ROCm][CK][Inductor] enable gfx950 for max autotune with CK (pytorch#159195)
+ update inductor config for new gfx arch
+ fixes in codegen for conv2d and ck-tile matmul
+ use appropriate fp8 dtypes
+ test cleanup
Pull Request resolved: pytorch#159195
Approved by: https://github.com/chenyang781
Parent: 51eb41a
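Note on "use appropriate fp8 dtypes": the sketch below is not code from this commit. It illustrates one way the fp8 dtype choice could depend on the gfx arch, assuming gfx942 uses the "fnuz" fp8 variants and gfx950 uses the OCP variants. The helper name `fp8_dtype_names` and the `FNUZ_FP8_ARCHS` set are hypothetical; the returned strings are names of existing `torch` dtype attributes.

```python
# Hypothetical helper for illustration only; not taken from the commit.
# Assumption: gfx942 uses the "fnuz" fp8 formats, gfx950 uses the OCP formats.
FNUZ_FP8_ARCHS = {"gfx942"}


def fp8_dtype_names(gfx_arch: str) -> tuple[str, str]:
    """Return (e4m3 name, e5m2 name) as torch dtype attribute names.

    This does not check whether the arch supports fp8 at all; it only
    picks between the fnuz and OCP naming for a given arch string.
    """
    # ROCm arch strings may carry feature suffixes, e.g. "gfx942:sramecc+:xnack-".
    base = gfx_arch.split(":")[0]
    if base in FNUZ_FP8_ARCHS:
        return ("float8_e4m3fnuz", "float8_e5m2fnuz")
    return ("float8_e4m3fn", "float8_e5m2")


if __name__ == "__main__":
    for arch in ("gfx942:sramecc+:xnack-", "gfx950"):
        print(arch, fp8_dtype_names(arch))
```

With PyTorch installed, the names can be resolved with `getattr(torch, name)`.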
File tree
7 files changed, +168 -134 lines changed
- test/inductor
- torch/_inductor
- codegen/rocm