@dnikolaev-amd

This PR fixes:

  • test_matmul_cuda.py::TestFP8MatmulCudaCUDA::test_float8_basics_cuda - AssertionError: RuntimeError not raised
  • test_matmul_cuda.py::TestFP8MatmulCudaCUDA::test_scaled_mm_vs_emulated_row_wise_bfloat16_cuda - AssertionError: Tensor-likes are not close!

When use_rowwise is set, the A_SCALE and B_SCALE descriptor data need to be swapped, as is done for HIPBLASLT_VEC_EXT.
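
For readers unfamiliar with why a scale swap can matter, here is a minimal pure-Python sketch. It is an illustration only and does not reproduce the code in CUDABlas.cpp. It assumes the backend computes the transposed product (C^T = B^T @ A^T), as column-major BLAS wrappers commonly do; whether that is the exact mechanism behind this fix is an assumption. All helper names (`matmul`, `transpose`, `rowwise_scaled_mm`) are hypothetical.

```python
def matmul(x, y):
    """Plain dense matrix product of two lists of lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def rowwise_scaled_mm(first, second, first_scale, second_scale):
    """Hypothetical emulation: out[i][j] = first_scale[i] * second_scale[j] * (first @ second)[i][j]."""
    prod = matmul(first, second)
    return [[first_scale[i] * second_scale[j] * prod[i][j]
             for j in range(len(prod[0]))]
            for i in range(len(prod))]


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
scale_a = [2.0, 4.0]    # one scale per row of a
scale_b = [0.5, 0.25]   # one scale per column of b

# What the caller expects from a row-wise scaled matmul.
reference = rowwise_scaled_mm(a, b, scale_a, scale_b)

# Assumed backend behaviour: the operands trade places (C^T = B^T @ A^T),
# so the scales have to trade places as well.
swapped = transpose(
    rowwise_scaled_mm(transpose(b), transpose(a), scale_b, scale_a))

# Same call with the scales left in their original order.
unswapped = transpose(
    rowwise_scaled_mm(transpose(b), transpose(a), scale_a, scale_b))

print(swapped == reference)    # scales follow their operands
print(unswapped == reference)  # scales attached to the wrong operands
```

Under these assumptions, only the variant that swaps the scales together with the operands reproduces the reference result, which matches the symptom in the row-wise test ("Tensor-likes are not close!").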

Fixes SWDEV-544098

@rocm-repo-management-api

rocm-repo-management-api bot commented Oct 22, 2025

Jenkins build for efe26b4767ed81ed164dfdbd99473327691021d7 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

@jagadish-amd

In PT 2.7, the scales are swapped in https://github.com/ROCm/pytorch/blob/release/2.7/aten/src/ATen/native/cuda/Blas.cpp#L131
but in PT 2.6, I do not see the scales passed as a parameter in https://github.com/ROCm/pytorch/blob/release/2.6/aten/src/ATen/native/cuda/Blas.cpp#L98
That might be why the scale swap is required further down the call chain, in scaled_gemm in CUDABlas.cpp.

@jeffdaily jeffdaily merged commit d10296c into release/2.6 Oct 28, 2025
3 of 6 checks passed
@jeffdaily jeffdaily deleted the dnikolae/rel2.6_fixes branch October 28, 2025 17:34