
Conversation

@kshyatt
Member

@kshyatt kshyatt commented Aug 13, 2025

This generalizes the CUDA wrappers to also support AMD GPUs. I moved some of the higher-level GPU logic into src so that we don't have to duplicate code, and provided "fallbacks" for the lower-level LAPACK-like methods (ROCm and CUDA implement the same APIs), so the extensions should only need to extend the LAPACK API rather than higher-level functions such as qr_null!. I also added CI support for AMD.

Since I don't have an AMD GPU to test on, this is going to be "CI based debugging" for a bit 😅 . I did check locally that the CUDA tests pass.
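The fallback pattern described above could be sketched roughly like this (a minimal illustration only; the internal entry point `_geqrf!` is a hypothetical name, not the package's actual API):

```julia
# In src/: the high-level routine is written once against a small
# LAPACK-like internal API, so the GPU logic is not duplicated.
function qr_null!(A::AbstractMatrix, N, alg)
    # shared logic: factorize via the low-level entry point, then
    # assemble the null space from the factors
    tau = _geqrf!(A)            # hypothetical LAPACK-like entry point
    # ... build N from A and tau ...
    return N
end

# In ext/MatrixAlgebraKitCUDAExt: only the low-level method is added.
_geqrf!(A::StridedCuMatrix) = CUSOLVER.geqrf!(A)

# In ext/MatrixAlgebraKitAMDGPUExt: same entry point, ROCm backend.
_geqrf!(A::StridedROCMatrix) = rocSOLVER.geqrf!(A)
```

With this layout, adding a new GPU backend means extending only the handful of LAPACK-like methods in the corresponding extension.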

@kshyatt kshyatt requested review from Jutho and lkdvos August 13, 2025 12:07
@kshyatt kshyatt force-pushed the ksh/amd branch 17 times, most recently from 43ee4d2 to fd5e540 Compare August 13, 2025 16:51
@codecov

codecov bot commented Aug 13, 2025

Codecov Report

❌ Patch coverage is 89.57346% with 22 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ext/MatrixAlgebraKitAMDGPUExt/yarocsolver.jl | 86.66% | 10 Missing ⚠️ |
| src/implementations/svd.jl | 89.70% | 7 Missing ⚠️ |
| src/implementations/qr.jl | 93.87% | 3 Missing ⚠️ |
| ...ixAlgebraKitAMDGPUExt/MatrixAlgebraKitAMDGPUExt.jl | 84.61% | 2 Missing ⚠️ |

| Files with missing lines | Coverage | Δ |
|---|---|---|
| ...MatrixAlgebraKitCUDAExt/MatrixAlgebraKitCUDAExt.jl | 85.71% <100.00%> | (+10.71%) ⬆️ |
| ext/MatrixAlgebraKitCUDAExt/yacusolver.jl | 92.74% <ø> | (ø) |
| src/MatrixAlgebraKit.jl | 100.00% <ø> | (ø) |
| ...ixAlgebraKitAMDGPUExt/MatrixAlgebraKitAMDGPUExt.jl | 84.61% <84.61%> | (ø) |
| src/implementations/qr.jl | 96.31% <93.87%> | (-1.05%) ⬇️ |
| src/implementations/svd.jl | 93.13% <89.70%> | (-1.72%) ⬇️ |
| ext/MatrixAlgebraKitAMDGPUExt/yarocsolver.jl | 86.66% <86.66%> | (ø) |

... and 1 file with indirect coverage changes


Member

@lkdvos lkdvos left a comment

Looks great!

I had one question about the way you define things for AbstractMatrix and Union{CUAlg, ROCAlg}, just to check whether this causes any ambiguities. One of the goals of this package was to allow overloading based on your own "operator", so it tends to be easier to only use concrete types for the algorithms. That being said, from what I can tell it should be fine here, since you probably always want to define things for both at the same time anyway; I just wanted to hear if you have any thoughts on that.

@kshyatt
Member Author

kshyatt commented Aug 14, 2025

> I had one question about the way you define things for AbstractMatrix and Union{CUAlg, ROCAlg}, which is just to check if this causes any ambiguities or not - one of the goals of this package was to allow overloading based on your own "operator", and therefore it tends to be easier to only use concrete types for the algorithms. This being said, from what I can tell it should be fine here since you probably always want to define things for both at the same time anyways, I just wanted to hear if you have any thoughts on that.

If I understood the question correctly, I think this should avoid ambiguities, since in the package extensions the functions only accept the appropriate Strided*Array, where * can be Cu or ROC -- so if you pass a CUDA array to the ROCm wrapper, it will fall back to the AbstractArray case and throw a MethodError. Another option would be to define a new abstract type GPUSVDPolar <: AbstractAlgorithm and have the CUDA/AMD concrete types subtype this.
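A minimal sketch of the dispatch behaviour being discussed (all type and function names here are illustrative, not the package's real ones):

```julia
# The algorithm types stay concrete; the union only groups them
# for the shared high-level method.
struct CUSOLVERAlg end
struct ROCSOLVERAlg end
const GPUAlg = Union{CUSOLVERAlg, ROCSOLVERAlg}

# Shared method in src, defined once for both algorithms:
svd_polar!(A::AbstractMatrix, out, alg::GPUAlg) = _gpu_svd!(A, out, alg)

# Each extension only accepts its own strided GPU array type:
function _gpu_svd!(A::StridedCuMatrix, out, alg::CUSOLVERAlg)
    # ... call into CUSOLVER here (CUDA extension) ...
end
function _gpu_svd!(A::StridedROCMatrix, out, alg::ROCSOLVERAlg)
    # ... call into rocSOLVER here (AMDGPU extension) ...
end
```

Passing a ROCArray together with the CUDA algorithm (or vice versa) matches no `_gpu_svd!` method and throws a MethodError, so the union on the algorithm type cannot silently pick the wrong backend.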

@lkdvos
Member

lkdvos commented Aug 14, 2025

I think I would prefer not to have too many abstract types added, so the const union is probably fine. Thinking it through a bit more, I think you are right: either we hit the MethodError or we hit the appropriate Strided*Array implementation, so we should be good!

@kshyatt
Member Author

kshyatt commented Aug 14, 2025

Should I maybe add an explanatory comment there about this, since it was confusing enough that you had to ask?

@lkdvos
Member

lkdvos commented Aug 14, 2025

I'm fine with either; I think it's not necessarily confusing. I tend to over-worry a bit about ambiguities, since there is a large number of downstream packages for which this would be very annoying to work around.

@kshyatt
Member Author

kshyatt commented Aug 14, 2025 via email

@lkdvos
Member

lkdvos commented Aug 14, 2025

We definitely should; unfortunately these are still a bit in development 😀 Thinking of TensorKit, BlockSparseArrays, etc. right now

@kshyatt
Member Author

kshyatt commented Aug 14, 2025

In that case, are we good to merge?

@lkdvos
Member

lkdvos commented Aug 14, 2025

I think so!
(It might be me, but I can't see the updates from the resolved conversations, is this just not pushed yet?)

@kshyatt kshyatt merged commit 8877460 into main Aug 14, 2025
10 checks passed
@kshyatt kshyatt deleted the ksh/amd branch August 14, 2025 10:17