Jutho (Member) commented Oct 31, 2025

This is currently just a proof of principle, but I think it would not be too hard to implement native algorithms, building on my old (never published) attempt at a generic linear algebra library. It currently only has a QR (which is easy), but its performance is, surprisingly, practically identical to that of GenericLinearAlgebra.

Hence, this is an alternative to #87, but we can easily have both, as it would take some time to bring everything up to date.
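
For readers unfamiliar with the technique, a generic Householder QR can be sketched in a few lines of plain Julia. This is only a minimal illustration, not the code in this PR: the function name `householder_qr_sketch` is hypothetical, it assumes a tall or square input (m ≥ n), and it forms Q explicitly instead of storing the reflectors compactly.

```julia
using LinearAlgebra

# Hypothetical helper (not part of the PR): compact QR of an m×n matrix with
# m >= n, built from Householder reflections H = I - 2vv'. Works for any
# real or complex floating-point element type, including BigFloat.
function householder_qr_sketch(A::AbstractMatrix{T}) where {T<:Union{AbstractFloat,Complex{<:AbstractFloat}}}
    m, n = size(A)
    m >= n || throw(ArgumentError("this sketch assumes m >= n"))
    R = Matrix{T}(A)          # working copy, becomes upper triangular
    Q = Matrix{T}(I, m, m)    # accumulates H_1 * H_2 * ... * H_n
    for k in 1:n
        x = R[k:m, k]
        normx = norm(x)
        iszero(normx) && continue            # column is already zero below the diagonal
        # Phase of the leading entry; adding alpha * normx avoids cancellation.
        alpha = iszero(x[1]) ? one(T) : x[1] / abs(x[1])
        v = copy(x)
        v[1] += alpha * normx
        v ./= norm(v)
        # Apply the reflector from the left to R and from the right to Q.
        R[k:m, k:n] .-= 2 .* v * (v' * R[k:m, k:n])
        Q[:, k:m] .-= 2 .* (Q[:, k:m] * v) * v'
    end
    # Entries below the diagonal are zero up to rounding; triu removes the residue.
    return Q[:, 1:n], triu(R[1:n, :])
end

A = BigFloat[4 1 2; 2 3 1; 1 2 5; 3 1 1]
Q, R = householder_qr_sketch(A)
```

With this sketch, `Q * R ≈ A` and `Q' * Q ≈ I` should hold at BigFloat precision, since nothing in the loop depends on BLAS or LAPACK.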

codecov bot commented Nov 1, 2025

Codecov Report

❌ Patch coverage is 0% with 238 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/common/householder.jl | 0.00% | 110 Missing ⚠️ |
| src/implementations/lq.jl | 0.00% | 62 Missing ⚠️ |
| src/implementations/qr.jl | 0.00% | 62 Missing ⚠️ |
| src/interface/lq.jl | 0.00% | 2 Missing ⚠️ |
| src/interface/qr.jl | 0.00% | 2 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
|---|---|
| src/MatrixAlgebraKit.jl | 100.00% <ø> (ø) |
| src/interface/decompositions.jl | 100.00% <ø> (ø) |
| src/interface/lq.jl | 18.75% <0.00%> (-31.25%) ⬇️ |
| src/interface/qr.jl | 18.18% <0.00%> (-48.49%) ⬇️ |
| src/implementations/lq.jl | 31.72% <0.00%> (-67.21%) ⬇️ |
| src/implementations/qr.jl | 33.82% <0.00%> (-62.79%) ⬇️ |
| src/common/householder.jl | 0.00% <0.00%> (ø) |

... and 29 files with indirect coverage changes

kshyatt (Member) commented Nov 2, 2025

I think having the native AD can be helpful for testing since the compilers in both Mooncake and Enzyme are pretty good and the guidance is to only write a custom rule if you really have to -- such as when you have a call to a foreign library like BLAS.

Jutho (Member, Author) commented Nov 2, 2025

This now contains a fully functional QR and LQ, with tests, and could in principle be merged as is, with other factorizations done in separate PRs. However, going forward, this raises a number of interesting questions for @lkdvos and @kshyatt:

  1. Currently, the Native_HouseholderQR/LQ algorithm is not registered as a default. Should it be the default for AbstractMatrix{BigFloat} and AbstractMatrix{Complex{BigFloat}}, with other scalar types (e.g. https://github.com/JuliaMath/DoubleFloats.jl) then needing separate registration (possibly in package extensions)? Or do we register the native algorithms as the default for all AbstractMatrix, hoping that the LAPACK / GPU implementations are always correctly selected through their more specific default registration signatures? Also, how should a non-strided AbstractMatrix{Float64} be handled in a non-mutating method, where it is copied anyway?

  2. In principle, we can natively AD through these implementations. Is this interesting for testing purposes? Or do we even want to have the native AD as the default behavior?
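
The "catch-all default plus more specific overrides" option in question 1 can be illustrated with ordinary multiple dispatch. This is a hedged sketch only: the names `default_qr_sketch`, `NativeHouseholderSketch`, `LAPACKHouseholderSketch` and `BlasFloatSketch` are hypothetical stand-ins, and the package's actual registration mechanism may work differently.

```julia
using LinearAlgebra

# Hypothetical algorithm tags, standing in for the real algorithm types.
struct NativeHouseholderSketch end
struct LAPACKHouseholderSketch end

# Element types assumed to be covered by LAPACK in this sketch.
const BlasFloatSketch = Union{Float32,Float64,ComplexF32,ComplexF64}

# Catch-all: any AbstractMatrix falls back to the native implementation.
default_qr_sketch(::AbstractMatrix) = NativeHouseholderSketch()

# More specific signature: strided matrices of BLAS floats select LAPACK.
default_qr_sketch(::StridedMatrix{<:BlasFloatSketch}) = LAPACKHouseholderSketch()

dense_f64 = rand(3, 3)                   # strided, BLAS element type
dense_big = big.(rand(3, 3))             # strided, BigFloat
nonstrided_f64 = Diagonal([1.0, 2.0])    # Float64, but not strided
```

Under these assumptions, `dense_f64` selects the LAPACK tag, while both `dense_big` and `nonstrided_f64` fall through to the native one. The last case is exactly the non-strided AbstractMatrix{Float64} situation raised above: dispatch alone sends it to the native code unless an extra rule copies it into a strided array first.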

lkdvos (Member) commented Nov 3, 2025

For 1, I think that since the arbitrary element types are the primary candidates for these native implementations, it would make sense to just register the native implementations as the defaults. I would prefer to avoid having to define overloads for every possible combination of weird element type and weird array type, and this seems like a solution that might often just work?

For 2, I don't really know. We don't really have a good interface for selecting different AD modes right now (nor for selecting tolerances etc.), so while I don't mind having the option to use native AD, I would also just want to see what the performance is like before we invest in trying to make this accessible?

test/lq.jl Outdated

eltypes = (Float32, Float64, ComplexF32, ComplexF64)
lapack_eltypes = (Float32, Float64, ComplexF32, ComplexF64)
native_eltypes = (lapack_eltypes..., BigFloat, Complex{BigFloat})
Member

Should we also try Float16 for the native eltypes?

kshyatt (Member) commented Nov 3, 2025

Since this provides LQ and QR could we also test the left_orth and left_null and right sided versions using these?

kshyatt (Member) commented Nov 3, 2025

Also, it looks like we are not yet testing the GPU support for these and I suspect we'll get scalar indexing errors -- should we add in GPU tests to this PR?

Jutho (Member, Author) commented Nov 3, 2025

This is definitely not meant for GPUs (which I still do not know anything about). Do people ever want to do non-native scalar types on GPUs?

Adding Float16 to the tests is a worthwhile suggestion.

So should I then add a catch-all default_qr_algorithm(::AbstractMatrix)? I assume there is no way to exclude anything that actually lives on the GPU?

Finally, regarding AD: I assume that our default setup will already automatically select our custom pullback rules also for these native QR and LQ implementations. How difficult is it to circumvent these registered custom pullback rules and switch to native AD'ing, e.g. for testing purposes?

kshyatt (Member) commented Nov 3, 2025

> Do people ever want to do non-native scalar types on GPUs?

Well, Sander might ;). But people definitely use Float16, and in theory one might want Float128 types, even if arbitrary precision won't work because such types may not be isbits.
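
The isbits point can be checked directly in Base Julia. A small sketch, on the assumption that GPU array packages generally require isbits element types; it only inspects the types and does not touch a GPU.

```julia
# Hardware-style floating-point types are plain bits types.
half_is_bits = isbitstype(Float16)
double_is_bits = isbitstype(Float64)

# BigFloat is a mutable, heap-allocated wrapper around MPFR data, so neither
# it nor a Complex built from it is a bits type.
big_is_bits = isbitstype(BigFloat)
complex_big_is_bits = isbitstype(Complex{BigFloat})
```

Here the first two checks return `true` and the last two return `false`, which is why Float16 is a plausible GPU element type while BigFloat is not.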

lkdvos (Member) commented Nov 3, 2025

Since we are intercepting the AD at the level of qr_compact!(A, F, alg), we could just try and AD through _native_qr, which won't be intercepted, if the goal is to just test things out.
