Complete native binary triangular solve for MKL and BLIS #667
Summary
Complete the triangular solve portion for directly wrapped binaries (MKL and BLIS) to use native LAPACK calls instead of falling back to libblastrampoline. AppleAccelerate was already correctly implemented.
Problem
Previously, the MKL and BLIS LU factorization wrappers were incomplete:
- Factorization (`getrf!`): used native binary calls
- Solve (`ldiv!`): fell back to Julia's `ldiv!` → libblastrampoline

This meant that, despite having direct native binary access, the solve step still went through the generic Julia LinearAlgebra stack.
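For readers less familiar with the LAPACK naming, the factorize-then-solve pipeline can be sketched with Julia's standard-library wrappers `LinearAlgebra.LAPACK.getrf!` and `LinearAlgebra.LAPACK.getrs!`. Note that these stdlib wrappers themselves dispatch through libblastrampoline, so this only illustrates the shape of the two steps, not the direct MKL or BLIS bindings touched by this PR. The matrix and right-hand side are made-up values.

```julia
using LinearAlgebra

# Illustrative data only (not taken from the PR).
A = [4.0 3.0; 6.0 3.0]
b = [10.0, 12.0]

# Step 1: LU factorization with partial pivoting (LAPACK getrf).
# getrf! overwrites its argument, so factor a copy of A.
factors, ipiv, info = LinearAlgebra.LAPACK.getrf!(copy(A))
@assert info == 0  # 0 means the factorization succeeded

# Step 2: triangular solves using the stored factors (LAPACK getrs).
# getrs! overwrites the right-hand side with the solution,
# so pass a copy of b to keep the original intact.
x = LinearAlgebra.LAPACK.getrs!('N', factors, ipiv, copy(b))

# The solution should satisfy the original system.
@assert A * x ≈ b
```

The incomplete wrappers performed step 1 natively but delegated step 2 to `ldiv!`.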
Solution
MKL (`src/mkl.jl`)
- Replaced `ldiv!(cache.u, factorization, cache.b)` with direct `getrs!` calls
- Uses the `getrs!` functions that were already implemented but unused

BLIS (`ext/LinearSolveBLISExt.jl`)
- Replaced `ldiv!(cache.u, factorization, cache.b)` with direct `getrs!` calls
- Uses the `getrs!` functions via BLIS that were already implemented but unused
- Added the `ReturnCode` import

AppleAccelerate (`src/appleaccelerate.jl`)
- Already uses `aa_getrs!` calls (no change needed)

Key Changes
MKL solve method:
BLIS solve method:
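Both solve methods replace a three-argument `ldiv!` with an in-place `getrs!` call. A minimal sketch of that pattern follows, under these assumptions: `solve_with_getrs!` is a hypothetical helper named here for illustration, and the stdlib `LinearAlgebra.LAPACK.getrs!` stands in for the package's own MKL and BLIS wrappers, whose exact signatures may differ.

```julia
using LinearAlgebra

# Hypothetical helper, for illustration only; not part of LinearSolve.jl.
# Mirrors the pattern "copy b into u, then solve in place with getrs!".
function solve_with_getrs!(u::AbstractVector, F::LU, b::AbstractVector)
    copyto!(u, b)  # getrs! overwrites its right-hand side with the solution
    LinearAlgebra.LAPACK.getrs!('N', F.factors, F.ipiv, u)
    return u
end

# Illustrative data only.
A = [2.0 1.0 0.0; 1.0 3.0 1.0; 0.0 1.0 4.0]
b = [1.0, 2.0, 3.0]
F = lu(A)  # LU factorization holding the factors and pivot vector

u_generic = ldiv!(similar(b), F, b)              # generic LinearAlgebra path
u_direct  = solve_with_getrs!(similar(b), F, b)  # explicit getrs! path

# Both paths should produce the same solution.
@assert u_generic ≈ u_direct
```

Because `getrs!` works in place, the copy into the output buffer is what keeps `b` unmodified, matching the semantics of `ldiv!(u, F, b)`.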
Benefits
- Reuses the existing `getrs!` implementations

Test Results
Checklist
- MKL: direct `getrs!` calls
- BLIS: `getrs!` calls via BLIS
- AppleAccelerate: `aa_getrs!` calls

🤖 Generated with Claude Code