Refactor CudaOffloadFactorization into LU and QR variants #709
Merged: ChrisRackauckas merged 12 commits into SciML:main from ChrisRackauckas:refactor-cuda-offload-factorizations on Aug 10, 2025
Changes from 7 commits
Commits (12)
4814de7 Refactor CudaOffloadFactorization into LU and QR variants (ChrisRackauckas)
ce4671e Fix syntax issues in CudaOffload factorizations (ChrisRackauckas)
52b3307 Update ext/LinearSolveCUDAExt.jl (ChrisRackauckas)
3a508d7 Update src/extension_algs.jl (ChrisRackauckas)
be4cd19 Update src/extension_algs.jl (ChrisRackauckas)
57fee72 Update test/gpu/cuda.jl (ChrisRackauckas)
e775de6 Update documentation for CudaOffload factorization changes (ChrisRackauckas)
eb0930f Update docs/src/solvers/solvers.md (ChrisRackauckas)
ad039a1 Delete examples/cuda_offload_example.jl (ChrisRackauckas)
9c67b43 Update ext/LinearSolveCUDAExt.jl (ChrisRackauckas)
75d6546 Merge branch 'main' into refactor-cuda-offload-factorizations (ChrisRackauckas)
f6bdcb3 Update ext/LinearSolveCUDAExt.jl (ChrisRackauckas)
@@ -0,0 +1,97 @@
"""
Example demonstrating the new CudaOffloadLUFactorization and CudaOffloadQRFactorization algorithms.

This example shows how to use the new GPU offloading algorithms for solving linear systems
with different numerical properties.
"""

using LinearSolve
using LinearAlgebra
using Random

# Set random seed for reproducibility
Random.seed!(123)

println("CUDA Offload Factorization Examples")
println("=" ^ 40)

# Create a well-conditioned problem
println("\n1. Well-conditioned problem (condition number ≈ 10)")
A_good = rand(100, 100)
A_good = A_good + 10I # Shift the diagonal to make it well-conditioned
b_good = rand(100)
prob_good = LinearProblem(A_good, b_good)

println(" Matrix size: $(size(A_good))")
println(" Condition number: $(cond(A_good))")

# Try to use CUDA if available
try
    using CUDA

    # Solve with LU (faster for well-conditioned)
    println("\n Solving with CudaOffloadLUFactorization...")
    sol_lu = solve(prob_good, CudaOffloadLUFactorization())
    println(" Solution norm: $(norm(sol_lu.u))")
    println(" Residual norm: $(norm(A_good * sol_lu.u - b_good))")

    # Solve with QR (more stable)
    println("\n Solving with CudaOffloadQRFactorization...")
    sol_qr = solve(prob_good, CudaOffloadQRFactorization())
    println(" Solution norm: $(norm(sol_qr.u))")
    println(" Residual norm: $(norm(A_good * sol_qr.u - b_good))")

catch e
    println("\n Note: GPU offloading did not run ($(sprint(showerror, e))). To use GPU offloading:")
    println(" 1. Install CUDA.jl: using Pkg; Pkg.add(\"CUDA\")")
    println(" 2. Add 'using CUDA' before running this example")
    println(" 3. Ensure you have a CUDA-compatible NVIDIA GPU")
end

# Create an ill-conditioned problem
println("\n2. Ill-conditioned problem (condition number ≈ 1e6)")
A_bad = rand(50, 50)
# Make it ill-conditioned by shrinking the smallest singular value
U, S, V = svd(A_bad)
S[end] = S[1] / 1e6 # Create large condition number
A_bad = U * Diagonal(S) * V'
b_bad = rand(50)
prob_bad = LinearProblem(A_bad, b_bad)

println(" Matrix size: $(size(A_bad))")
println(" Condition number: $(cond(A_bad))")

try
    using CUDA

    # For ill-conditioned problems, QR is typically more stable
    println("\n Solving with CudaOffloadQRFactorization (recommended for ill-conditioned)...")
    sol_qr_bad = solve(prob_bad, CudaOffloadQRFactorization())
    println(" Solution norm: $(norm(sol_qr_bad.u))")
    println(" Residual norm: $(norm(A_bad * sol_qr_bad.u - b_bad))")

    println("\n Solving with CudaOffloadLUFactorization (may be less stable)...")
    sol_lu_bad = solve(prob_bad, CudaOffloadLUFactorization())
    println(" Solution norm: $(norm(sol_lu_bad.u))")
    println(" Residual norm: $(norm(A_bad * sol_lu_bad.u - b_bad))")

catch e
    println("\n Skipping GPU tests: $(sprint(showerror, e))")
end

# Demonstrate the deprecation warning
println("\n3. Testing deprecated CudaOffloadFactorization")
try
    using CUDA
    println(" Creating deprecated CudaOffloadFactorization...")
    alg = CudaOffloadFactorization() # This will show a deprecation warning
    println(" The deprecated algorithm still works but shows a warning above")
catch e
    println(" Skipping deprecation test: $(sprint(showerror, e))")
end

println("\n" * "=" ^ 40)
println("Summary:")
println("- Use CudaOffloadLUFactorization for well-conditioned problems (faster)")
println("- Use CudaOffloadQRFactorization for ill-conditioned problems (more stable)")
println("- The old CudaOffloadFactorization is deprecated")
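
The LU-versus-QR guidance in the example's summary can be tried without a GPU. The following is a minimal CPU-only sketch that uses only Julia's `LinearAlgebra` standard library, not LinearSolve.jl or the CUDA offload algorithms from this PR. The helper names `matrix_with_cond` and `rel_residual` are hypothetical and introduced here for illustration; the sketch assumes that prescribing singular values is an acceptable way to obtain a chosen condition number.

```julia
using LinearAlgebra
using Random

Random.seed!(123)

# Hypothetical helper: build an n×n matrix whose 2-norm condition number is
# `kappa` (up to rounding) by prescribing its singular values.
function matrix_with_cond(n::Integer, kappa::Real)
    U = Matrix(qr(randn(n, n)).Q)   # random orthogonal factor
    V = Matrix(qr(randn(n, n)).Q)   # second random orthogonal factor
    # Singular values spaced logarithmically from 1 down to 1/kappa
    s = exp10.(range(0, -log10(kappa), length = n))
    return U * Diagonal(s) * V'
end

# Hypothetical helper: relative residual ‖A*x - b‖ / ‖b‖ of a candidate solution.
rel_residual(A, x, b) = norm(A * x - b) / norm(b)

A = matrix_with_cond(50, 1e6)
b = rand(50)

x_lu = lu(A) \ b   # LU with partial pivoting (CPU, stdlib)
x_qr = qr(A) \ b   # Householder QR (CPU, stdlib)

println("cond(A)     = ", cond(A))
println("LU residual = ", rel_residual(A, x_lu, b))
println("QR residual = ", rel_residual(A, x_qr, b))
```

Changing the second argument of `matrix_with_cond` shows how both residuals respond as the conditioning worsens. Whether the same trend holds for the GPU-offloaded factorizations is not established by this sketch.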