ChrisRackauckas

Summary

This PR refactors the CudaOffloadFactorization algorithm into two separate algorithms to provide more control over the factorization method used for GPU offloading:

  • CudaOffloadLUFactorization - Uses LU factorization
  • CudaOffloadQRFactorization - Uses QR factorization

The original CudaOffloadFactorization is deprecated with a warning and forwards to the QR version for backward compatibility.

Changes

New Algorithms

  • CudaOffloadLUFactorization: GPU-accelerated solver using LU factorization
  • CudaOffloadQRFactorization: GPU-accelerated solver using QR factorization
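For readers less familiar with the two factorizations, here is a minimal CPU-only sketch using Julia's standard `LinearAlgebra` library. It is not the code from this PR and does not touch the GPU; it only illustrates that both factorizations solve the same system `A x = b`. The matrix and right-hand side are made-up illustration values.

```julia
using LinearAlgebra

# Small illustrative system (values are arbitrary, not from the PR):
#   4x + 3y = 10
#   6x + 3y = 12
A = [4.0 3.0;
     6.0 3.0]
b = [10.0, 12.0]

# LU factorization with partial pivoting, then triangular solves.
x_lu = lu(A) \ b

# QR factorization (Householder), then a solve against the R factor.
x_qr = qr(A) \ b

# Both approaches should agree up to floating-point error.
println("LU solution: ", x_lu)
println("QR solution: ", x_qr)
println("max difference: ", maximum(abs.(x_lu .- x_qr)))
```

For square, well-conditioned matrices LU generally does less work than QR, while QR tends to behave better on ill-conditioned problems, which matches the "LU for performance, QR for stability" guidance in the commit notes.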

Updated Files

  • src/extension_algs.jl: Added new struct definitions for both algorithms, deprecated the original
  • src/LinearSolve.jl: Exported the new algorithms
  • ext/LinearSolveCUDAExt.jl: Implemented solve methods for both new algorithms using appropriate factorizations
  • lib/LinearSolveAutotune/src/algorithms.jl: Updated to use LU version for better performance
  • test/gpu/cuda.jl: Added tests for both new algorithms
  • test/resolve.jl: Updated to include new algorithms in testing

Deprecation

  • CudaOffloadFactorization now shows a deprecation warning: "CudaOffloadFactorization is deprecated, use CudaOffloadQRFactorization instead"
  • The deprecated version maintains backward compatibility by using QR factorization
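The deprecation pattern described above can be sketched in a self-contained way. The type and function names below (`DemoOffloadFactorization`, `DemoOffloadQRFactorization`, `demo_factorize`) are hypothetical stand-ins, and the use of `Base.depwarn` is an assumption; the actual definitions in `src/extension_algs.jl` may emit the warning differently.

```julia
using LinearAlgebra

# Hypothetical stand-in for the new QR-based algorithm type.
struct DemoOffloadQRFactorization end

# Hypothetical stand-in for the deprecated type. The inner constructor
# emits a deprecation message and then builds the object with new().
struct DemoOffloadFactorization
    function DemoOffloadFactorization()
        # Note: Base.depwarn is silent unless Julia runs with --depwarn=yes.
        Base.depwarn(
            "DemoOffloadFactorization is deprecated, use DemoOffloadQRFactorization instead",
            :DemoOffloadFactorization)
        return new()
    end
end

# The new algorithm factorizes with QR (on the CPU in this sketch).
demo_factorize(::DemoOffloadQRFactorization, A) = qr(A)

# The deprecated algorithm forwards to the QR version, so old code
# keeps the same numerical behavior.
demo_factorize(::DemoOffloadFactorization, A) =
    demo_factorize(DemoOffloadQRFactorization(), A)

A = [2.0 0.0; 0.0 2.0]
b = [2.0, 4.0]
x_old = demo_factorize(DemoOffloadFactorization(), A) \ b
x_new = demo_factorize(DemoOffloadQRFactorization(), A) \ b
println(x_old ≈ x_new)
```

Forwarding through dispatch keeps a single QR code path, so the deprecated name cannot drift from the behavior of its replacement.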

Test plan

  • Verify both new algorithms work correctly with CUDA.jl loaded
  • Verify deprecation warning appears for old algorithm
  • Run existing GPU tests with both factorization methods
  • Verify LinearSolveAutotune correctly uses the LU version
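The first item of the test plan could be checked with something like the sketch below. It is not the content of `test/gpu/cuda.jl`; it assumes LinearSolve.jl (with this PR merged) and CUDA.jl are installed and that a CUDA-capable GPU is available. The helper `offload_residual`, the problem size, and the tolerance are illustrative choices.

```julia
using LinearSolve, CUDA, LinearAlgebra, Test

# Hypothetical helper: relative residual of the solution returned by `alg`.
# Array(...) is a defensive copy to the host in case the solution is a GPU array.
function offload_residual(A, b, alg)
    sol = solve(LinearProblem(A, b), alg)
    return norm(A * Array(sol.u) - b) / norm(b)
end

if CUDA.functional()
    n = 64
    A = rand(n, n) + n * I   # shifted diagonal keeps the matrix well conditioned
    b = rand(n)

    @testset "CUDA offload algorithms" begin
        for alg in (CudaOffloadLUFactorization(), CudaOffloadQRFactorization())
            @test offload_residual(A, b, alg) < 1e-8
        end
    end
else
    @info "No functional CUDA device found; skipping GPU offload checks"
end
```

Guarding on `CUDA.functional()` lets the same script run on machines without a GPU, where it simply skips the checks.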

🤖 Generated with Claude Code

- Created CudaOffloadLUFactorization using lu factorization
- Created CudaOffloadQRFactorization using qr factorization
- Deprecated CudaOffloadFactorization to use QR (with deprecation warning)
- Updated CUDA extension to implement both algorithms
- Updated LinearSolveAutotune to use LU version for better performance
- Updated tests to include both new algorithms
- Exported both new algorithms from LinearSolve module
- Fixed namespace issues (removed LinearSolve. prefix)
- Fixed constructor syntax (new() instead of new{}())
- Added debug comment
- Updated exports to separate lines

Note: The new types are defined correctly but there appears to be
a precompilation caching issue preventing them from being recognized
immediately. A clean rebuild may be required.

- Updated GPU tutorial to show new CudaOffloadLUFactorization/QRFactorization
- Updated solver documentation to explain both algorithms
- Added deprecation warning in documentation
- Updated release notes with upcoming changes
- Created example demonstrating usage of both new algorithms
- Explained when to use each algorithm (LU for performance, QR for stability)
@ChrisRackauckas ChrisRackauckas merged commit f97754c into SciML:main Aug 10, 2025
104 of 119 checks passed
@ChrisRackauckas ChrisRackauckas deleted the refactor-cuda-offload-factorizations branch August 10, 2025 20:13