Fix SimpleNonlinearSolve for GPU compatibility#809

Open
utkarsh530 wants to merge 10 commits into master from utkarsh530-patch-1
@utkarsh530 (Member) commented Feb 10, 2026

Here are some fixes I made to enable GPU compatibility for SimpleNonlinearSolve.jl. This is required for https://github.com/SciML/ParallelParticleSwarms.jl hybrid algorithms.

  • Make SimpleNonlinearSolve algorithms GPU kernel-compatible by eliminating runtime type checks that cause TypeError or InvalidIRError during CUDA kernel compilation
  • Convert runtime SciMLBase.isinplace() conditionals to @generated functions that resolve at compile time via type parameters
  • Standardize NonlinearAliasSpecifier API across all algorithms with a simplified, GPU-compatible pattern
  • Centralize shared GPU-compatible helpers (get_alias_u0, should_cache_fx) in Utils module
  • Simplify maybe_unaliased to a single method to avoid dynamic dispatch on GPU
  • Exclude SimpleHalley from kernel launch tests (it requires LU factorization and second derivatives, which are fundamentally incompatible with GPU kernels)

Checklist

  • Appropriate tests were added
  • Any code changes were done in a way that does not break public API
  • All documentation related to code changes were updated
  • The new code follows the contributor guidelines, in particular the SciML Style Guide and COLPRAC.
  • Any new documentation only uses public API

@utkarsh530 (Member, Author)

@ChrisRackauckas can you have a look?

@utkarsh530 changed the title from "Test NonlinearSolve on the GPU" to "Fix SimpleNonlinearSolve for GPU compatibility" on Feb 11, 2026
f(fu, u, p)
return fu

@generated function evaluate_f!!(f::NonlinearFunction, fu, u, p)
Member

Won't this being a generated function have huge compile time impact?

@ChrisRackauckas-Claude (Contributor)

Review: Unnecessary @generated functions

SciMLBase.isinplace simply extracts a type parameter (iip) and returns it directly:

isinplace(prob::AbstractNonlinearProblem{uType, iip}) where {uType, iip} = iip
isinplace(f::AbstractSciMLFunction{iip}) where {iip} = iip

The compiler constant-folds this to true or false at compile time. @code_typed confirms the function body is literally return false or return true after inference. There is zero runtime work — the dead branch in if SciMLBase.isinplace(prob) gets eliminated by the compiler.
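
To illustrate the point, here is a minimal, self-contained sketch that does not depend on SciMLBase. The names `AbstractDemoFunction`, `DemoFunction`, `demo_isinplace`, and `demo_evaluate` are hypothetical stand-ins, not the package's actual definitions. It shows an ordinary (non-generated) function branching on a value that is read straight from a type parameter:

```julia
# Hypothetical stand-ins for illustration; these are NOT the SciMLBase types.
abstract type AbstractDemoFunction{iip} end

struct DemoFunction{iip, F} <: AbstractDemoFunction{iip}
    f::F
end

# Convenience constructor: pick `iip` explicitly, infer the callable's type.
DemoFunction{iip}(f::F) where {iip, F} = DemoFunction{iip, F}(f)

# Mirrors the pattern quoted above: the answer is just the type parameter.
demo_isinplace(::AbstractDemoFunction{iip}) where {iip} = iip

# A plain function; the branch condition depends only on the argument's type.
function demo_evaluate(f::AbstractDemoFunction, fu, u, p)
    if demo_isinplace(f)
        f.f(fu, u, p)   # in-place form writes into fu
        return fu
    else
        return f.f(u, p)  # out-of-place form returns a new value
    end
end

oop = DemoFunction{false}((u, p) -> u .* p)
iip = DemoFunction{true}((fu, u, p) -> (fu .= u .* p))
```

Calling `@code_typed demo_isinplace(oop)` in a REPL is one way to inspect what inference does with the accessor for a concrete argument type.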

Therefore, none of these needed to be @generated:

  1. incompatible_backend_and_problem (NonlinearSolveBase/src/autodiff.jl:108) — reverted to regular function
  2. evaluate_f!! (NonlinearSolveBase/src/utils.jl:177) — reverted to regular function
  3. evaluate_f (NonlinearSolveBase/src/utils.jl:191) — reverted to regular function
  4. should_cache_fx (SimpleNonlinearSolve/src/utils.jl:22) — reverted to @inline function
  5. prepare_jacobian (SimpleNonlinearSolve/src/utils.jl:101) — reverted to regular function

Also removed the unnecessary AbstractNonlinearFunction import that was only needed for the @generated pattern.
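
As a rough sketch of what such a revert looks like, the snippet below puts a `@generated` helper next to an equivalent `@inline` plain function. All names here (`AbstractSketchProblem`, `SketchProblem`, `sketch_isinplace`, `demo_flag_generated`, `demo_flag_plain`) are hypothetical, and the helper's logic is a placeholder rather than the actual `should_cache_fx` implementation:

```julia
# Hypothetical types for illustration only.
abstract type AbstractSketchProblem{iip} end
struct SketchProblem{iip} <: AbstractSketchProblem{iip} end

sketch_isinplace(::AbstractSketchProblem{iip}) where {iip} = iip

# @generated form: the body is built from the static parameter `iip`.
@generated function demo_flag_generated(::AbstractSketchProblem{iip}) where {iip}
    return iip ? :(true) : :(false)
end

# Plain form: same result, with the condition read from the type parameter
# through an ordinary accessor.
@inline demo_flag_plain(prob::AbstractSketchProblem) = sketch_isinplace(prob) ? true : false
```

Both helpers return the same value for every concrete problem type; the plain form simply avoids the generated-function machinery.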

The @assert in compute_jacobian!! is left commented out — while isinplace infers fine, @assert generates a throw which is genuinely incompatible with GPU kernels (GPUCompiler may not reliably dead-code-eliminate it).

Tests pass:

  • NonlinearSolveBase: 16/16 ✓
  • SimpleNonlinearSolve: 35,584/35,588 ✓ (4 pre-existing broken)

Fix commit: ChrisRackauckas-Claude/NonlinearSolve.jl@2239d074
