Fix CUDA V100 compatibility with LocalPreferences.toml by ChrisRackauckas-Claude · Pull Request #868 · SciML/NonlinearSolve.jl

ChrisRackauckas-Claude · 2026-03-19T12:43:16Z

Summary

This PR fixes the CUDA test failures on V100 runners (demeter3/demeter4) that were caused by CUDA driver version compatibility issues.

Problem

The V100 GPUs on demeter4 have a CUDA driver version (570.172) that reports an NVML version mismatch with newer CUDA toolkit versions. Additionally, CUDA_Driver_jll v13+ drops support for compute capability 7.0 (V100).

The tests were failing with:

CUDA error: unknown error (code 999, ERROR_UNKNOWN)

Solution

Following the pattern established in OrdinaryDiffEq.jl PR #3162:

1. Added LocalPreferences.toml files in both the root directory and lib/SimpleNonlinearSolve/ to:
   - Pin CUDA Runtime to version 12.6
   - Disable the forward-compat driver (compat = "false")
2. Updated test/runtests.jl and lib/SimpleNonlinearSolve/test/runtests.jl to add the CUDA_Driver_jll and CUDA_Runtime_jll packages when running CUDA tests, so the LocalPreferences are properly loaded.
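For illustration, a LocalPreferences.toml expressing the two preferences above might look roughly like this. The section and key names follow the preference names used by CUDA_Runtime_jll and CUDA_Driver_jll as far as they are known here; the exact contents of the files added in this PR are not reproduced in this description, so treat the fragment as an assumption rather than a copy.

```toml
# Sketch only: assumed layout, not copied from the PR diff.

[CUDA_Runtime_jll]
# Pin the CUDA runtime (toolkit) to the 12.6 series.
version = "12.6"

[CUDA_Driver_jll]
# Disable the forward-compat driver so the system driver is used.
compat = "false"
```

The same fragment would be placed in both the repository root and lib/SimpleNonlinearSolve/, since each directory is used as its own project when its tests run.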

Files Changed

LocalPreferences.toml (new)
lib/SimpleNonlinearSolve/LocalPreferences.toml (new)
test/runtests.jl
lib/SimpleNonlinearSolve/test/runtests.jl
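The change to the two runtests.jl files listed above could be sketched as follows. The GROUP environment variable and the "cuda" value are assumptions made for this example; the actual condition used in the test files is not shown in this description.

```julia
using Pkg

# Assumption: the test group is selected through a GROUP environment variable.
const GROUP = lowercase(get(ENV, "GROUP", "all"))

if GROUP == "cuda"
    # Preferences in LocalPreferences.toml are only applied to packages that
    # the active project lists as dependencies, so the two JLLs are added
    # explicitly before CUDA.jl is loaded.
    Pkg.add(["CUDA_Driver_jll", "CUDA_Runtime_jll"])
end
```

With any other group the block does nothing, so non-GPU test runs are unaffected.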

References

Fixes: ChrisRackauckas/InternalJunk#24
Pattern from: OrdinaryDiffEq.jl#3162 (Replace buildkite with self-hosted GH Actions runners)
Related infrastructure: ChrisRackauckas/InternalJunk#19

This fixes the CUDA test failures on V100 runners (demeter3/demeter4) by:

1. Adding LocalPreferences.toml files to pin CUDA Runtime to v12.6 and disable the forward-compat driver (CUDA_Driver_jll v13+ drops V100 compute capability 7.0 support)
2. Updating test runtests.jl to add CUDA_Driver_jll and CUDA_Runtime_jll packages so the preferences are properly loaded

Fixes: ChrisRackauckas/InternalJunk#24

The default PolyAlgorithm uses AutoForwardDiff internally, which does scalar indexing on GPU arrays. Since there's no way to pass autodiff to the default algorithm, remove it from GPU tests.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
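A minimal sketch of what passing the backend explicitly looks like, assuming a functional CUDA GPU and that NonlinearSolve, CUDA, and ADTypes are installed. The helper name solve_linear_on_gpu and the small linear test problem are hypothetical and are not taken from the PR's test files.

```julia
using NonlinearSolve, CUDA
using ADTypes: AutoFiniteDiff

# Hypothetical helper: find the root of A*u + b on the GPU, selecting the
# finite-difference backend explicitly instead of relying on the default
# polyalgorithm.
function solve_linear_on_gpu(n::Integer = 4)
    A = cu(rand(Float32, n, n))
    b = cu(rand(Float32, n))
    u0 = cu(rand(Float32, n))

    # Residual written with matrix-vector and broadcast operations only,
    # so the function itself never indexes the GPU arrays elementwise.
    f(u, p) = A * u .+ b

    prob = NonlinearProblem(f, u0)
    return solve(prob, NewtonRaphson(; autodiff = AutoFiniteDiff()))
end
```

The point of the sketch is only the `autodiff = AutoFiniteDiff()` keyword: an explicitly constructed solver accepts it, whereas calling `solve(prob)` with no algorithm leaves the backend choice to the default.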

- Add `needs: nonlinearsolve-cuda` to SimpleNonlinearSolve CUDA job so they don't compete for GPU memory on the same V100 runner
- Add GC.gc() and CUDA.reclaim() before kernel launch tests to free memory from prior test items on shared GPU runners

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The `needs` dependency doesn't help since other jobs on the shared runner contribute to GPU memory pressure regardless. The CUDA.reclaim() fix in the test file is the actual mitigation.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
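The mitigation mentioned above amounts to two calls placed before the memory-hungry test item. A rough sketch follows; the helper name release_gpu_memory and the CUDA.functional() guard are additions for this example and are not taken from the test file.

```julia
using CUDA

# Hypothetical helper wrapping the two calls named in the commit message.
function release_gpu_memory()
    # Collect unreachable Julia objects first, so CuArrays left over from
    # earlier test items are finalized and their buffers return to the pool.
    GC.gc()

    # Then ask CUDA.jl to hand cached pool memory back to the driver.
    # The guard keeps the helper harmless on machines without a usable GPU.
    CUDA.functional() && CUDA.reclaim()
    return nothing
end
```

Calling `release_gpu_memory()` at the top of the kernel launch tests frees what this process holds. It cannot reclaim memory held by other processes on a shared runner.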

The OOM was transient (other processes on shared runner), not caused by our tests. The gpu-v100 tag already targets the right runners.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ChrisRackauckas and others added 7 commits March 19, 2026 08:42

Fix CUDA tests: use AutoFiniteDiff for GPU-compatible Jacobian computation

eedfb94

ForwardDiff uses scalar indexing, which triggers GPU scalar indexing errors. Use AutoFiniteDiff, which works with CUDA arrays.

Retrigger CI: SimpleNonlinearSolve CUDA had transient OOM

26489ab

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ChrisRackauckas merged commit 19cd72e into SciML:master Mar 20, 2026
61 of 65 checks passed

ChrisRackauckas merged 7 commits into SciML:master from ChrisRackauckas-Claude:fix-cuda-v100-compatibility
