-
-
Notifications
You must be signed in to change notification settings - Fork 38
Resolve Metal.jl type instability for saveat and literals; Metal and AMDGPU build fix #381
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Fixes type instability errors (`InvalidIRError: unsupported use of double value`) encountered when using `EnsembleGPUKernel` on the `Metal.jl` backend (Apple M-series GPUs), specifically: 1. Issue SciML#379: Corrects the handling of the `saveat` argument in `src/ensemblegpukernel/lowerlevel_solve.jl`. Converts `saveat` (whether a `Number`, `AbstractRange`, or `AbstractArray`) into a `Vector{Tt}` matching the problem's time type (`Tt = eltype(prob.tspan)`) before passing it to the GPU kernel. This prevents the internal `Float64` fields within `StepRangeLen` from causing compilation errors. Includes edge case handling for `saveat=0.0`. 2. Related Literal Fix: Wraps hardcoded numeric literals (e.g., `1e-7`, `1e-14`) with `Tt(...)` in various `src/ensemblegpukernel/perform_step/` files (like `gpu_em_perform_step.jl`, `gpu_tsit5_perform_step.jl`, etc.) to ensure type consistency within the GPU kernels, addressing issues similar to those discussed on Zulip/Slack. These changes ensure robust type handling for `Float32` problems on backends without `Float64` support, while remaining compatible with standard `Float64` usage on CUDA and CPU.
Fix: Resolve Metal.jl type instability for saveat and literals Fixes type instability errors (`InvalidIRError: unsupported use of double value`) encountered when using `EnsembleGPUKernel` on the `Metal.jl` backend (Apple M-series GPUs), specifically: 1. Issue SciML#379: Corrects the handling of the `saveat` argument in `src/ensemblegpukernel/lowerlevel_solve.jl`. Converts `saveat` (whether a `Number`, `AbstractRange`, or `AbstractArray`) into a `Vector{Tt}` matching the problem's time type (`Tt = eltype(prob.tspan)`) before passing it to the GPU kernel. This prevents the internal `Float64` fields within `StepRangeLen` from causing compilation errors. Includes edge case handling for `saveat=0.0`. 2. Related Literal Fix: Wraps hardcoded numeric literals (e.g., `1e-7`, `1e-14`) with `Tt(...)` in various `src/ensemblegpukernel/perform_step/` files (like `gpu_em_perform_step.jl`, `gpu_tsit5_perform_step.jl`, etc.) to ensure type consistency within the GPU kernels, addressing issues similar to those discussed on Zulip/Slack. These changes ensure robust type handling for `Float32` problems on backends without `Float64` support, while remaining compatible with standard `Float64` usage on CUDA and CPU. Fixes SciML#379
Initializes `saveat_converted = nothing` in both `vectorized_solve` functions (for ODEs and SDEs) to fix an `UndefVarError`. Also adds the missing `adapt(backend, ...)` call in the SDE function to move the converted `saveat` vector to the GPU.
|
Hello! The latest CI run shows that Metal and AMDGPU tests are now passing, confirming that the fixes in this branch successfully resolve the backend-related issues. Thank you for your time and for maintaining this package! — Ambar |
Updates the `saveat` range tests to compare the result against a `collect`ed Vector (Vector{Tt}) instead of a StepRangeLen. This matches the new, correct output type from the `saveat` fix.
This reverts commit e557187.
099a9ac to
aac6f57
Compare
This PR fixes
InvalidIRError: unsupported use of double valuecrashes encountered when usingEnsembleGPUKernelon theMetal.jlbackend, which was originally reported in Issue #379.The fixes are in two main parts:
src/ensemblegpukernel/nlsolve/utils.jlto resolve issues related to Metal and AMDGPU.nlsolve Stiff Solver Bug: The unsupported dynamic function invocation error (which was failing AMDGPU and Metal) is fixed by replacing norm() with the GPU-safe diffeqgpunorm(..., t). The successful Metal, AMDGPU (and OneAPI Julia v1) builds show that the previous issues have been resolved.
2.
saveatType Instability (Fixes #379):* In
src/ensemblegpukernel/lowerlevel_solve.jl, thesaveatargument (when aNumberorAbstractRange) was being converted into aStepRangeLenwhich had internalFloat64fields, crashing theFloat32Metal kernel.* This PR changes the logic to convert
saveatinto aVector{Tt}(whereTtis the problem's time type) before passing it to the kernel.* This also includes fixes for
UndefVarError(by initializingsaveat_converted = nothing) and adds the missingadapt(backend, ...)call to the SDE solver.Numeric Literal Instability:
* In
src/ensemblegpukernel/perform_step/gpu_tsit5_perform_stepfile, hardcoded numeric literals (e.g.,1.0f-14) were used.* These have been wrapped with
T(...)(e.g.,T(1.0e-14)) to ensure they match the kernel's float type, preventing type instability.These changes ensure robust type handling for
Float32problems on backends withoutFloat64support, while remaining compatible with standardFloat64usage on CUDA and CPU.Note for Maintainers: I do not have access to Apple Metal hardware, so I was unable to test this fix locally. Requesting verification from a maintainer or user with a Metal-capable device.