
alge: fix SIGSEGV in cs_sles_solve_ccc_fv with extended matrix columns #165

Open: diskdog wants to merge 1 commit into code-saturne:master from diskdog:fix/sles-extended-buf-uvm


diskdog commented Jun 5, 2026

When n_cols_ext > n_cells_with_ghosts, cs_sles_solve_ccc_fv allocates
extended _vx and _rhs buffers using cs_alloc_mode_device, which resolves
to CS_ALLOC_DEVICE (device-only) in a standard CUDA build. Those pointers
are then passed into cs_sles_solve, which reads the residual on the host
during convergence checking. The result is a SIGSEGV.

The fix is to use CS_ALLOC_HOST_DEVICE_SHARED for these two buffers.
They are not pure-device scratch; the solver needs host-readable
convergence data from them. The GPU dispatch is unaffected: ctx still
runs on the GPU via set_use_gpu(true), and the unified-memory backing
is fast enough on all tested sm_7x+ devices.
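
The reasoning above can be illustrated with a small host-only model. This is a sketch under stated assumptions, not the patched code: the `alloc_mode` enum and the helper functions are local stand-ins that only mirror the names used in this description, and they do not come from the code_saturne headers.

```cpp
#include <cassert>
#include <cstddef>

// Local stand-ins mirroring the allocation modes named in this description.
// These are NOT the code_saturne definitions.
enum class alloc_mode {
  host,                // host memory only
  host_device_shared,  // unified (managed) memory, valid on host and device
  device               // device-only memory, must not be dereferenced on the host
};

// True when host code (e.g. a residual check) may dereference the pointer.
constexpr bool host_readable(alloc_mode m) {
  return m != alloc_mode::device;
}

// True when a GPU kernel may dereference the pointer.
constexpr bool device_readable(alloc_mode m) {
  return m != alloc_mode::host;
}

// Extended buffers are only needed when the matrix has more columns
// than the field arrays provide.
constexpr bool needs_extended_buffers(std::size_t n_cols_ext,
                                      std::size_t n_cells_with_ghosts) {
  return n_cols_ext > n_cells_with_ghosts;
}

// Hypothetical helper: mode chosen for the extended _vx and _rhs buffers
// after the fix. Shared memory satisfies both the host and the device side.
constexpr alloc_mode extended_buffer_mode() {
  return alloc_mode::host_device_shared;
}
```

In this model, `host_readable(alloc_mode::device)` is false, which corresponds to the crash path, while `extended_buffer_mode()` is readable from both sides.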

The existing workaround (CS_CUDA_ALLOC_DEVICE_UVM=1) happens to fix
this by globally remapping cs_alloc_mode_device, but the global remap
affects unrelated allocations and masks the root cause here.
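
To make the difference between the global remap and the local fix concrete, here is a minimal host-only sketch. Everything in it is illustrative: the enum, the three functions, and the assumption that the workaround is active exactly when the variable equals `"1"` are hypothetical and are not taken from the code_saturne sources.

```cpp
#include <cassert>
#include <cstring>

// Local stand-in for the two modes relevant here (not the real enum).
enum class alloc_mode { host_device_shared, device };

// Assumption: the workaround is enabled when the variable is set to "1".
inline alloc_mode resolve_global_device_mode(const char *uvm_flag) {
  const bool uvm = (uvm_flag != nullptr) && (std::strcmp(uvm_flag, "1") == 0);
  return uvm ? alloc_mode::host_device_shared : alloc_mode::device;
}

// Hypothetical unrelated call site: pure-device scratch follows the global mode.
inline alloc_mode scratch_buffer_mode(alloc_mode global_mode) {
  return global_mode;
}

// Extended solver buffers: with the local fix they no longer depend
// on the global mode.
inline alloc_mode extended_buffer_mode(alloc_mode global_mode, bool local_fix) {
  return local_fix ? alloc_mode::host_device_shared : global_mode;
}
```

With the variable set, `scratch_buffer_mode` also switches to shared memory, which is the side effect on unrelated allocations. With the local fix and the variable unset, only the extended buffers change.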

Tested on sm_75, CUDA 13.1, channel-flow case with CS_MATRIX_NATIVE.

Fixes #164.



Development

Successfully merging this pull request may close these issues.

SIGSEGV in cs_sles_solve_ccc_fv when CS_CUDA_ALLOC_DEVICE_UVM is not set (v9.1.0)
