[DR] Update Poisson/Helmholtz matrix #977

We wish to update the LHS matrix in the Poisson/Helmholtz solver used in gyrokinetics (`fem_poisson_perp`). Here's a path to do it.

On the CPU
------------

We simply use the `mat_triples` approach used in the `fem_poisson_perp_new` method. The only unresolved issue is whether we can do this without destroying and re-creating the SuperMatrix `A`. I think it should be possible, but some comments in the source code say that I tried this some years ago and it errored out.

On the GPU
------------

The incoming `epsilon` and `kSq` arrays are assumed to be in GPU memory. We can't use the `mat_triples` approach because this struct is not defined on GPUs, and it would take a lot of work to move all of that to the GPU. So here we propose the following approach instead: fill the (flat) array of nonzero values `csr_val_cu` that `cudss.cu` uses to populate the LHS matrix.

The challenge here is that we have kernels (`l2g`) that compute the global `i,j` indices of a given nonzero-value **contribution** to the LHS matrix, but we do not know a priori how the `i,j` entry maps to the flat array of nonzero values `csr_val_cu`. This array has size `nprob*nnz`, where `nprob` is just the number of cells along z and `nnz` is the number of nonzero values of the problem in a single z-slab.

So we need the mapping: global `i,j` -> linear index in `csr_val_cu`.

In order to get this mapping:
1. We continue to populate the `mat_triples` on the CPU in `fem_poisson_perp_new`.
2. Once we know the list of `mat_triples`, we loop through the grid and:
  2a) At each cell, we use the `l2g` kernel to get the global indices `i,j`.
  2b) Each cell makes `num_basis^2` contributions. For each contribution we loop through the list of `mat_triples`, and we find the entry with the same row and column indices as `i,j`. We record the (linear) location of this entry in the `csr_val_idx` array.
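The steps above can be sketched on the CPU roughly as follows. This is only an illustration, not the code in the branch: `struct triple`, `find_triple`, `build_csr_val_idx`, and the flat `l2g` array (standing in for the `l2g` kernel) are hypothetical names, and the sketch assumes the list of triples is stored in the same order as the nonzeros in `csr_val_cu`.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-in for one entry of the mat_triples list. We assume
// the list is ordered like the nonzeros in csr_val_cu.
struct triple { long row, col; double val; };

// Brute-force search: position of entry (i,j) in the list, or -1 if absent.
static long
find_triple(const struct triple *trips, long nnz, long i, long j)
{
  for (long k = 0; k < nnz; ++k)
    if (trips[k].row == i && trips[k].col == j) return k;
  return -1;
}

// Fill csr_val_idx for every cell. l2g[c*num_basis + a] is the global index
// of local node a in cell c (a hypothetical layout replacing the l2g kernel).
static void
build_csr_val_idx(const struct triple *trips, long nnz, const long *l2g,
  long num_cells, int num_basis, long *csr_val_idx)
{
  for (long c = 0; c < num_cells; ++c) {
    for (int a = 0; a < num_basis; ++a) {
      for (int b = 0; b < num_basis; ++b) {
        long i = l2g[c*num_basis + a]; // Global row index.
        long j = l2g[c*num_basis + b]; // Global column index.
        long k = find_triple(trips, nnz, i, j);
        assert(k >= 0); // Every contribution must hit an existing nonzero.
        csr_val_idx[(c*num_basis + a)*num_basis + b] = k;
      }
    }
  }
}
```

Contributions from neighboring cells that share a node end up with the same value in `csr_val_idx`, which is why the later update has to accumulate rather than overwrite.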

Then, in the `fem_poisson_perp_update_lhs` method, we can call the `lhs_stencil` kernel, but rather than filling a `mat_triples` we will have it directly insert or accumulate a value into `cudss_ops->csr_val_cu[csr_val_idx[k]]`.
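A minimal CPU sketch of that accumulation, for a single z-slab, might look like the following. The function `update_lhs` and the flat `local_mat` array of per-cell contributions are hypothetical and only stand in for what the `lhs_stencil` kernel would compute.

```c
#include <assert.h>
#include <stddef.h>

// Zero the nonzero values, then accumulate every cell's local contributions
// into the flat value array through the precomputed csr_val_idx map.
// local_mat holds num_cells*num_basis*num_basis contributions, in the same
// order in which csr_val_idx was built (an assumption of this sketch).
static void
update_lhs(const double *local_mat, const long *csr_val_idx,
  long num_cells, int num_basis, long nnz, double *csr_val)
{
  for (long k = 0; k < nnz; ++k) csr_val[k] = 0.0;

  long num_contrib = num_cells*num_basis*num_basis;
  for (long m = 0; m < num_contrib; ++m) {
    long k = csr_val_idx[m];
    assert(k >= 0 && k < nnz); // The map must point inside the array.
    csr_val[k] += local_mat[m]; // Shared nodes accumulate here.
  }
}
```

In a GPU kernel, different threads may target the same entry of `csr_val_cu`, so the `+=` would presumably need to be an atomic operation (e.g. CUDA's `atomicAdd`) or the work would need to be arranged so that no two threads write the same entry concurrently.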

NOTE: one ugly thing is that step 2b above involves a search, and we do this search for each cell and for each of its `num_basis^2` contributions. To make matters worse, at the moment we are doing a brute-force search, so this is expensive. But we only do it at t=0, so hopefully it is OK.
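If the cost of the search ever becomes a problem, one possible alternative (not what the prototype does) is a binary search within the CSR row. The sketch below assumes that CSR row-pointer and column-index arrays are available and that column indices are sorted in ascending order within each row. The names `csr_find`, `row_ptr`, and `col_ind` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

// Return the position in the CSR value array of entry (i,j), or -1 if the
// entry is not a stored nonzero. Assumes col_ind is sorted within each row.
static long
csr_find(const long *row_ptr, const long *col_ind, long i, long j)
{
  assert(row_ptr[i] <= row_ptr[i+1]);
  long lo = row_ptr[i], hi = row_ptr[i+1] - 1;
  while (lo <= hi) {
    long mid = lo + (hi - lo)/2;
    if (col_ind[mid] == j) return mid;
    if (col_ind[mid] < j) lo = mid + 1;
    else hi = mid - 1;
  }
  return -1;
}
```

This replaces a scan over all `nnz` entries with a search over the nonzeros of a single row, which is typically a short list for FEM matrices.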

This approach is prototyped in the `fem_poisson_update-prototyping` branch.
