[DR] Update Poisson/Helmholtz matrix #977

@manauref

Description

We wish to update the LHS matrix in the Poisson/Helmholtz solver used in gyrokinetics (fem_poisson_perp). Here is a proposed path to do it.

On the CPU

We simply use the mat_triples approach used in the fem_poisson_perp_new method. The one unresolved issue is whether we can do this without destroying and re-creating the SuperMatrix A. I think it should be possible, but comments in the source code indicate that I tried this some years ago and it errored out.

On the GPU

The incoming epsilon and kSq arrays are assumed to be in GPU memory. We can't use the mat_triples approach because this struct is not defined on GPUs, and moving all of that to the GPU would take a lot of work. So we propose the following approach instead: directly fill the (flat) array of nonzero values, csr_val_cu, that cudss.cu uses to populate the LHS matrix.

The challenge is that our kernels (l2g) compute the global i,j indices of a given nonzero-value contribution to the LHS matrix, but we do not know a priori how the i,j entry maps to a location in the flat array of nonzero values csr_val_cu. This array has size nprob*nnz, where nprob is just the number of cells along z and nnz is the number of nonzero values of the problem in a single z-slab.

So we need the mapping
global i,j -> linear index in csr_val_cu

In order to get this mapping:

  1. We continue to populate the mat_triples on the CPU in fem_poisson_perp_new.
  2. Once we know the list of mat_triples, we loop through the grid and:
    2a) At each cell, we use the l2g kernel to get the global indices i,j.
    2b) Each cell makes num_basis^2 contributions. For each contribution we loop through the list of mat_triples and find the entry whose row and column indices match i,j. We record the (linear) location of this entry in the csr_val_idx array.
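
A minimal CPU-side sketch of steps 2a and 2b, under stated assumptions rather than the actual implementation: `struct triple`, `find_triple` and `build_csr_val_idx` are hypothetical names, the `l2g` array stands in for the output of the l2g kernel (num_basis global indices per cell), and the list of triples is assumed to be ordered the same way as the nonzero values in csr_val_cu, so that a position in the list is also a position in the values array.

```c
#include <assert.h>

// Hypothetical stand-in for one entry of the mat_triples list.
struct triple { long row, col; double val; };

// Brute-force search: position of entry (i,j) in the list, or -1 if absent.
static long
find_triple(const struct triple *trips, long ntrips, long i, long j)
{
  for (long k = 0; k < ntrips; ++k)
    if (trips[k].row == i && trips[k].col == j) return k;
  return -1;
}

// Fill csr_val_idx with one entry per cell and per num_basis^2 contribution.
// l2g holds num_basis global indices per cell (assumed output of the l2g kernel).
// Returns 0 on success, -1 if some (i,j) pair is not in the list.
static int
build_csr_val_idx(const struct triple *trips, long ntrips, const long *l2g,
  long ncells, int num_basis, long *csr_val_idx)
{
  assert(num_basis > 0);
  for (long c = 0; c < ncells; ++c) {
    const long *glob = &l2g[c*num_basis]; // Step 2a: global indices of this cell.
    for (int a = 0; a < num_basis; ++a) {
      for (int b = 0; b < num_basis; ++b) {
        // Step 2b: locate the (i,j) = (glob[a], glob[b]) entry in the list.
        long k = find_triple(trips, ntrips, glob[a], glob[b]);
        if (k < 0) return -1;
        csr_val_idx[(c*num_basis + a)*num_basis + b] = k;
      }
    }
  }
  return 0;
}
```

For a 1D mesh of two cells with two basis functions each (three global nodes, tridiagonal matrix with 7 nonzeros), the shared (1,1) entry is mapped to the same linear index from both cells, which is why the later update has to accumulate rather than overwrite.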

Then, in the fem_poisson_perp_update_lhs method, we can call the lhs_stencil kernel, but rather than filling a mat_triples we have it directly insert a value into (or accumulate a value onto) cudss_ops->csr_val_cu[csr_val_idx[k]].
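
The update step could look roughly like the following host-side C sketch for a single z-slab. The function name `update_lhs_slab` and the `local_mats` array (a stand-in for whatever lhs_stencil computes per cell, stored row-major with num_basis^2 values per cell) are assumptions for illustration. In an actual CUDA kernel the `+=` would presumably need to be an atomic add, since neighboring cells contribute to the same matrix entry.

```c
#include <assert.h>

// Refill the nonzero values of one z-slab from per-cell local matrices.
// csr_val:     flat array of nonzero values for this slab (length nnz).
// csr_val_idx: precomputed map, num_basis^2 linear indices per cell.
// local_mats:  hypothetical per-cell stencil values, num_basis^2 per cell.
static void
update_lhs_slab(double *csr_val, long nnz, const long *csr_val_idx,
  long ncells, int num_basis, const double *local_mats)
{
  long nb2 = (long) num_basis*num_basis;

  // Clear old values first, since contributions are accumulated below.
  for (long n = 0; n < nnz; ++n) csr_val[n] = 0.0;

  for (long c = 0; c < ncells; ++c) {
    for (long k = 0; k < nb2; ++k) {
      long lin = csr_val_idx[c*nb2 + k];
      assert(lin >= 0 && lin < nnz);
      // No search here: the map gives the destination directly.
      csr_val[lin] += local_mats[c*nb2 + k];
    }
  }
}
```

Because the map is precomputed, each update costs one pass over the cells with no searching, which is the point of paying for the search once at t=0.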

NOTE: one ugly thing is that step 2b above involves a search, and we do this search for each cell and each of its num_basis^2 contributions. To make matters worse, at the moment the search is brute-force, so this is expensive. But we only do it at t=0, so hopefully that is ok.
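
If the brute-force search ever becomes a bottleneck, one possible alternative (not what the prototype does) is to search within a single CSR row instead of the whole list. The sketch below assumes standard CSR row-offset and column-index arrays are available on the host and that column indices are sorted within each row; `csr_find`, `rowptr` and `colind` are hypothetical names.

```c
#include <assert.h>

// Binary search for column j within row i of a CSR matrix.
// rowptr has nrows+1 entries; colind is assumed sorted within each row.
// Returns the linear index into the values array, or -1 if (i,j) is absent.
static long
csr_find(const long *rowptr, const long *colind, long i, long j)
{
  assert(i >= 0 && j >= 0);
  long lo = rowptr[i], hi = rowptr[i+1] - 1;
  while (lo <= hi) {
    long mid = lo + (hi - lo)/2;
    if (colind[mid] == j) return mid;
    if (colind[mid] < j)
      lo = mid + 1;
    else
      hi = mid - 1;
  }
  return -1;
}
```

This reduces each lookup from a scan over all nnz entries to a search over the nonzeros of one row, at the cost of requiring the row offsets and sorted column indices.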

This approach is prototyped in the fem_poisson_update-prototyping branch.
