
Swap plane Poisson solver with slab Poisson solver in GK IWL simulations & 2x LBO energy conservation#928

Open
manauref wants to merge 61 commits into main from poisson_perp_bias

Conversation

@manauref
Collaborator

@manauref manauref commented Jan 7, 2026

This goes with PR 93 of gkylcas

Problem

Currently the IWL simulations use the Poisson solver on planes (fem_poisson_deflated), which has been shown to break energy conservation. We have continued to do so because:
a) We didn't have the ability to bias the limiter corner in the slab Poisson solver (fem_poisson_perp).
b) We didn't have a working workflow that used the slab Poisson solver with twist-shift BCs (TS BCs).

Here we address both of these points, and deprecate the use of fem_poisson_deflated in favor of fem_poisson_perp.

Solution

Biasing

We implemented the ability to bias the limiter corner in fem_poisson_perp. Some info is in DR #898.

Unlike biasing in fem_poisson, where one can bias whole planes in the solver, in fem_poisson_perp we bias lines. That suffices because at the moment we just need to bias the lines at (x,z)=(x_LCFS,z_min) and (x,z)=(x_LCFS,z_max).

Tests with biasing were added to the fem_poisson_perp unit test.
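The line biasing described above can be illustrated as row replacement in the assembled FEM system. This is a minimal, hypothetical sketch in Python/SciPy of the standard Dirichlet-row technique, not Gkeyll's actual fem_poisson_perp implementation; `bias_nodes` and the 1D Laplacian stand-in are my own illustrative names.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def bias_nodes(A_csr, b, nodes, value):
    """Enforce phi[n] = value for every node n by row replacement."""
    A = A_csr.tolil()
    for n in nodes:
        A.rows[n] = [n]       # keep only the diagonal entry in this row
        A.data[n] = [1.0]
        b[n] = value          # the biased (Dirichlet) value goes in the RHS
    return A.tocsr(), b

# 1D Laplacian as a stand-in for the perpendicular Poisson LHS.
N = 8
A = sp.diags([-np.ones(N - 1), 2.0 * np.ones(N), -np.ones(N - 1)],
             [-1, 0, 1], format="csr")
b = np.zeros(N)

# Bias the two end nodes (loosely analogous to the lines at
# (x_LCFS, z_min) and (x_LCFS, z_max)).
A, b = bias_nodes(A, b, [0, N - 1], 0.5)
phi = spla.spsolve(A, b)  # with zero source, phi is the bias value everywhere
```

Row replacement breaks the symmetry of the matrix; symmetric elimination of the biased rows/columns is an alternative when a symmetric solver is required.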

Workflow with TS BCs

We implemented a new set of steps to make sure the potential is twist-shift periodic with this new slab solver. These are:

  1. Smooth the charge density, without BCs (fem_parproj).
  2. Solve the perpendicular Poisson problem (fem_poisson_perp).
  3. In the core, apply TS BC to phi at the lower z-boundary, and a ghost-from-skin-surf BC at the upper z-boundary.
  4. Smooth phi, with Dirichlet BCs that take the Dirichlet value from the ghost in the core, and with Dirichlet BCs that take the Dirichlet value from the skin in the SOL.
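The boundary fill in step 3 can be sketched on a (y,z) slice of phi. This is a hedged illustration only: the real twist-shift BC applies an x-dependent, generally non-integer y shift via interpolation, whereas here an integer `np.roll` stands in for it, and `fill_z_ghosts` is a hypothetical helper, not the app's API.

```python
import numpy as np

def fill_z_ghosts(phi, shift):
    """phi is a (y, z) slice with one ghost cell at each z end.
    Lower-z ghost: twist-shifted copy of the upper skin (TS BC).
    Upper-z ghost: straight copy of the upper skin (ghost-from-skin)."""
    upper_skin = phi[:, -2].copy()
    phi[:, 0] = np.roll(upper_skin, shift)  # stand-in for TS interpolation
    phi[:, -1] = upper_skin
    return phi

ny, nz = 6, 5  # 3 interior z cells plus 2 ghost cells
phi = np.arange(ny * nz, dtype=float).reshape(ny, nz)
phi = fill_z_ghosts(phi, shift=2)
```

Step 4 would then read the Dirichlet values for the z-smoothing from these ghost (core) or skin (SOL) cells.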

Commentary on alternatives

  • Regarding step 1: it's unclear whether BCs are needed here, or what they should be. In SOL simulations we don't apply BCs, and that's the option that makes the operator self-adjoint and preserves energy. Perhaps indirectly using TS BCs, as done in steps 3-4, would be worthwhile, but it doesn't seem necessary so far.
  • Note that steps 3-4 only apply the TS BC at the lower boundary, as is done in main now. We tried, both in main and in this branch, applying the TS BC only at the upper boundary or at both boundaries, but neither works.

Additional perks

We noticed that the LBO collision operator was not conserving energy in 2x. We fixed that in this branch.

This fix is not the final one, though. Through more testing I found scenarios (large gradients + normNu) in which the LBO still doesn't conserve energy. I think we need to compute nu-weighted moments to truly conserve energy, which we may pursue in another branch.
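A hedged sketch of why ν-weighted moments may be needed, in 1V with ν = ν(x) only (the notation here is illustrative, not necessarily what the code computes): when ν varies in space, the primitive moments u and v_t² must satisfy a constraint built from ν-weighted moments for the energy moment of the collision operator to vanish.

```latex
% Energy moment of a 1V Dougherty/LBO operator with velocity-independent nu:
C[f] = \partial_v\!\left[\nu\,(v-u)\,f + \nu\,v_t^2\,\partial_v f\right]
% Integrating by parts (boundary terms dropped):
\int v^2\, C[f]\,dv = 2\int \nu\, v_t^2\, f\,dv - 2\int \nu\, v\,(v-u)\, f\,dv
% With nu-weighted moments M_k^{\nu} = \int \nu\, v^k f\,dv, this vanishes iff
M_2^{\nu} - u\, M_1^{\nu} = v_t^2\, M_0^{\nu}
```

With constant ν this reduces to the usual moment constraint, which is why the problem only shows up with spatially varying ν (e.g. normNu).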

Tests

This branch is valgrind and compute-sanitizer clean.

Regression tests

Regression tests run, but they are expected to yield slightly different solutions since we are using a different algorithm now.

rt_gk_d3d_iwl_2x2v_p1

[screenshots of regression results]

rt_gk_tcv_iwl_adapt_source_2x2v_p1

[screenshots of regression results]

rt_gk_d3d_iwl_3x2v_p1

[screenshots of regression results]

rt_gk_tcv_iwl_adapt_source_3x2v_p1

[screenshots of regression results]

Production simulation

We ran a TCV case provided by @Antoinehoff in both main and this branch. Here are some snapshots.

[screenshots comparing snapshots from main and this branch]

manauref and others added 30 commits November 10, 2025 12:38
…on_perp. Unlike the implementation in fem_poisson, where one can specify an arbitrary (but grid-aligned) biased plane, here we specify biased lines, but for the moment it is restricted to specifying a line parallel to y (perpendicular to x and z). Not yet tested.
…estricted to a line perpendicular to x and z. Add a unit test, which suggests it seems to work (gkylcas 2c7f34696aa65e552572369d203c031edc2146b4).
… Add correct logic for parallel smoothing in IWL to the GK field app. Regression test results are qualitatively and quantitatively similar. A production run will give more confidence.
… a) the 1x and 2x volume kernels were out of date (I didn't change the maxima at all to generate these new kernels), b) When int/corn/surf geo was implemented we used the interior bmag_inv in the GK LBO when we should've used the corner bmag_inv. So here we remove geo_int.bmag_inv (and geo_int.bmag_inv_sq because it's not used anywhere) and add geo_corn.bmag_inv. Now the LBO conserves energy when B(x) and n(x),T(x).
…sed line in fem_poisson_perp; it wasn't working right when the biased line was placed on the domain boundary. It behaves as expected now, but the new logic isn't totally compatible with TS BCs. That's because the second smoothing changes phi at the biased line (it's no longer 0 if the biased value was 0), so the TS BC of phi will have a non-trivial effect there. Perhaps the thing to do is to apply the TS BC to phi before the second smoothing, and use the TS-ed phi as a BC in the smoothing.
…ter the perp solve, apply the TS BC to phi at both boundaries. b) Smooth phi along z with Dirichlet BCs both in the core and the SOL, otherwise the biased limiter value is not preserved (an alternative could be to change fem_parproj so it enforces a biased line value just at the limiter corner in the SOL smoothing).
…values are read from the ghost cell, and adapt the unit test which passes now. Really we should make this an option in the updater because we've gone back and forth on this a couple of times over the last 2 years. For now we simply change this option entirely, for testing (gkylcas 3607635a8b0e4da7d08516021bfe53b72cbd86ab).
…ost cells in the wrong place. We need to fill the ghost cells of the z-global array that goes into fem_parproj. It's currently a bit hacky here, but the simulation appears well behaved. We are presently only applying TS at the lower boundary of the core; if we apply TS on both sides, the solution develops weird gradients in z and the sim crashes in 2-3 microseconds.
…t, and I want to check that it gives similar results (I still don't understand why applying TS at both boundaries behaves so badly).
…ranch with: 1) flattening of the solution in the core ghost before applying TS (not yet tested), 2) some memory fixes in fem_poisson_perp biasing. But I'm seeing cases where biasing works at the bottom but not at the top, and want to switch to laptop for more rapid testing
…eld to previous state (without the flattening of the ghost before TS, we'll test that after).
…r the ghost cell (since we've gone back and forth between the two, and likely different operations will need one or the other in the future). Adapt unit test so it tests both options. Unit tests pass on CPU, not yet tested on GPU.
…pass on CPU and GPU. Unit test is compute-sanitizer clean.
…per core boundary. Now rt_gk_d3d_iwl_3x2v_p1 is valgrind clean.
…c app, where all core/sol ranges are created. Rename skin/ghost ranges in gyrokinetic and species apps to include local_ so they are consistent with their global counterpart. Checked the following regression tests and they all had unchanged results:

rt_gk_mirror_boltz_elc_1x2v_p1
rt_gk_multib_asdex_2x2v_p1
rt_gk_multib_step_2x2v_p1
rt_gk_neut_recycle_1x3v_p1
rt_gk_sheath_1x2v_p1
rt_gk_sheath_2x2v_p1
rt_gk_sheath_3x2v_p1
rt_gk_sheath_fluid_neut_1x2v_p1
rt_gk_tcv_iwl_adapt_source_2x2v_p1
rt_gk_tcv_iwl_adapt_source_3x2v_p1
@Antoinehoff
Collaborator

Antoinehoff commented Feb 10, 2026

Here are one-to-one comparisons of production-like simulations between the current state of main (5506bbe) and this branch (8fee1f5). The simulation setup is GK TCV IWL 3x2v (see rt_gk_tcv_iwl_adapt_source_3x2v_p1.c) at a coarse resolution (24x16x12x12x6), as presented in Hoffmann et al. 2026. The only parameter alterations are an increase of the input power, 0.5MW instead of ~0.25MW, to accelerate turbulence development, and a reduction of the collisionality scaling factor, $\nu_{frac}=0.5$ instead of $1.0$, to increase the time step.
The results from main and this branch are labelled main and poissonperp, respectively.

Edit: I updated the plots and analysis with longer runs, as a noticeable difference arises when approaching the quasi-steady state.

Note: The poisson perp run crashed at t~1320mus; the main run has not reached that time yet, so it is unclear if the crash is solely due to the solver. I've observed that reducing nu_frac can destabilize the simulation at longer times.

General numerics comments

The poissonperp run presents a very stable time step, ~4.6ns, against a fluctuating one for the main run, ~3 to 4.6ns. The poissonperp simulation is consequently faster, reaching ~400mus against ~330mus for main, in 6h on 4 GPUs.

Energy

It seems that poissonperp leads to a lower confined-energy state.
We see a strong reduction of the ion integrated Hamiltonian because of the global reduction of the potential value; this does not affect the total Hamiltonian. However, the electron Hamiltonian is lower in poissonperp than in main despite the more negative potential, which indicates that the electron thermal energy is strongly reduced in poissonperp.
main
image
poissonperp
image

Potential

Poisson perp saturates into lower potential values everywhere, increasing the ExB shear at the inner radial boundary and decreasing it at the outer one.
main
image
poissonperp
image

Density

Fairly similar, no comments.
main
image
poissonperp
image

Temperature

The electron quasi-steady-state temperature is markedly reduced with the Poisson perp solver, as expected from the energy analysis above. The SOL electron temperature matches the experiment better, but it is unclear for the core (recall that the initial condition of this simulation is ~ the experimental profile measurement). For the ions, a stronger temperature "well" is observed in the Poisson perp simulation.
main
image
poissonperp
image

Density fluctuations

We see that the fluctuations at the inner radial boundary are reduced in the poisson perp case, which may be due to the increased ExB shear in that region (the potential well is larger in poisson perp).
main
ne_fluct_main__n_frames_400_to_499
poissonperp
ne_fluct_poiperp__n_frames_400_to_499

Temperature fluctuations

It seems that the y-flows are stronger in the main simulation.
main
TeTi_fluct_main__Te_frames_400_to_499
poissonperp
TeTi_fluct_poiperp__Te_frames_400_to_499

Collaborator

@Antoinehoff Antoinehoff left a comment


Thanks for this great work!!

Collaborator

@Antoinehoff Antoinehoff left a comment


Longer runs show significant discrepancies in the quasi-steady state; whose result is right? The analysis above has been updated.

@manauref
Collaborator Author

Longer runs are a bit worrying: everything looks like it is losing energy. I think we need to see the steady state of this simulation. Also, I need help plotting the energy balance.

Can you edit the comment to add the equivalent plots from the simulation run in main? I can message you privately about the energy balance problem.

@Antoinehoff
Collaborator

Longer runs are a bit worrying: everything looks like it is losing energy. I think we need to see the steady state of this simulation. Also, I need help plotting the energy balance.

Can you edit the comment to add the equivalent plots from the simulation run in main? I can message you privately about the energy balance problem.

I just restarted it; it is not as advanced yet, but here is how the integrated moments have evolved so far. I'll update the previous comment tomorrow.

image

@manauref
Collaborator Author

@Antoinehoff one concern you had is that your energy conservation time trace looked worse in this branch, right? But the adaptive sources aim to conserve M2 energy, not H energy, right?

manauref added 14 commits March 15, 2026 16:18
…s of the LHS matrix on the GPU. The idea is to pack the nonzero values directly into the csr_val array in cudss_ops.cu, for which we'll need to pre-store the linear indices into this array. It'll take a bit of memory (exactly num_basis^2 * local_range.volume * sizeof(int)), but I couldn't think of a simpler way (gkylcas ae93c1e50ecfb936b4d5108866d3cd61b4b09771).
…_perp solvers so we can update the LHS matrix on the fly. Add a unit test of this, modifying existing tests to reduce code duplication. It passes the test on GPU. The CPU code is not ready; I see in superlu_ops.c that I tried to do this with the SuperLU solver in the past, but apparently it errored out, so I have to examine that again.
…so that we can update the LHS matrix without having to refactor (assuming the same sparsity pattern). Now that I understand SuperLU better, I see why it was erroring out when I first tried this a few years ago. We needed to use the expert driver (which the documentation doesn't make clear, but their source code gives hints of this). There may be some other memory savings/accelerations we can explore still, like not re-creating the B matrix, and setting the work memory only once. SuperLU and fem_poisson_perp unit tests pass, including new ones that update the A matrix.
…e are not destroying and creating the matrix every time. SuperLU and fem_poisson unit tests pass, but fem_poisson_perp occasionally hangs in the 2x periodic test, so I suspect there's a memory error somewhere. Unfortunately Perlmutter is down so I have no way of checking.
…ests would hang. It was just a bug in the unit test.
…once. @JunoRavin unlocked this via use of Opus. In particular it suggested a Glu array (which I understand now, looking at SuperLU's source code) and a particular calculation of the memory size needed (now in superlu_alloc_work_if_needed), whose origin I don't understand because the memory estimate I see in SuperLU's source code is different. But everything else I tried, including approaches following SuperLU docs and source-code comments, failed (gave erroneous answers, seg faulted, or was not valgrind clean), while this particular estimate of the memory needed works for all unit tests and is valgrind clean.
…g-fem_poisson_perp_bias

Merging "Update LHS matrix in fem_poisson_perp" into "Biasing in fem_poisson_perp"
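The commit notes above describe updating the LHS values in place while keeping the sparsity pattern fixed, using pre-stored linear indices into the CSR value array. A minimal sketch of that pattern in Python/SciPy (`build_index_map` is a hypothetical stand-in for the pre-stored index array, not the cudss_ops.cu or superlu_ops.c code):

```python
import numpy as np
import scipy.sparse as sp

def build_index_map(A_csr, entries):
    """For each (row, col) in `entries`, find its linear index into
    A_csr.data; fails if the entry is outside the sparsity pattern."""
    idx = []
    for r, c in entries:
        start, end = A_csr.indptr[r], A_csr.indptr[r + 1]
        k = start + int(np.searchsorted(A_csr.indices[start:end], c))
        assert k < end and A_csr.indices[k] == c, "not in sparsity pattern"
        idx.append(k)
    return np.asarray(idx)

A = sp.csr_matrix(np.array([[2.0, -1.0,  0.0],
                            [-1.0, 2.0, -1.0],
                            [ 0.0, -1.0, 2.0]]))
idx = build_index_map(A, [(0, 0), (1, 1), (2, 2)])

# Later re-assembly: overwrite values in place, no matrix rebuild needed.
A.data[idx] = 4.0
```

Precomputing the index map trades memory for speed, which is the same trade-off the commit message describes for the GPU path (num_basis^2 * local_range.volume integers).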