Add TGLFNNukaeaTransportModel by theo-brown · Pull Request #1436 · google-deepmind/torax

theo-brown · 2025-08-12T13:45:22Z

Implementing TGLF-NN-UKAEA in TORAX.

theo-brown · 2025-08-19T10:37:56Z

Initial tests, TGLFNN (solid) vs QLKNN (dashed)

All tests run by modifying test_iterhybrid_predictor_corrector.py.
Note TGLFNNukaea training data may not cover ITER domains, so take with a pinch of salt.
In TGLFNNukaea, VEXB_SHEAR input set to 0 - I wasn't sure what to do with this, as I believe the expression for ExB shear either needs an E-field or a rotation rate, neither of which are present in TORAX.
TGLFNNukaea solid lines, QLKNN dashed lines.

Te only

Te, Ti

Te, Ti, psi

Te, Ti, psi, ne

Conclusions

Something wrong in calculation of particle diffusivities and convectivities from fluxes?
Thermal diffusivities seem pretty good, behaviour broadly matching QLKNN

jcitrin · 2025-08-20T19:24:42Z

> In TGLFNNukaea, VEXB_SHEAR input set to 0 - I wasn't sure what to do with this, as I believe the expression for ExB shear either needs an E-field or a rotation rate, neither of which are present in TORAX.

This is fine for now. Rotation in TORAX is coming this Q3.

> TGLFNNukaea solid lines, QLKNN dashed lines.

No need for Te only. It doesn't add much beyond Te+Ti
I think that the TGLF "transport barrier" at rho<0.4 (especially for Te+Ti+psi) is due to the negative magnetic shear. In QLKNN we reduce that impact with the avoid_big_negative_s flag, which is True by default, and I think this exacerbates the difference. From comparisons to higher-fidelity modelling we know that QuaLiKiz underrepresents slab modes, which should still play a role at low/negative magnetic shear; hence this choice, which is also made in QuaLiKiz itself. So either we make something similar for TGLF and TGLFNN (if it's also relevant for TGLF), or we turn that flag off in QLKNN for the comparisons, or we just do the comparison for a case without negative magnetic shear.
Particle transport. OMG. Yeah that doesn't look right
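
The second option (turning the flag off in QLKNN for the comparison) might look like the following in the transport section of a TORAX config. This is a sketch only: avoid_big_negative_s is the flag named above, while the model_name key and the dictionary layout are assumptions about the config schema and should be checked against the TORAX documentation for the version in use.

```python
# Hypothetical transport section for the QLKNN side of the comparison.
# Only `avoid_big_negative_s` comes from the discussion above; the
# `model_name` key and the dict layout are assumed.
qlknn_transport_config = {
    'model_name': 'qlknn',
    # True by default (per the comment above); disabled for the comparison.
    'avoid_big_negative_s': False,
}
```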

theo-brown · 2025-09-01T22:49:39Z

We found an issue with the version of the NN we were using. Things are making a bit more sense now, I believe. chi profiles looking broadly good. Will look at De, Ve / particle fluxes when I next have a chance - as the NN outputs main ion flux I think there just needs to be some reweighting to get the electron flux.
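
One possible form of that reweighting, as a sketch rather than what this PR does: ambipolarity requires Γ_e = Σ_j Z_j Γ_j over the ion species. If we additionally assume that every ion species carries the same flux per particle as the main ion (an assumption made here purely for illustration), quasineutrality collapses the sum to Γ_e = Γ_i n_e / n_i. The function and argument names are hypothetical.

```python
def electron_flux_from_main_ion_flux(gamma_i, n_e, n_i):
    """Estimate the electron particle flux from the main-ion particle flux.

    Assumptions (illustration only, not taken from the PR):
      * ambipolar fluxes: gamma_e = sum_j Z_j * gamma_j over ion species,
      * equal flux per particle for all ions: gamma_j / n_j = gamma_i / n_i,
      * quasineutrality: n_e = sum_j Z_j * n_j.
    Together these give gamma_e = gamma_i * n_e / n_i.
    """
    return gamma_i * n_e / n_i
```

For a pure plasma with Z_i = 1 (n_e = n_i) this reduces to Γ_e = Γ_i; with impurities present the electron flux is scaled up by the dilution factor n_e / n_i.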

theo-brown · 2025-09-18T13:29:29Z

iterhybrid_predictor_corrector, with multimachine TGLFNNukaea vs QLKNN.

Getting reasonable matches for χₑ, χᵢ in ρ > 0.5.

Mismatch for 0.2 < ρ < 0.5 looks like difference in turbulent modes.

Mismatch for ρ < 0.2 not particularly relevant due to patch transport?

Any thoughts @lorenzozanisi @fcasson @jcitrin?

t=0

t=5

theo-brown · 2025-10-01T10:28:15Z

Progress update: been working on a toy case with @fcasson, we've now fixed various problems with the output normalisations and conversion to chis.

Next step is to make sure we're doing the right thing with particle transport.

Toy tokamak, initial condition, comparison between JETTO and TORAX

theo-brown · 2025-10-03T16:11:54Z

Comparison with JETTO, toy case, including particle transport and pellet source:

Initial condition: shows exact match in chi profiles

Final condition: shows close match in T, n. Mismatch in chi_e is most likely due to solver differences (e.g. numerical temporal smoothing/damping due to theta method)

pyproject.toml

theo-brown · 2025-10-03T16:18:55Z

torax/_src/transport_model/tglf_based_transport_model.py

+    )
+    chi_i = -P_i / (
+        core_profiles.n_i.face_value() * dT_i_drhon * geo.g1_over_vpr_face
+    )


The denormalisation of TGLF is slightly different to QuaLiKiz.

In particular, TGLF normalises everything with respect to the electron temperature and density, which means that chi_i can't be denormalised in the same way.

Due to the difficulties I had trying to track down bugs in this interface, I would strongly advocate for this sort of layout - denormalisation, conversion to power, conversion to chi - rather than what is done in QLKNN.

However, if there are any speed improvements that can be made by changing what is computed, they definitely take priority.
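
The layout advocated above can be sketched as three small, separately testable steps. This is an illustration under stated assumptions, not the code in this PR: the gyroBohm unit Q_GB = n_e T_e c_s (ρ_s/a)² is the usual TGLF convention but its exact form and reference length are assumed here, `surface_area` stands in for whatever geometric factor the real code uses to turn a flux into a power, and all function names are hypothetical. Only the final expression mirrors the chi_i line in the diff.

```python
def denormalise_heat_flux(q_gb_units, n_e, T_e, c_s, rho_s, a):
    """Step 1: gyroBohm-normalised heat flux -> SI heat flux [W/m^2].

    TGLF normalises every species' flux with the *electron* density and
    temperature, so the same factor is used for ions and electrons.
    T_e is in joules; the form of the gyroBohm unit is an assumption.
    """
    q_gyrobohm = n_e * T_e * c_s * (rho_s / a) ** 2
    return q_gb_units * q_gyrobohm


def heat_flux_to_power(q_si, surface_area):
    """Step 2: SI heat flux -> power through the flux surface [W].

    `surface_area` is a placeholder for the real geometric factor.
    """
    return q_si * surface_area


def power_to_chi(power, n, dT_drhon, g1_over_vpr):
    """Step 3: power -> diffusivity, using the species' own n and dT/drho."""
    return -power / (n * dT_drhon * g1_over_vpr)


def ion_chi(q_i_gb, n_e, T_e, c_s, rho_s, a, surface_area,
            n_i, dT_i_drhon, g1_over_vpr):
    """Chain the steps: electron normalisation in, ion quantities out."""
    q_i = denormalise_heat_flux(q_i_gb, n_e, T_e, c_s, rho_s, a)
    P_i = heat_flux_to_power(q_i, surface_area)
    return power_to_chi(P_i, n_i, dT_i_drhon, g1_over_vpr)
```

Keeping the steps separate means each intermediate (SI flux, power, chi) can be compared against a reference code on its own, which is where the normalisation bugs tend to hide.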

theo-brown · 2025-10-03T16:19:40Z

torax/_src/transport_model/tglf_based_transport_model.py

+    )
+    # For stability, we also set purely diffusive transport at some minimum
+    # threshold of the temperature gradient
+    D_eff_mask &= abs(tglf_inputs.lref_over_lne) >= transport.An_min


The D/V splitting could plausibly be reused from QLK, I just found it easier writing it out when debugging.
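
For reference, an effective D/V splitting of this kind can be sketched as below. Assumptions: scalar inputs, geometric factors and the rho normalisation dropped, and a flux modelled as Γ = -D dn/dr + V n. The sign test on flux and gradient is an assumption modelled on the QLK-style approach; only the |lref_over_lne| >= An_min threshold is taken directly from the diff. All names are hypothetical.

```python
def split_particle_flux(gamma, n, dn_dr, lref_over_ln, an_min):
    """Assign a particle flux to either pure diffusion or pure convection.

    Model: gamma = -D * dn_dr + V * n (geometry factors omitted).
    Returns (D_eff, V_eff); exactly one of them is non-zero.
    """
    # Diffusion only makes sense if the flux runs down the density gradient
    # (lref_over_ln > 0 corresponds to dn_dr < 0).
    down_gradient = (gamma >= 0 and lref_over_ln >= 0) or (
        gamma < 0 and lref_over_ln < 0
    )
    # Threshold from the diff: below it, fall back to convection so that
    # D = -gamma / dn_dr cannot blow up as dn_dr -> 0.
    use_diffusion = down_gradient and abs(lref_over_ln) >= an_min

    if use_diffusion:
        return -gamma / dn_dr, 0.0
    return 0.0, gamma / n
```

With this split, D_eff is never negative and up-gradient or near-flat cases are carried entirely by V_eff.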

torax/_src/transport_model/tglfnn_ukaea_transport_model.py

torax/tests/test_data/test_iterhybrid_predictor_corrector_tglfnn_ukaea.py

theo-brown

Summarised some open questions / details. @jcitrin this is now ready for review :)

Thanks @fcasson for all your in-depth support in getting this over the line!

Some of these questions (e.g. memoisation and weights loading, which is currently the cause of the failing test) may also be worth revisiting with @hamelphi on fusion-surrogates. Currently fusion_surrogates[tglfnnukaea] requires installing PyTorch (!) because our weights are distributed as .pt or .onnx, and at the start of this process it was deemed preferable to reimplement the network in JAX (which means loading weights from PyTorch .pt) rather than rely on ONNX loaders that may not have a reliable maintenance pathway.

theo-brown · 2025-10-03T16:31:56Z

torax/tests/test_data/test_iterhybrid_predictor_corrector_tglfnn_ukaea.nc

Comparison with the base test

Yes that looks roughly as expected.

In general we see more impact of trapped electron drive in TGLF compared to QuaLiKiz. In lower-density reactor-relevant scenarios like the hybrid scenario, this will be more prevalent.

Intuitively, what this means is that while the case is ITG-dominated (see chi_i/chi_e > 1), the increased trapped electron drive should increase the density peaking as we go closer to the ITG-TEM boundary. This is indeed what is seen in TGLF.

The increased density peaking should slightly lower the ITG turbulence threshold, resulting in the lower Ti. The higher density due to peaking should result in more Ti-Te coupling, reducing Te/Ti, which is also what we see.

theo-brown · 2025-10-17T09:48:20Z

As of this morning, a proposed new method for distributing weights via a pip-installable package is under review in internal UKAEA repos. This would massively simplify loading here and in fusion_surrogates.

pyproject.toml

theo-brown · 2025-10-19T12:31:04Z

@jcitrin @hamelphi updated to reflect google-deepmind/fusion_surrogates#32. This fixes the problems with Git LFS, and also simplifies the dependency tree.

Tests are failing until fusion_surrogates PR is merged and new version is published to PyPI. Tests pass on a local install, with the version of fusion surrogates from google-deepmind/fusion_surrogates#32.
Lmk when it's in and I'll manually retrigger the tests.

Changes to TGLFBasedTransportModel:
- Override _make_core_transport with TGLF-specific calculation
- Correct output normalisations
- Correct particle transport

Other changes:
- Update dependencies

jcitrin

Will start the internal review now.

Could we have a followup PR with documentation?

theo-brown · 2025-10-23T13:47:42Z

Forgot about docs! Added now.

jcitrin · 2025-10-23T14:01:04Z

Looking at the pyproject.toml: it looks like tglfnnukaea is not installed by default. Wouldn't it be easier for users to have it installed by default? It may be a workhorse model, so users may be surprised if they don't have it from the get-go.

hamelphi · 2025-10-23T15:04:18Z

> Looking at the pyproject.toml: it looks like tglfnnukaea is not installed by default. Wouldn't it be easier for users to have it installed by default? It may be a workhorse model, so users may be surprised if they don't have it from the get-go.

Sounds good to me. We should keep in mind that this will add 60 MB to the deps. That said, the current default install is around 1 GB (the largest chunk being jaxlib at 300 MB), so it is relatively small.

We could also make it default for fusion_surrogates.

theo-brown · 2025-10-23T15:18:42Z

> Wouldn't it be easier for users to have it installed by default? It may be a workhorse model

Yep, not averse to this!

Before we do, it might be good to do some speed/performance testing, because I've found that some change over the last few commits has caused a big drop in performance. Could we have it as an optional dependency for the time being until it's a bit more stress-tested?

jcitrin · 2025-10-23T16:29:00Z

> Yep, not averse to this! Before we do, it might be good to do some speed/performance testing, because I've found that some change over the last few commits has caused a big drop in performance. Could we have it as an optional dependency for the time being until it's a bit more stress-tested?

So shall we keep it as is for now, and then in a future PR make it default once the performance issues are solved?

theo-brown · 2025-10-23T16:32:07Z

Sounds good to me!

-- f6eaa06 by Theo Brown <7982453+theo-brown@users.noreply.github.com>:

Add TGLFNNukaeaTransportModel

Changes to TGLFBasedTransportModel:
- Override _make_core_transport with TGLF-specific calculation
- Correct output normalisations
- Correct particle transport

Other changes:
- Update dependencies

-- a07351b by Theo Brown <7982453+theo-brown@users.noreply.github.com>:

Switch to using fusion_surrogates==0.4.3 and tglfnn_ukaea Python package

-- 3ffbc0f by Theo Brown <7982453+theo-brown@users.noreply.github.com>:

Add documentation

FUTURE_COPYBARA_INTEGRATE_REVIEW=#1436 from google-deepmind:tglfnn-ukaea 3ffbc0f
PiperOrigin-RevId: 823087120

theo-brown · 2025-10-23T21:02:27Z

Timing comparison:

hamelphi · 2025-10-24T16:15:02Z

Thanks for the benchmark analysis.

I am not surprised that QLKNN_7_11 is faster than TGLFNN since it is a much smaller model. The specific model was picked with inference time as one of the metrics in mind. That also explains why GPUs do not help in that case, since I suppose the overhead is large compared to the inference time. GPUs could become more relevant for QLKNN with large batch inference.

On the flip side, I do expect TGLFNN predictions to be more accurate, or at least to have a closer correlation with the base model.
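
The overhead argument can be checked with a small harness that times the first call separately from later calls. This is a generic sketch, not the benchmark used for the plot above; `time_calls` is a hypothetical helper, and the callable passed in is assumed to wrap one transport-model evaluation.

```python
import time


def time_calls(fn, *args, n_repeats=100):
    """Return (first_call_seconds, mean_later_call_seconds) for fn(*args).

    The first call is timed on its own because, for a jitted function, it
    includes tracing and compilation. For JAX outputs, `fn` should block
    until the result is ready (e.g. by calling `.block_until_ready()` on
    the returned array), since dispatch is asynchronous.
    """
    start = time.perf_counter()
    fn(*args)
    first_call = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(n_repeats):
        fn(*args)
    mean_later_call = (time.perf_counter() - start) / n_repeats
    return first_call, mean_later_call
```

If the mean later-call time barely changes between CPU and GPU, or with network size, fixed per-call overhead rather than inference is likely the dominant cost.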


jcitrin · 2025-10-24T18:44:40Z

Congratulations @theo-brown, @lorenzozanisi! It's a great milestone having this in TORAX. Looking forward to using it! Rotation will come soon.

@fcasson

lorenzozanisi · 2025-11-03T10:38:49Z

Thanks @theo-brown @hamelphi @fcasson and @jcitrin for pulling this together - excited to see the NNs in action within TORAX!

theo-brown force-pushed the tglfnn-ukaea branch 3 times, most recently from 9685829 to dba2acb on August 19, 2025 10:19

theo-brown mentioned this pull request Aug 19, 2025

Add TGLFBasedTransportModel #1417

Merged

theo-brown force-pushed the tglfnn-ukaea branch 3 times, most recently from 0c6356e to 7199b4e on August 27, 2025 10:52

theo-brown force-pushed the tglfnn-ukaea branch from 81ed5ee to b32109c on September 2, 2025 13:02

theo-brown force-pushed the tglfnn-ukaea branch from b32109c to 03f8582 on September 19, 2025 13:02

theo-brown force-pushed the tglfnn-ukaea branch 4 times, most recently from 0b8a368 to ee6cb6d on October 3, 2025 16:11

theo-brown marked this pull request as ready for review October 3, 2025 16:11