Research code for solving the forced advection-diffusion inverse problem for polar sea ice concentration.
On GitHub, this document renders in a way that is neither aesthetically pleasing nor easy to read.
The author's suggestion is to open this file in VSCode, with a Markdown renderer extension installed, or in another Markdown viewer, such as Obsidian.
While it is my hope that this summary of the methods and current work can always be complete, correct, and concise, this page also serves as a place to record ideas and thoughts on how to proceed, which, by their nature, can virtually never satisfy all of those desirable qualities.
If you spot any egregious errors, please open an issue with the details of your questions, or a pull request with the correction.
I've attempted to maintain consistent notation throughout. However, certain notations are more or less useful in different places, and I either lacked the foresight to use one notation everywhere or found it unnecessarily cumbersome to do so. In these places, I have used the notation I thought would minimize ambiguity about the topics discussed.
We're trying to understand ice flows in Earth's polar regions. NOAA and NASA publish large sea ice concentration datasets dating back nearly 50 years.
From observation, we believe the relevant physics to be governed by a forced advection-diffusion equation, like

$$\partial_t u + \nabla \cdot (\vec{v} u) = \nabla \cdot (D \nabla u) + f,$$

where space $\vec{x} \in \Omega \subset \mathbb{R}^2$, time $t \in [0, T]$, $u(\vec{x}, t)$ is the sea ice concentration, $\vec{v}(\vec{x})$ is a velocity field, $D(\vec{x})$ is a (possibly anisotropic) diffusivity tensor, and $f(\vec{x})$ is a forcing term.
The problem may be called an anisotropic forced advection-diffusion problem.
This is a particularly notation-heavy discussion, so let's establish some quantities. This is going to seem exhaustive (and/or exhausting), but the point here is to establish a pattern so we don't have to worry (too much) about the quantities considered.
Suppose we have the PDE,

$$u_t + \mathcal{N}[u; \lambda] = 0, \quad \vec{x} \in \Omega, \; t \in [0, T].$$

To facilitate discussion of this PDE, we will adopt the following general notation: let some index set $I$ enumerate the spacetime coordinates $\{(\vec{x}_i, t_i)\}_{i \in I}$ at which we have samples, and, generally for a subset $J \subseteq I$, write $\{(\vec{x}_j, t_j)\}_{j \in J}$ for the corresponding coordinates. For any quantity ($u$, say), write $u_i := u(\vec{x}_i, t_i)$ for its value at the $i$-th coordinate. Finally, there are at least two important subsets of $I$: the indices of coordinates on the boundary $\partial\Omega$ and the indices of coordinates in the interior of $\Omega$. These will be useful when discussing boundary conditions and the enforcement of physics on the interior of the domain $\Omega$.
For now, that's all that comes to mind for important quantities and notation, so we shall proceed with our discussion of the PINN framework.
For inverse problems, PINNs are generally constructed as solution operators for a PDE with unknown parameterization. In particular, we learn both the solution $u$ and the unknown parameters $\lambda$ simultaneously.
Suppose we have the problem,

$$u_t + \mathcal{N}[u; \lambda] = 0, \quad \vec{x} \in \Omega, \; t \in [0, T],$$

where $\mathcal{N}[\cdot; \lambda]$ is a (possibly nonlinear) differential operator with parameterization $\lambda$. Suppose we have measurements $\{u_i\}_{i \in I}$ at coordinates $\{(\vec{x}_i, t_i)\}_{i \in I}$. We approximate $u$ by a neural network $u_\theta(\vec{x}, t)$ with weights $\theta$. Conditioned on the complexity of the network and not subject to additional assumptions (e.g., about the PDE or solution), the PINN is trained by minimizing

$$\mathcal{L}(\theta, \lambda) = \frac{1}{|I|} \sum_{i \in I} \left| u_\theta(\vec{x}_i, t_i) - u_i \right|^2 + \frac{1}{|I|} \sum_{i \in I} \left| f_\theta(\vec{x}_i, t_i) \right|^2,$$

where $f_\theta := \partial_t u_\theta + \mathcal{N}[u_\theta; \lambda]$ is the PDE residual. A crucial realization is that derivatives of $u_\theta$ with respect to its inputs can be computed exactly by automatic differentiation, so the residual $f_\theta$ is available at any spacetime coordinate without a discretization. The loss is minimized jointly over the weights $\theta$ and the parameters $\lambda$, yielding both a solution surrogate and an estimate of the unknown parameterization.
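To make this concrete, below is a minimal sketch in PyTorch of the joint loss for a toy 1D problem $u_t + \lambda u_x = 0$; the architecture, data, and names are illustrative stand-ins, not this repository's API.

```python
import torch

torch.manual_seed(0)

# u_theta(x, t): R^2 -> R, a small fully-connected surrogate for the solution
net = torch.nn.Sequential(
    torch.nn.Linear(2, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
lam = torch.nn.Parameter(torch.tensor(0.5))  # unknown PDE parameter lambda

coords = torch.rand(128, 2, requires_grad=True)           # sampled (x, t)
u_data = torch.sin(coords[:, 0] - coords[:, 1]).detach()  # stand-in measurements

u = net(coords).squeeze(-1)
# Automatic differentiation gives exact derivatives of u_theta w.r.t. inputs.
grads = torch.autograd.grad(u.sum(), coords, create_graph=True)[0]
u_x, u_t = grads[:, 0], grads[:, 1]

residual = u_t + lam * u_x  # PDE residual f_theta
loss = ((u - u_data) ** 2).mean() + (residual ** 2).mean()
loss.backward()  # gradients reach both the weights theta and lambda
```

In practice, one hands `list(net.parameters()) + [lam]` to an optimizer and iterates.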
The above is significantly based on Raissi et al. 2019; see that paper for more details. The Wikipedia page on PINNs also serves as an excellent resource and an index of additional references.
Raissi et al. 2019, in addition to other interesting discussion, describes how to apply the PINN framework to problems with data collected at sparse time steps. In particular, this can be thought of as a specialization of the preceding discussion on PINNs in the general inverse problem context. The following discussion covers section 4.2 of Raissi et al. 2019.
Suppose we have some collection of samples at a time $t^n$ and another collection at a later time $t^{n+1} = t^n + \Delta t$. Note that, of course, this requires at least two snapshots separated by a known time step $\Delta t$. Accordingly, we have coordinates and measurements $\{(\vec{x}^n_i, u^n_i)\}$ and $\{(\vec{x}^{n+1}_i, u^{n+1}_i)\}$, where the spatial coordinates of the two snapshots need not coincide.
Let the PDE in consideration be,

$$u_t + \mathcal{N}[u; \lambda] = 0.$$

Raissi develops this framework assuming a spatiotemporally constant differential parameterization $\lambda$.
Recall the general form of a Runge-Kutta numerical integration scheme with $q$ stages: momentarily, assume that we have knowledge of the solution $u^n$ at time $t^n$; then

$$u^{n+c_i} = u^n - \Delta t \sum_{j=1}^{q} a_{ij} \, \mathcal{N}[u^{n+c_j}; \lambda], \quad i = 1, \ldots, q,$$

$$u^{n+1} = u^n - \Delta t \sum_{j=1}^{q} b_j \, \mathcal{N}[u^{n+c_j}; \lambda].$$

This is equivalent to,

$$u^n = u^{n+c_i} + \Delta t \sum_{j=1}^{q} a_{ij} \, \mathcal{N}[u^{n+c_j}; \lambda], \quad i = 1, \ldots, q,$$

$$u^n = u^{n+1} + \Delta t \sum_{j=1}^{q} b_j \, \mathcal{N}[u^{n+c_j}; \lambda].$$

Raissi demonstrates how to rearrange the previous equations to estimate the solution at both temporal endpoints from the intermediate stages,

$$u^n_i := u^{n+c_i} + \Delta t \sum_{j=1}^{q} a_{ij} \, \mathcal{N}[u^{n+c_j}; \lambda],$$

$$u^{n+1}_i := u^{n+c_i} + \Delta t \sum_{j=1}^{q} (a_{ij} - b_j) \, \mathcal{N}[u^{n+c_j}; \lambda],$$

for $i = 1, \ldots, q$.
We construct a PINN $\vec{x} \mapsto \left( u^{n+c_1}(\vec{x}), \ldots, u^{n+c_q}(\vec{x}) \right)$ whose $q$ outputs are the Runge-Kutta stage values at the queried spatial coordinate. Specifically, our PINN has yielded estimates of the solution at both endpoints, $u^n_i$ and $u^{n+1}_i$, via the rearranged relations above. Finally then, we compute a loss as some suitable norm of the errors of these estimates for the solution at each measured coordinate in the two snapshots.
Finally, it is worthwhile to comment on the choice of $q$, the number of Runge-Kutta stages: because the stages are predicted by the network rather than solved for, implicit schemes with large $q$ are cheap to use, and Raissi reports schemes with as many as 100 stages.

A straightforward but helpful reminder is that the data on which one wishes to apply this method must satisfy the CFL condition for the PDE being studied (and note that this has no relation to the choice of $q$).
As a simple example of why it is necessary to assume $\lambda$ constant in time, consider (for concreteness) the scalar decay problem $u_t = -\lambda(t) u$ on an interval $[t^n, t^{n+1}]$. Recall the solution to this problem is given by,

$$u(t) = u(t^n) \exp\left( -\int_{t^n}^{t} \lambda(\tau) \, d\tau \right).$$

Assume we observe the solution only at the endpoints, that is,

for $t \in \{t^n, t^{n+1}\}$. Exactly, this produces,

$$u(t^n) = u(t^n)$$

at endpoint $t^n$, and

$$u(t^{n+1}) = u(t^n) \exp\left( -\int_{t^n}^{t^{n+1}} \lambda(\tau) \, d\tau \right)$$

at endpoint $t^{n+1}$. Clearly, then, allowing temporally varying $\lambda$ destroys uniqueness: any $\lambda(\tau)$ with the same integral over $[t^n, t^{n+1}]$ reproduces the endpoint data exactly, so the data cannot distinguish between infinitely many parameterizations. There are some assumptions that can mitigate this problem. For example, if we assume $\lambda$ is constant on the interval, the integral determines it uniquely.
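A quick numerical illustration of this non-uniqueness (the decay problem and the two rates below are my own illustrative choices): two different time-varying rates with equal integral over the interval produce identical endpoint data.

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid

# Two decay rates on [0, 1] with the same integral (both integrate to 1):
t = np.linspace(0.0, 1.0, 1001)
lam1 = np.ones_like(t)  # constant rate
lam2 = 2.0 * t          # time-varying rate

# u(t) = u(0) * exp(-int_0^t lambda(tau) dtau), with u(0) = 1
u1 = np.exp(-cumulative_trapezoid(lam1, t, initial=0.0))
u2 = np.exp(-cumulative_trapezoid(lam2, t, initial=0.0))

print(u1[-1], u2[-1])          # identical endpoint values, both exp(-1)
print(np.abs(u1 - u2).max())   # yet the trajectories differ in the interior
```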
This section can be thought of as a generalization of the ideas in the previous section. Herein I discuss how I've applied the ideas from Raissi et al. 2019 to the problem and data of forced advection-diffusion of sea ice.
Recall that we wish to extract the parameterization of the PDE,

$$\partial_t u + \nabla \cdot (\vec{v} u) = \nabla \cdot (D \nabla u) + f,$$

where space $\vec{x} \in \Omega \subset \mathbb{R}^2$ and time $t \in [0, T]$.
NOAA/NSIDC provides sea ice concentration data on a rectangular spacetime grid: daily snapshots on a 25 km polar stereographic spatial grid.
In order to establish uniqueness of the parameterization, we assume the parameters $\vec{v}$, $D$, and $f$ are constant in time over each interval $[t^n, t^{n+1}]$ (though they may vary in space).
We construct a PINN $\vec{x} \mapsto \left( u^{n+c_1}, \ldots, u^{n+c_q}, \vec{v}, D, f \right)$, where it is understood that the network predicts both the Runge-Kutta stage values of the solution and the local PDE parameters at the queried spatial coordinate. From these estimates of parameters and intermediate Runge-Kutta stages, we predict the solution at the temporal endpoints of the interval, that is, the quantities $u^n$ and $u^{n+1}$, via the rearranged Runge-Kutta relations of the previous section.
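For concreteness, a toy sketch of such an output head follows; the hidden width, stage count, and parameter count are assumptions for illustration, not the repository's actual architecture.

```python
import torch

q = 32         # Runge-Kutta stages (illustrative)
n_params = 7   # illustrative: 2 velocity + 4 diffusivity tensor + 1 forcing
head = torch.nn.Linear(64, q + n_params)

h = torch.randn(10, 64)            # hidden features for 10 queried coordinates
out = head(h)
stages, params = out[:, :q], out[:, q:]
print(stages.shape, params.shape)  # (10, 32) stage values, (10, 7) parameters
```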
In practice, we add additional terms to our loss to obtain desirable physical properties of the solution and to assert well-posedness of this problem, encoding conditions necessary to ensure a unique solution.
Perhaps the most significant innovation of the method I present here is conditioning the network against the known solution. To my knowledge, the analytic tools to quantitatively understand the need for this technique do not exist; instead, I will try to qualitatively motivate this technique, novel in the PINN framework: our domain is approximately $448 \times 304$ grid cells in space (the 25 km north polar stereographic grid), far too many values to feed wholesale into a network of manageable size.
In fact, we can do better: by conditioning the network on Gaussian kernels at nodes of a stencil local to a given spacetime coordinate, we further reduce the size of the network while maintaining locality and specificity of the information provided to the deep (hidden) layers of the network. This is in contrast to the typical architecture of transformer-type networks (as ours may be classified), which typically use convolutional layers, smearing away the high-frequency information on which our solution and parameters depend.
For these reasons, I present a novel method (albeit familiar in other domains) to encode the solution in a computationally lightweight manner: convolution of the solution against a stencil of Gaussian kernels about the spacetime coordinate in study.
The exact method is best described simply by example.
We are interested in the parameters of our PDE on an interval $[t^n, t^{n+1}]$ between two daily snapshots. A quick search yields that the maximum velocity of detached (mobile) ice in the ocean—that is, an iceberg—is approximately 4 km/h, and, in general, ice moves much slower. With 25 km grid spacing, as in the NOAA/NSIDC g02202 data, we estimate the domain of dependence of our PDE to be enclosed in a five-by-five grid stencil in space. We'll refer to the side length of this stencil as $s$ (so here, $s = 5$). Altogether, we have constructed an $s \times s$ spatial stencil, centered on the coordinate in study, that encloses the domain of dependence over the interval.
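The arithmetic behind that estimate, as a quick check (the speed bound and the daily snapshot interval are those quoted above):

```python
v_max_km_per_h = 4.0  # rough upper bound: drifting iceberg speed
dt_hours = 24.0       # one day between snapshots
dx_km = 25.0          # g02202 grid spacing

max_drift_cells = v_max_km_per_h * dt_hours / dx_km
print(max_drift_cells)  # ~3.84 cells in the extreme case; since ice is
                        # generally much slower, +/- 2 cells (5x5) suffices
```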
We construct a set of convolutions of the solution against Gaussian kernels centered at the nodes of the stencil,

$$c_{ij} = \int_{\Omega} u(\vec{x}, t^n) \, \exp\left( -\frac{\lVert \vec{x} - \vec{x}_{ij} \rVert^2}{2 \sigma^2} \right) d\vec{x},$$

where $\vec{x}_{ij}$ is the spatial coordinate of stencil node $(i, j)$ and $\sigma$ controls the width of the kernel.
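A minimal sketch of computing these features on one snapshot (the function name, shapes, and kernel width are assumptions for illustration, not this repository's API):

```python
import numpy as np

def stencil_features(u, i0, j0, s=5, sigma=1.0):
    """Gaussian-weighted averages of u about grid point (i0, j0).

    u     : 2D array, one snapshot of the solution on the grid
    s     : stencil side length, in nodes
    sigma : kernel width, in grid cells
    """
    ii, jj = np.meshgrid(np.arange(u.shape[0]), np.arange(u.shape[1]),
                         indexing="ij")
    r = s // 2
    feats = np.empty((s, s))
    for a in range(s):
        for b in range(s):
            ci, cj = i0 + a - r, j0 + b - r  # stencil node center
            w = np.exp(-((ii - ci) ** 2 + (jj - cj) ** 2) / (2 * sigma ** 2))
            feats[a, b] = (w * u).sum() / w.sum()  # normalized convolution
    return feats

u = np.random.rand(448, 304)                 # e.g., one north-grid snapshot
print(stencil_features(u, 200, 150).shape)   # (5, 5) features for the network
```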
An obvious extension (though, as of May 2025, I have yet to implement this) is to convolve in time as well, as,

$$c_{ijk} = \int_{t^n}^{t^{n+1}} \int_{\Omega} u(\vec{x}, t) \, \exp\left( -\frac{\lVert \vec{x} - \vec{x}_{ij} \rVert^2}{2 \sigma^2} - \frac{(t - t_k)^2}{2 \sigma_t^2} \right) d\vec{x} \, dt,$$

where $t_k$ is the temporal coordinate of stencil node $k$ and $\sigma_t$ controls the temporal width of the kernel.
In practice, we clip the exponentials used in the computation of each of these convolutions in order to avoid loading data that only minorly contributes to the solution. This is substantiated by observing that the Gaussian weight decays rapidly with distance from the node, so data beyond a few standard deviations contribute negligibly to each convolution.
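For instance, with an illustrative clipping radius of four standard deviations:

```python
import numpy as np

sigma, k = 1.0, 4.0                # kernel width and clipping radius (in sigmas)
window = int(np.ceil(k * sigma))   # only load cells within this many cells
print(np.exp(-k ** 2 / 2.0))       # weight at the clipping radius: ~3.4e-4
```

Restricting each convolution to this window changes the features negligibly while avoiding reads of the full grid.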
The set of convolutions then serves as the network's encoding of the solution local to the coordinate in study, replacing the raw solution field as input to the deep layers.
There is no standard for diagrams (except possibly Microsoft Visio, which is itself ideologically- and cost-prohibitive), but, for this project, I am using draw.io for diagrams.
There exist a few implementations of PINNs in various ML toolkits.
- Raissi has the original TensorFlow implementation on GitHub that was used to produce the results for the 2019 paper.
- An updated repository built with PyTorch exists and probably provides much of the functionality I'm going to reimplement here.
We're trying something related but different enough to warrant our own implementation.
The author suggests users of this software use Anaconda/conda to maintain the Python environment in which this code operates, but use whatever works and what you know best.
The following commands should Just Work™ with conda to create an environment called `neuralpde` in which this code will run. Begin by creating the environment,

```bash
conda create -n neuralpde python=3.11 ipython scipy numpy matplotlib jupyter jupyterlab tqdm basemap basemap-data-hires netcdf4 -c conda-forge
```

You must also install a suitable version of PyTorch. This was previously possible with conda, but the PyTorch collaboration ceased its official support for the platform, so PyPI/pip is the only remaining convenient way of doing so.
Be sure to activate the new environment you just created. In this tutorial, that is probably with the command `conda activate neuralpde`.
Go to pytorch.org, scroll down, and select a suitable version of PyTorch for your machine. With CUDA, the command you need is most likely,

```bash
conda activate neuralpde
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
```

Without CUDA, it is most likely,

```bash
conda activate neuralpde
pip install torch torchvision torchaudio
```

If you're not using conda, somehow install these packages by whatever means your environment provides:
- ipython
- scipy
- numpy
- matplotlib
- jupyter
- jupyterlab
- tqdm
- basemap
- basemap-data-hires
- netcdf4
- torch
- torchvision
- torchaudio
where you install the most appropriate versions of the torch* packages for your system.
See this for more information on NOAA/NSIDC sea ice concentration data, version 4. In particular, the user manual is of significant aid.
Download data files from this link (note that this link can also be found from the NOAA/NSIDC landing page, above). A tool like wget can be of particular aid. From the project root, run something like the following commands:

```bash
mkdir -p data/V4/
cd data/V4/
wget --recursive --no-parent --no-host-directories --cut-dirs 4 --timestamping --execute robots=off --accept "*daily*.nc" https://noaadata.apps.nsidc.org/NOAA/G02202_V4/north/aggregate/
```

See this for more information on NOAA/NSIDC sea ice concentration data, version 5. In particular, the user manual is of significant aid.
Download data files from this link (note that this link can also be found from the NOAA/NSIDC landing page, above). A tool like wget can be of particular aid. From the project root, run something like the following commands:

```bash
mkdir -p data/V5/
cd data/V5/
wget --recursive --no-parent --no-host-directories --cut-dirs 4 --timestamping --execute robots=off --accept "*daily*.nc" https://noaadata.apps.nsidc.org/NOAA/G02202_V5/north/aggregate/
```
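Once downloaded, a quick sanity check of a file with the netcdf4 package might look like the following; the filename here is hypothetical, and the variable name `cdr_seaice_conc` follows the G02202 user documentation, so adjust both for the files you actually retrieved.

```python
import netCDF4

# Hypothetical aggregate filename; substitute one of your downloaded files.
path = "data/V4/seaice_conc_daily_nh_1979_v04r00.nc"

with netCDF4.Dataset(path) as ds:
    conc = ds.variables["cdr_seaice_conc"]  # dimensions like (time, y, x)
    print(conc.shape)
    print(ds.variables["time"].units)
```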