Commit 8c55874

Add underflow discussion to HSGP-Basic (pymc-devs#803)
* fix LinAlgError, drop find_constrained_prior + remove matplotlib commands which raise warnings
* fix: rename cov_mat_jitted to cov_mat_stabilize
* add underflow discussion
* fix: move underflow discussion to dropdown admonition block
1 parent 6788e3c commit 8c55874

File tree

3 files changed: +406 −116 lines changed

examples/gaussian_processes/ExpQuad_vs_Matern52_psd.png

78.3 KB

examples/gaussian_processes/HSGP-Basic.ipynb

Lines changed: 262 additions & 109 deletions
Large diffs are not rendered by default.

examples/gaussian_processes/HSGP-Basic.myst.md

Lines changed: 144 additions & 7 deletions
@@ -233,7 +233,7 @@ At the end of this section, we'll give the rules of thumb given in [Ruitort-Mayo
 
 +++
 
-Speaking non-technically, the HSGP approximates the GP prior as a linear combination of sinusoids. The coefficients of the linear combination are IID normal random variables whose standard deviation depends on GP hyperparameters (which are an amplitude and lengthscale for the Matern family).
+Speaking non-technically, the HSGP approximates the GP prior as a linear combination of sinusoids. The coefficients of the linear combination are IID normal random variables whose standard deviation depends on GP hyperparameters (which are an amplitude and lengthscale for the Matern family). Users who are interested in further introductory details should see [this](https://juanitorduz.github.io/hsgp_intro/) fantastic blog post by Juan Orduz.
 
 To see this, we'll make a few plots of the $m=3$ and $m=5$ basis vectors and pay careful attention to how they behave at the boundaries of the domain. Note that we have to center the `x` data first, and then choose `L` in relation to the centered data. It's worth mentioning here that the basis vectors we're plotting do not depend on either the choice of the covariance kernel or on any unknown parameters the covariance function has.
 
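In symbols, the structure described here is a weighted sum of fixed sinusoidal basis vectors (a sketch consistent with the linear parametrization used later in this notebook, where $\phi_j$ are the basis vectors, $\omega_j$ their frequencies, and $\mathcal{S}$ the kernel's power spectral density):

$$
f(x) \approx \sum_{j=1}^{m} \beta_j \sqrt{\mathcal{S}(\omega_j)}\,\phi_j(x), \qquad \beta_j \overset{\text{iid}}{\sim} \mathcal{N}(0, 1).
$$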
@@ -309,15 +309,15 @@ In practice, you'll need to infer the lengthscale from the data, so the HSGP nee
 
 [Ruitort-Mayol et al.](https://arxiv.org/abs/2004.11408) give some handy heuristics for the range of lengthscales that are accurately reproduced for given values of $m$ and $c$. Below, we provide a function that uses their heuristics to recommend the minimum $m$ and $c$ values. Note that **these recommendations are based on a one-dimensional GP**.
 
-For example, if you're using the `Matern52` covariance and your data ranges from $x=-5$ to $x=95$, and the bulk of your lengthscale prior is between $\ell=1$ and $\ell=50$, then the smallest recommended values are $m=543$ and $c=3.7$, as you can see below:
+For example, if you're using the `Matern52` covariance and your data ranges from $x=-5$ to $x=95$, and the bulk of your lengthscale prior is between $\ell=1$ and $\ell=50$, then the smallest recommended values are $m=543$ and $c=4.1$, as you can see below:
 
 ```{code-cell} ipython3
-m, c = pm.gp.hsgp_approx.approx_hsgp_hyperparams(
+m52_m, m52_c = pm.gp.hsgp_approx.approx_hsgp_hyperparams(
     x_range=[-5, 95], lengthscale_range=[1, 50], cov_func="matern52"
 )
 
-print("Recommended smallest number of basis vectors (m):", m)
-print("Recommended smallest scaling factor (c):", np.round(c, 1))
+print("Recommended smallest number of basis vectors for Matern 5/2 (m):", m52_m)
+print("Recommended smallest scaling factor for Matern 5/2 (c):", np.round(m52_c, 1))
 ```
 
 ### The HSGP approximate Gram matrix
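For reference, a minimal sketch (not part of this diff) of feeding these recommendations to the `pm.gp.HSGP` constructor, which accepts the number of basis vectors `m` together with either the boundary `L` or the scaling factor `c`:

```python
import pymc as pm

with pm.Model():
    # illustrative lengthscale prior only; in practice choose one informed by your data
    ell = pm.HalfNormal("ell", sigma=10)
    cov_func = pm.gp.cov.Matern52(input_dim=1, ls=ell)
    # m and c as recommended above for lengthscales between 1 and 50
    gp = pm.gp.HSGP(m=[543], c=4.1, cov_func=cov_func)
```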
@@ -429,7 +429,144 @@ For your particular situation, **you will need to experiment across your range o
 
 Be aware that it's also possible to encounter scenarios where a low fidelity HSGP approximation gives a more parsimonious fit than a high fidelity HSGP approximation. A low fidelity HSGP approximation is still a valid prior for some unknown function, if somewhat contrived. Whether that matters will depend on your context.
 
-+++
++++ {"editable": true, "slideshow": {"slide_type": ""}}
+
+:::{dropdown} Avoiding underflow issues
+:icon: alert-fill
+:color: info
+As noted above, the diagonal matrix $\Delta$ used in the calculation of the approximate Gram matrix contains information on the power spectral density, $\mathcal{S}$, of a given kernel. Thus, for the Gram matrix to be defined, we need $\mathcal{S} > 0$. Consequently, when picking the HSGP hyperparameters $m$ and $L$, it is important to check that $\mathcal{S} > 0$ for the suggested $m$ and $L$ values. The code in the next few cells compares the suitability of the suggested hyperparameters $m$ and $L$ for `matern52` with those for `ExpQuad`, for data spanning $x=-5$ to $x=95$ and a lengthscale prior between $\ell=1$ and $\ell=50$. As we shall see, the suggested hyperparameters for `ExpQuad` are not suitable for $\ell=50$. <br> <br>
+**Matern $\mathbf{\nu = 5/2}$**
+```pycon
+>>> m52_L = m52_c * 50  # c * s, where s is the half-range of the data, i.e. 0.5 * (95 - (-5))
+>>> print(
+...     f"""m52_m = {m52_m:.1f},
+... m52_c = {m52_c:.1f},
+... m52_s = {50:.1f}
+... and m52_L = {m52_L:.1f}"""
+... )
+m52_m = 543.0,
+m52_c = 4.1,
+m52_s = 50.0
+and m52_L = 205.0
+
+>>> m52_eigvals = pm.gp.hsgp_approx.calc_eigenvalues(m52_L, [m52_m])
+>>> m52_omega = pt.sqrt(m52_eigvals)
+>>> matern52_cov_ell_1 = pm.gp.cov.Matern52(1, ls=1)
+>>> matern52_cov_ell_50 = pm.gp.cov.Matern52(1, ls=50)
+
+>>> # check that none have underflowed to zero.
+>>> assert np.all(matern52_cov_ell_1.power_spectral_density(m52_omega).eval() > 0)
+>>> assert np.all(matern52_cov_ell_50.power_spectral_density(m52_omega).eval() > 0)
+```
+
+**Squared exponential**
+```pycon
+>>> # get the ExpQuad suggested hyperparams.
+>>> epq_m, epq_c = pm.gp.hsgp_approx.approx_hsgp_hyperparams(
+...     x_range=[-5, 95], lengthscale_range=[1, 50], cov_func="ExpQuad"
+... )
+
+>>> print("Recommended smallest number of basis vectors for ExpQuad (m):", epq_m)
+Recommended smallest number of basis vectors for ExpQuad (m): 280
+>>> print("Recommended smallest scaling factor for ExpQuad (c):", np.round(epq_c, 1))
+Recommended smallest scaling factor for ExpQuad (c): 3.2
+
+>>> epq_L = epq_c * 50  # c * s
+>>> print(
+...     f"""epq_m = {epq_m:.1f},
+... epq_c = {epq_c:.1f},
+... epq_s = {50:.1f},
+... and epq_L = {epq_L:.1f}"""
+... )
+epq_m = 280.0,
+epq_c = 3.2,
+epq_s = 50.0,
+and epq_L = 160.0
+
+>>> epq_eigvals = pm.gp.hsgp_approx.calc_eigenvalues(epq_L, [epq_m])
+>>> epq_omega = pt.sqrt(epq_eigvals)
+>>> epq_cov_ell_1 = pm.gp.cov.ExpQuad(1, ls=1)
+>>> epq_cov_ell_50 = pm.gp.cov.ExpQuad(1, ls=50)
+
+>>> # repeat the check as for Matern52.
+>>> assert np.all(epq_cov_ell_1.power_spectral_density(epq_omega).eval() > 0)
+>>> assert np.all(
+...     epq_cov_ell_50.power_spectral_density(epq_omega).eval() > 0
+... ), "Power spectral density underflows when ls = 50."  # this will not pass the assertion.
+```
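As a quick supplementary check, one can count how many of the `ExpQuad` spectral-density values have underflowed to zero (a minimal sketch reusing `epq_cov_ell_50` and `epq_omega` from above; the exact count depends on floating-point details):

```pycon
>>> psd_ell_50 = epq_cov_ell_50.power_spectral_density(epq_omega).eval()
>>> # count the basis frequencies whose power spectral density is exactly zero
>>> print(int((psd_ell_50 == 0.0).sum()), "of", psd_ell_50.size, "values underflowed")
```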
+
+We see that for the squared exponential kernel with $\ell=50$, some values of $\mathcal{S}$ underflow to zero, so the requirement $\mathcal{S} > 0$ is violated.
+
+To see why, the covariances of the kernels considered are plotted below, along with their power spectral densities, in log space. The covariance plot shows that for a given $\ell$, the tails of `matern52` are heavier than those of `ExpQuad`, while a higher $\ell$ for a given kernel type gives rise to a broader covariance. The power spectral density behaves inversely to the covariance: essentially, the flatter the shape of the covariance function, the narrower the bandwidth and the lower the power spectral density at higher values of $\omega$. As a result, for `ExpQuad` with $\ell = 50$, $\mathcal{S}\left(\omega\right)$ rapidly decreases towards $0$ before the domain of $\omega$ is exhausted, and we hence reach values at which it underflows to $0$.
+
+```pycon
+>>> x = np.linspace(0, 10, 101)[:, None]
+>>> fig, ax = plt.subplots(2, layout="tight", figsize=(10, 6))
+
+>>> ax[0].set_title("Covariance")
+>>> ax[0].plot(x, epq_cov_ell_1(x).eval()[0], label=r"ExpQuad, $\ell = 1$")
+>>> ax[0].plot(x, epq_cov_ell_50(x).eval()[0], label=r"ExpQuad, $\ell = 50$")
+>>> ax[0].plot(x, matern52_cov_ell_1(x).eval()[0], label=r"Matern 5/2, $\ell = 1$", linestyle="--")
+>>> ax[0].plot(x, matern52_cov_ell_50(x).eval()[0], label=r"Matern 5/2, $\ell = 50$", linestyle="--")
+>>> ax[0].set_xlabel(r"$x_\mathrm{p}-x_\mathrm{q}$")
+>>> ax[0].set_ylabel(r"$k\left(x_\mathrm{p}-x_\mathrm{q}\right)$")
+>>> ax[0].set_yscale("log")
+>>> ax[0].set_ylim(1e-10, 1e1)
+>>> ax[0].legend(frameon=False, loc="lower left")
+
+>>> ax[1].plot(epq_omega.eval(), epq_cov_ell_1.power_spectral_density(epq_omega).eval())
+>>> ax[1].plot(epq_omega.eval(), epq_cov_ell_50.power_spectral_density(epq_omega).eval())
+>>> ax[1].plot(
+...     m52_omega.eval(), matern52_cov_ell_1.power_spectral_density(m52_omega).eval(), linestyle="--"
+... )
+>>> ax[1].plot(
+...     m52_omega.eval(), matern52_cov_ell_50.power_spectral_density(m52_omega).eval(), linestyle="--"
+... )
+>>> ax[1].set_title("Power Spectral Density")
+>>> ax[1].set_xlabel(r"$\omega$")
+>>> ax[1].set_ylabel(r"$\mathcal{S}\left(\omega\right)$")
+>>> ax[1].set_yscale("log")
+>>> ax[1].set_ylim(1e-10, 3e2)
+>>> plt.show()
+```
+![Covariance functions and power spectral densities of ExpQuad and Matern 5/2 kernels](ExpQuad_vs_Matern52_psd.png)
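The plot can be backed up with a rough calculation. Assuming the standard unit-variance spectral density of the one-dimensional squared exponential kernel,

$$
\mathcal{S}(\omega) = \sqrt{2\pi}\,\ell\,e^{-\ell^2\omega^2/2},
$$

double-precision `exp` underflows to zero once its argument falls below roughly $-745$, i.e. once $\ell\omega \gtrsim \sqrt{2 \times 745} \approx 38.6$. With $\ell = 50$ that threshold is reached already at $\omega \approx 0.77$, well inside the range of frequencies used above, so a large share of the spectrum underflows.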
+These underflow issues can arise when using a broad prior on $\ell$: you need an $m$ large enough to cover the small lengthscales, but the resulting high-frequency terms may underflow $\mathcal{S}$ when $\ell$ is large. As the graphs above suggest, one option is to **consider a different kernel with heavier tails, such as `matern52` or `matern32`**.
+
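To illustrate the heavier-tail suggestion concretely, a minimal sketch reusing `epq_omega` from above (the Matern 3/2 spectrum decays polynomially rather than exponentially, so it resists this underflow):

```pycon
>>> m32_cov_ell_50 = pm.gp.cov.Matern32(1, ls=50)
>>> # the polynomially decaying spectrum stays positive across these frequencies
>>> assert np.all(m32_cov_ell_50.power_spectral_density(epq_omega).eval() > 0)
```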
+Alternatively, if you are certain you need a specific kernel, **you can use the linear form of HSGPs (see below) with a boolean mask**. In doing so, the sinusoids with vanishingly small coefficients in the linear combination are effectively screened out. For example:
+
+```pycon
+>>> import pymc as pm
+>>> import numpy as np
+
+>>> x = np.sort(np.random.uniform(-1, 1, 10))
+
+>>> large_m, large_l = pm.gp.hsgp_approx.approx_hsgp_hyperparams(
+...     x_range=[-1, 1], lengthscale_range=[1e-2, 4], cov_func="ExpQuad"
+... )
+
+>>> # the half-range of the data here is s = 1, so L = c * s coincides with the returned c.
+>>> print(large_m, large_l)
+2240 12.8
+
+>>> with pm.Model() as model:
+...     # some broad prior on the lengthscale.
+...     ell = pm.HalfNormal("ell", sigma=1)
+...     cov_func = pm.gp.cov.ExpQuad(input_dim=1, ls=ell)
+...     # set up the HSGP.
+...     gp = pm.gp.HSGP(m=[large_m], L=[large_l], parametrization="noncentered", cov_func=cov_func)
+...     phi, sqrt_psd = gp.prior_linearized(x[:, None])
+...     basis_coeffs = pm.Normal("basis_coeffs", size=gp.n_basis_vectors)
+...     # create a mask that screens out frequencies with underflowing power spectral densities.
+...     mask = sqrt_psd > 0
+...     # now apply the mask over the m dimension & calculate the HSGP function.
+...     f = pm.Deterministic("f", phi[:, mask] @ (basis_coeffs[mask] * sqrt_psd[mask]))
+...     # set up your observation model
+...     ...
+```
+:::
+
++++ {"editable": true, "slideshow": {"slide_type": ""}}
 
 ## Example 2: Working with HSGPs as a parametric, linear model
 
@@ -684,7 +821,7 @@ Sampling diagnostics all look good, and we can see that the underlying GP was in
 
 * Created by [Bill Engels](https://github.com/bwengals) and [Alexandre Andorra](https://github.com/AlexAndorra) in 2024 ([pymc-examples#647](https://github.com/pymc-devs/pymc-examples/pull/647))
 * Use `pz.maxent` instead of `pm.find_constrained_prior`, and add random seed. [Osvaldo Martin](https://aloctavodia.github.io/). August 2024
-* Use `pm.gp.util.stabilize` in `simulate_1d`. Use `pz.maxent` rather than `pm.find_constrained_prior` in linearized HSGP model. [Alexander Armstrong](https://github.com/Armatron44), July 2025.
+* Use `pm.gp.util.stabilize` in `simulate_1d`. Use `pz.maxent` rather than `pm.find_constrained_prior` in linearized HSGP model. Added comparison between `matern52` and `ExpQuad` power spectral densities. [Alexander Armstrong](https://github.com/Armatron44), July-August 2025.
 
 +++
 
