myst_nbs/gaussian_processes/GP-Kron.myst.md
76 additions & 21 deletions
@@ -6,14 +6,23 @@ jupytext:
    format_version: 0.13
    jupytext_version: 1.13.7
kernelspec:
-  display_name: gbi_env_py38
+  display_name: Python 3 (ipykernel)
  language: python
-  name: gbi_env_py38
+  name: python3
---

+(GP-Kron)=
# Kronecker Structured Covariances

-PyMC3 contains implementations for models that have Kronecker structured covariances. This patterned structure enables Gaussian process models to work on much larger datasets. Kronecker structure can be exploited when
+:::{post} October, 2022
+:tags: gaussian process
+:category: intermediate
+:author: Bill Engels, Raul-ing Average, Christopher Krapu, Danh Phan
+:::
+
++++
+
+PyMC contains implementations for models that have Kronecker structured covariances. This patterned structure enables Gaussian process models to work on much larger datasets. Kronecker structure can be exploited when

- The dimension of the input data is two or greater ($\mathbf{x} \in \mathbb{R}^{d}\,, d \ge 2$)
- The influence of the process across each dimension or set of dimensions is *separable*
- The kernel can be written as a product over dimension, without cross terms:
@@ -28,17 +37,17 @@

These implementations support the following property of Kronecker products to speed up calculations, $(\mathbf{K}_1 \otimes \mathbf{K}_2)^{-1} = \mathbf{K}_{1}^{-1} \otimes \mathbf{K}_{2}^{-1}$: the inverse of the Kronecker product is the Kronecker product of the inverses. If $\mathbf{K}_1$ is $n \times n$ and $\mathbf{K}_2$ is $m \times m$, then $\mathbf{K}_1 \otimes \mathbf{K}_2$ is $mn \times mn$. For $m$ and $n$ of even modest size, inverting this matrix directly becomes prohibitively expensive, while inverting two matrices, one $n \times n$ and another $m \times m$, is much easier.
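To make the computational gain concrete, here is a minimal NumPy sketch (not taken from the notebook; the matrix sizes and seed are arbitrary) verifying that the inverse of a Kronecker product equals the Kronecker product of the two much smaller inverses.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3

# Two small symmetric positive definite matrices, n x n and m x m.
A = rng.normal(size=(n, n))
B = rng.normal(size=(m, m))
K1 = A @ A.T + n * np.eye(n)
K2 = B @ B.T + m * np.eye(m)

# Inverting the full (n*m) x (n*m) matrix costs O((nm)^3) work ...
inv_direct = np.linalg.inv(np.kron(K1, K2))

# ... while inverting the two factors separately costs only O(n^3 + m^3).
inv_kron = np.kron(np.linalg.inv(K1), np.linalg.inv(K2))

print(np.allclose(inv_direct, inv_kron))  # True
```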

-This structure is common in spatiotemporal data. Given that there is Kronecker structure in the covariance matrix, this implementation is exact -- not an approximation to the full Gaussian process. PyMC3 contains two implementations that follow the same pattern as `gp.Marginal` and `gp.Latent`. For Kronecker structured covariances where the data likelihood is Gaussian, use `gp.MarginalKron`. For Kronecker structured covariances where the data likelihood is non-Gaussian, use `gp.LatentKron`.
+This structure is common in spatiotemporal data. Given that there is Kronecker structure in the covariance matrix, this implementation is exact -- not an approximation to the full Gaussian process. PyMC contains two implementations that follow the same pattern as {class}`gp.Marginal <pymc.gp.Marginal>` and {class}`gp.Latent <pymc.gp.Latent>`. For Kronecker structured covariances where the data likelihood is Gaussian, use {class}`gp.MarginalKron <pymc.gp.MarginalKron>`. For Kronecker structured covariances where the data likelihood is non-Gaussian, use {class}`gp.LatentKron <pymc.gp.LatentKron>`.

-Our implementations follow [Saatchi's Thesis](http://mlg.eng.cam.ac.uk/pub/authors/#Saatci). `MarginalKron` follows "Algorithm 16" using the Eigendecomposition, and `LatentKron` follows "Algorithm 14", and uses the Cholesky decomposition.
+Our implementations follow [Saatchi's Thesis](http://mlg.eng.cam.ac.uk/pub/authors/#Saatci). `gp.MarginalKron` follows "Algorithm 16", which uses the eigendecomposition, and `gp.LatentKron` follows "Algorithm 14", which uses the Cholesky decomposition.

+++

## Using `MarginalKron` for a 2D spatial problem

-The following is a canonical example of the usage of `MarginalKron`. Like `Marginal`, this model assumes that the underlying GP is unobserved, but the sum of the GP and normally distributed noise are observed.
+The following is a canonical example of the usage of `gp.MarginalKron`. Like `gp.Marginal`, this model assumes that the underlying GP is unobserved, but the sum of the GP and normally distributed noise is observed.

-For the simulated data set, we draw one sample from a Gaussian process with inputs in two dimensions whose covariance is Kronecker structured. Then we use `MarginalKron` to recover the unknown Gaussian process hyperparameters $\theta$ that were used to simulate the data.
+For the simulated data set, we draw one sample from a Gaussian process with inputs in two dimensions whose covariance is Kronecker structured. Then we use `gp.MarginalKron` to recover the unknown Gaussian process hyperparameters $\theta$ that were used to simulate the data.

+++

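The notebook's simulation and model cells are collapsed in this diff. The following is only a rough sketch of the `gp.MarginalKron` pattern described above, with made-up grid sizes, priors, kernel choices, and placeholder data rather than the notebook's own simulation.

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(1)

# Inputs along each dimension; the model sees their full cartesian-product grid,
# so its covariance is the Kronecker product of two small per-dimension matrices.
x1 = np.linspace(0, 1, 20)[:, None]
x2 = np.linspace(0, 2, 30)[:, None]

# Placeholder gridded observations, flattened with x2 varying fastest
# (the ordering a Kronecker product implies).
y = rng.normal(size=20 * 30)

with pm.Model() as model:
    # Separable kernel: one covariance function (and lengthscale) per dimension.
    ls1 = pm.Gamma("ls1", alpha=2, beta=2)
    ls2 = pm.Gamma("ls2", alpha=2, beta=2)
    eta = pm.HalfNormal("eta", sigma=2)
    cov1 = eta**2 * pm.gp.cov.ExpQuad(1, ls=ls1)
    cov2 = pm.gp.cov.ExpQuad(1, ls=ls2)

    gp = pm.gp.MarginalKron(cov_funcs=[cov1, cov2])

    # MarginalKron requires Gaussian noise; the GP itself is marginalized out.
    sigma = pm.HalfNormal("sigma", sigma=2)
    gp.marginal_likelihood("y_obs", Xs=[x1, x2], y=y, sigma=sigma)

    idata = pm.sample(1000, tune=1000)
```

The key point of the design is that the per-dimension inputs are passed as the list `Xs`, so only the small per-dimension covariance matrices are ever built and factored.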
@@ -50,16 +59,16 @@ We'll simulate a two dimensional data set and display it as a scatter plot whose
@@ -163,11 +173,11 @@ plt.title("observed data 'y' (circles) with predicted mean (squares)");

## `LatentKron`

-Like the `gp.Latent` implementation, the `LatentKron` implementation specifies a Kronecker structured GP regardless of context. **It can be used with any likelihood function, or can be used to model a variance or some other unobserved processes**. The syntax follows that of `gp.Latent` exactly.
+Like the `gp.Latent` implementation, the `gp.LatentKron` implementation specifies a Kronecker structured GP regardless of context. **It can be used with any likelihood function, or can be used to model a variance or some other unobserved process**. The syntax follows that of `gp.Latent` exactly.

### Example 1

-To compare with `MarginalLikelihood`, we use same example as before where the noise is normal, but the GP itself is not marginalized out. Instead, it is sampled directly using NUTS. It is very important to note that `LatentKron` does not require a Gaussian likelihood like `MarginalKron`; rather, any likelihood is admissible.
+To compare with `MarginalLikelihood`, we use the same example as before, where the noise is normal but the GP itself is not marginalized out. Instead, it is sampled directly using NUTS. It is very important to note that `gp.LatentKron` does not require a Gaussian likelihood like `gp.MarginalKron`; rather, any likelihood is admissible.
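The model cell itself is also collapsed in this diff. A minimal sketch of the `gp.LatentKron` pattern, reusing the assumed grid and priors from the earlier sketch, might look like the following; the Normal likelihood on the observed line is exactly the part that could be swapped for any other likelihood.

```python
import numpy as np
import pymc as pm

x1 = np.linspace(0, 1, 20)[:, None]
x2 = np.linspace(0, 2, 30)[:, None]
y = np.random.default_rng(2).normal(size=20 * 30)  # placeholder gridded data

with pm.Model() as model:
    ls1 = pm.Gamma("ls1", alpha=2, beta=2)
    ls2 = pm.Gamma("ls2", alpha=2, beta=2)
    eta = pm.HalfNormal("eta", sigma=2)
    cov1 = eta**2 * pm.gp.cov.ExpQuad(1, ls=ls1)
    cov2 = pm.gp.cov.ExpQuad(1, ls=ls2)

    gp = pm.gp.LatentKron(cov_funcs=[cov1, cov2])
    f = gp.prior("f", Xs=[x1, x2])  # latent GP values on the grid, sampled by NUTS

    sigma = pm.HalfNormal("sigma", sigma=2)
    pm.Normal("y_obs", mu=f, sigma=sigma, observed=y)  # swap for any likelihood

    idata = pm.sample(1000, tune=1000, target_accept=0.9)
```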

-Below we show the original data set as colored circles, and the mean of the conditional samples as colored squares. The results closely follow those given by the `MarginalKron` implementation.
+Below we show the original data set as colored circles, and the mean of the conditional samples as colored squares. The results closely follow those given by the `gp.MarginalKron` implementation.
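The notebook's prediction code is likewise not shown in this diff. A hedged sketch of how those conditional means could be obtained from the latent model above (continuing the variable names of the previous sketch) is to draw conditional samples at the grid points and average them over the posterior.

```python
import numpy as np
import pymc as pm

# Full set of prediction points: the cartesian product of x1 and x2,
# with x2 varying fastest to match the Kronecker ordering of `y`.
Xnew = np.array([[a, b] for a in x1.ravel() for b in x2.ravel()])

with model:  # the model, gp, and idata objects from the LatentKron sketch
    fnew = gp.conditional("fnew", Xnew)
    ppc = pm.sample_posterior_predictive(idata, var_names=["fnew"])

# Posterior-mean prediction at each grid point -- the "colored squares".
f_mean = ppc.posterior_predictive["fnew"].mean(dim=("chain", "draw")).values
```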