Conversation

@syncrostone
Contributor

This PR adds GPFA kernels (tested) and an associated example notebook. I made my best guesses to keep things stylistically in line with the rest of the codebase -- please let me know if there are things I missed / that need to be changed.

After doing most of the work on this, I realized that the already-implemented LCM kernel is mathematically identical. Because the two are used and parametrized differently, I think it could still make sense to add GPFA, but I understand if you disagree.

Explanation of GPFA (also in the example notebook)

Gaussian Process Factor Analysis (GPFA), introduced in this paper, learns underlying latent trajectories for the outputs while smoothing the data. A more recent paper suggests scalable implementations for GPFA (in the supplement) as well as an extension of GPFA with kernels for dynamical systems. The implementation in this notebook is developed with a future GPFA for Dynamical Systems (GPFADS) extension in mind.

GPFA(DS) is useful when you want to simultaneously smooth and reduce the dimensionality of neural data.

Given input points $t$, $M$ latent variables $x$ with underlying kernels $k_1, \ldots, k_M$, and observations (neural data) $y$ from $N$ neurons, the observations in GPFA are assumed to arise as follows:

  • The latent variables are independent of each other:
    $$ k(x_i(t), x_j(t')) = \delta_{ij} k_i(t,t')
    $$

  • We write the above for the full vector of latents as:
    $$k[\mathbf{x}(t), \mathbf{x}(t')] = \sum_{i=1}^{M}k_i(t,t')\otimes (\mathbf{e}_i \mathbf{e}_i^T)
    $$

  • We combine the latents into observations as follows:
    $$k[\mathbf{y}(t), \mathbf{y}(t')] = C \, k(\mathbf{x}(t), \mathbf{x}(t')) \, C^T
    $$

where $k_i$ is a standard kernel (e.g. RBF) that operates on the inputs.
$\delta_{ij}$ is the Kronecker delta, $\mathbf{e}_i$ is the unit vector of length $M$ with its nonzero element at index $i$, $\otimes$ is the Kronecker product, and $C \in \mathbb{R}^{N \times M}$ is the mixing matrix, indicating how latents are combined into observations.
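For concreteness, here is a minimal plain-PyTorch sketch of how this covariance could be assembled. The sizes, lengthscales, and the small `rbf` helper below are illustrative placeholders, not the kernel code from this PR:

```python
import torch

def rbf(t1, t2, lengthscale):
    # Illustrative RBF kernel k_i(t, t'), one lengthscale per latent.
    d = (t1.unsqueeze(-1) - t2.unsqueeze(-2)) / lengthscale
    return torch.exp(-0.5 * d ** 2)

T, M, N = 50, 3, 10                         # time points, latents, neurons
t = torch.linspace(0, 1, T)
lengthscales = [0.1, 0.2, 0.4]              # one per latent kernel k_1..k_M
C = torch.randn(N, M)                       # mixing matrix, N x M

# k[x(t), x(t')] = sum_i k_i(t, t') kron (e_i e_i^T)
E = torch.eye(M)
K_x = sum(
    torch.kron(rbf(t, t, ls), torch.outer(E[i], E[i]))
    for i, ls in enumerate(lengthscales)
)                                           # shape (T*M, T*M), time-major blocks

# k[y(t), y(t')] = C k[x(t), x(t')] C^T, applied per time point -> I_T kron C
I_C = torch.kron(torch.eye(T), C)           # shape (T*N, T*M)
K_y = I_C @ K_x @ I_C.T                     # shape (T*N, T*N)
```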

GPFA is mathematically identical to LMC/LCM, which arose from the geostatistics literature. Both LCM/LMC and GPFA construct a multi-output kernel as the covariance of a linear combination of multiple latent Gaussian processes. However, in GPFA we are interested in recovering the posterior over these latent processes (not possible with GPyTorch's LMC model), hence the extension of LMC proposed here. Due to the gridded structure of neural data, and the relatively sparse sampling of data per lengthscale, GPFA does not need to rely on inducing points and can instead be modeled using exact / iterative GPs.
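For comparison, a rough sketch of how this covariance structure is typically set up with GPyTorch's existing multitask machinery. The class name, sizes, and hyperparameters are placeholders, and this uses the existing LCMKernel rather than the GPFA kernel from this PR:

```python
import torch
import gpytorch

class MultitaskGPSketch(gpytorch.models.ExactGP):
    """Exact multitask GP whose covariance is an LCM kernel with rank-1 task covariances."""

    def __init__(self, train_t, train_y, likelihood, num_latents=3, num_tasks=10):
        super().__init__(train_t, train_y, likelihood)
        self.mean_module = gpytorch.means.MultitaskMean(
            gpytorch.means.ConstantMean(), num_tasks=num_tasks
        )
        self.covar_module = gpytorch.kernels.LCMKernel(
            base_kernels=[gpytorch.kernels.RBFKernel() for _ in range(num_latents)],
            num_tasks=num_tasks,
            rank=1,
        )

    def forward(self, t):
        mean = self.mean_module(t)
        covar = self.covar_module(t)
        return gpytorch.distributions.MultitaskMultivariateNormal(mean, covar)

# usage: train_t has shape (T,), train_y has shape (T, num_tasks)
likelihood = gpytorch.likelihoods.MultitaskGaussianLikelihood(num_tasks=10)
# model = MultitaskGPSketch(train_t, train_y, likelihood)
```

As discussed above, the difference is less the covariance itself than being able to recover the posterior over the latent processes.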

Things I would like to add for a future PR

  • Make it work on batches
  • Add a KISS-GP GPFA demo demonstrating scaling on GPU / on gridded input, since this was the motivation for using GPyTorch for this. This would also use fast solves, etc. (A rough sketch of the kernel wrapping is included after this list.)
  • Add a better latent posterior test, likely using the leave-neuron-out prediction error method for GPFA described in this paper
  • Add a prior / constraints on C if this will make the regression Bayesian over the C parameters as well.
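As a rough illustration of the KISS-GP point above, each latent kernel could be wrapped in a grid-interpolation kernel along these lines (the grid_size value is an arbitrary placeholder, assuming regularly sampled 1D time input):

```python
import gpytorch

# Wrap a latent RBF kernel in a KISS-GP (grid interpolation) kernel for
# regularly sampled 1D time input; grid_size is an arbitrary placeholder.
base_kernel = gpytorch.kernels.RBFKernel()
kiss_kernel = gpytorch.kernels.GridInterpolationKernel(
    base_kernel, grid_size=400, num_dims=1
)
```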

Questions for the GPyTorch Team about Design

  • Estimating the latent posterior is important for the community that uses GPFA. Right now this function is defined as part of the model, but models are typically built by individual users. I would like this latent_posterior function to be available without users having to copy-paste it from the example. Is there a good place it could live in the codebase (e.g. a GPFAModel class or utils)? Would it make sense to add a GPFAModel to the codebase that includes it? Ideally this function should be usable across both exact and approximate GPFA, so I am not sure what the best solution is. (A sketch of the computation this function performs is included below.)
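To make this concrete, latent_posterior essentially performs standard Gaussian conditioning of the latents on the observations. A plain-PyTorch sketch, reusing the K_x, I_C, K_y names from the earlier sketch; this is not the exact code from the notebook:

```python
import torch

def latent_posterior_sketch(K_x, I_C, K_y, y, noise_var):
    """Posterior over stacked latents x given observations y = (I_T kron C) x + noise.

    K_x:  (T*M, T*M) prior covariance of the latents
    I_C:  (T*N, T*M) mixing operator I_T kron C
    K_y:  (T*N, T*N) prior covariance of the observations, I_C @ K_x @ I_C.T
    y:    (T*N,)     vectorized observations
    """
    K_yy = K_y + noise_var * torch.eye(K_y.shape[0])
    cross = K_x @ I_C.T                                     # Cov(x, y), shape (T*M, T*N)
    mean = cross @ torch.linalg.solve(K_yy, y)              # E[x | y]
    cov = K_x - cross @ torch.linalg.solve(K_yy, cross.T)   # Cov[x | y]
    return mean, cov
```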

Questions for the GPyTorch Team that came up while trying to scale this using KISS-GP on gridded input

  • I don't quite understand how batches work in GPyTorch. So far I am following the batch multitask test as an example. I haven't quite gotten it working, but I also haven't thoroughly debugged it yet. Do you have any pointers for getting this working on batches?
  • I started this before some of the new Kronecker updates were added. I've tried to incorporate the new Kronecker machinery, but is there anything I can do to make things more efficient? Is there anywhere I should be using BlockDiagonal instead of $I \otimes C$?
  • When scaling up to large input lengths, the preconditioner takes too much memory, because while the kernel / KISS-GP are linear in memory, the preconditioner is still cubic. Right now I use preconditioner(0) to get around this. Are there other lower memory preconditioners available? I have some ideas of potential low-memory preconditioners I could use -- is there a good way to add a custom preconditioner?
  • Right now, when I try to calculate the confidence_interval at scale, it overflows memory. Does GPyTorch build in any way of estimating the diagonal of a matrix stochastically? If there isn't a way to do this in settings, is this something that would be useful to add beyond the scalable GPFA case? (A sketch of the kind of estimator I have in mind is after this list.)
  • Is the evaluate call at the end of my latent_posterior function in the GPFAModel (in the notebook / test) going to slow things down at scale? If so, is there a way to avoid evaluating there?
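Regarding the stochastic-diagonal question above, this is the kind of Hutchinson-style estimator I have in mind, written against a generic matrix-vector product rather than any existing GPyTorch setting (the function name and interface are hypothetical):

```python
import torch

def hutchinson_diag(matvec, dim, num_probes=64):
    """Estimate diag(A) as the average of z * (A z) over Rademacher probe vectors z.

    matvec computes A @ z without ever forming A densely; num_probes trades
    accuracy for compute.
    """
    diag_est = torch.zeros(dim)
    for _ in range(num_probes):
        z = torch.randint(0, 2, (dim,)).float() * 2 - 1   # entries are +/- 1
        diag_est += z * matvec(z)
    return diag_est / num_probes

# usage sketch: estimate the diagonal of K_y from the earlier sketch
# diag_K_y = hutchinson_diag(lambda z: I_C @ (K_x @ (I_C.T @ z)), I_C.shape[0])
```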

Member

@gpleiss gpleiss left a comment


@syncrostone - this is awesome, thanks for the PR! Two questions:

  1. Can you explain how this differs from the LCMKernel (already implemented)?

  2. Do you think the primary use case will be having the same kernel type for each of the latent factors (e.g. all factors use RBF kernels, but with different lengthscales)? If so, I think we can make a more efficient version using batched kernels.

@syncrostone
Contributor Author

syncrostone commented May 5, 2021

Thanks for the quick reply @gpleiss!

@syncrostone - this is awesome, thanks for the PR! Two questions:

1. Can you explain how this differs from the LCMKernel (already implemented)?

Mathematically, the kernel is identical to LCM with rank=1 if v = 0 in the index kernel. I have not tried fixing v = 0, but I imagine this would be possible by setting var_constraint equal to an IntervalConstraint(0,0)?

Perhaps the more important contribution is actually the latent_posterior function that is part of the model class in the notebook. Maybe this would be better contributed as a new model class that includes this function?

Everything in that function should be possible to do with LCM, but it would be less efficient and would require building the GPFA parameters (both the C matrix and the latent covariances) back out of the LCM.

2. Do you think the primary use case will be having the same kernel type for each of the latent factors (e.g. all factors use RBF kernels, but with different lengthscales)? If so, I think we can make a more efficient version using batched kernels.

I would love to hear how to make it more efficient using batched kernels -- how much of a speedup would this give? The main use case is having the same kernel type for each of the latent factors, but I could see potential future use cases for mixing kernel types.
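For reference, my guess at what the batched-kernel version would look like: a single RBFKernel with a batch dimension over the latent factors, so each batch element gets its own lengthscale (this is an assumption about the suggestion, not a confirmed design):

```python
import torch
import gpytorch

M = 3  # number of latent factors

# One RBF kernel evaluated as a batch of M kernels, each with its own lengthscale,
# instead of a Python list of M separate RBFKernel modules.
latent_kernels = gpytorch.kernels.RBFKernel(batch_shape=torch.Size([M]))

t = torch.linspace(0, 1, 50)
K = latent_kernels(t)      # lazily evaluated, batch shape (M, 50, 50)
print(K.shape)             # torch.Size([3, 50, 50])
```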

@gpleiss
Member

gpleiss commented Jun 10, 2021

Sorry for the slow reply @syncrostone! I got bogged down with NeurIPS - I'll take a look at this on Monday.

@gpleiss
Member

gpleiss commented Jun 23, 2021

@syncrostone - sorry again for the delay! Looking at this now!

@@ -0,0 +1,75 @@
#!/usr/bin/env python3
Member


To me, this kernel seems really similar to the LCM Kernel that we already have in GPyTorch. Is there any way we can integrate the two?

I know that you want to be able to do posterior inference on the latent function (currently not totally possible with GPyTorch), and I'm not sure if this is possible with the current LCMKernel implementation. However, I'm afraid that having two similar kernels in the library will be difficult and confusing to maintain.

@gpleiss
Member

gpleiss commented Jun 23, 2021

@syncrostone - I agree that the latent_posterior function in the example would be nice to integrate somewhere deeper in the library. At the moment, I'm not sure exactly where that would be.

In general, I think we need to do some refactoring of our multitask code. It might require a bigger change to the library to get this sort of inference working.
