Add Gaussian Process Factor Analysis (GPFA) Kernel #1606
base: main
Conversation
@syncrostone - this is awesome, thanks for the PR! Two questions:
- Can you explain how this differs from the LCMKernel (already implemented)?
- Do you think the primary use case will be having the same kernel type for each of the latent factors (e.g. all factors use RBF kernels, but with different lengthscales)? If so, I think we can make a more efficient version using batched kernels (rough sketch below).
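A minimal, untested sketch of the batched-kernel idea, assuming all latent factors use RBF kernels; `M` and `t` are placeholder names, not anything from this PR:

```python
# Minimal sketch of the batched-kernel idea: one RBFKernel with a batch
# dimension of size M carries M independent lengthscales, so a single forward
# pass evaluates all M latent covariance matrices at once.
import torch
import gpytorch

M = 3                                         # number of latent factors (placeholder)
t = torch.linspace(0, 1, 50).unsqueeze(-1)    # shared time inputs, shape (50, 1)

latent_kernel = gpytorch.kernels.RBFKernel(batch_shape=torch.Size([M]))
K_latents = latent_kernel(t)                  # lazy kernel tensor of shape (M, 50, 50)
print(K_latents.shape)                        # torch.Size([3, 50, 50])
```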
Thanks for the quick reply @gpleiss!
Mathematically, the kernel is identical to LCM with rank=1 if v = 0 in the index kernel. I have not tried fixing v = 0, but I imagine this would be possible by setting var_constraint to an Interval(0, 0) constraint? Perhaps the more important contribution is actually the latent_posterior function that is part of the model class in the notebook; maybe this would be better made as a new model class with that function? Everything in that function should be possible with LCM, but it would be less efficient and would require more building out of GPFA parameters from the LCM, in terms of both the C matrix and the latent covariances.
I would love to hear how to make it more efficient using batched kernels.
Sorry for the slow reply @syncrostone! I got bogged down with NeurIPS - I'll take a look at this on Monday.
@syncrostone - sorry again for the delay! Looking at this now!
@@ -0,0 +1,75 @@
#!/usr/bin/env python3
To me, this kernel seems really similar to the LCM Kernel that we already have in GPyTorch. Is there any way we can integrate the two?
I know that you want to be able to do posterior inference on the latent function (currently not totally possible with GPyTorch), and I'm not sure if this is possible with the current LCMKernel implementation. However, I'm afraid that having two similar kernels in the library will be difficult and confusing to maintain.
@syncrostone - I agree. In general, I think we need to do some refactoring of our multitask code. It might require a bigger change to the library to get this sort of inference working.
This PR adds GPFA kernels (tested) and an associated example notebook. I made my best guesses to keep things stylistically in line with the rest of the codebase -- please let me know if there are things I missed / that need to be changed.
I realized only after doing most of the work on this that the already implemented LCM kernel is mathematically identical. Because they are used and parametrized differently, I think it could still make sense to add GPFA, but I understand if you disagree.
Explanation of GPFA (also in the example notebook)
Gaussian Process Factor Analysis (GPFA), introduced in this paper, learns underlying latent trajectories for the outputs while smoothing the data. A more recent paper suggests scalable implementations for GPFA (in the supplement) as well as extending GPFA with kernels for dynamical systems. The implementation in this notebook is developed with a future GPFA for Dynamical Systems (GPFADS) extension in mind.
GPFA (and GPFADS) are useful when you want to simultaneously smooth and reduce the dimensionality of neural data.
Given time points $t$, $M$ latent variables $x$ with underlying kernels $k_1, \ldots, k_M$, and observations (neural data) $y$ from $N$ neurons, the observations in GPFA are assumed to arise as follows:
The latent variables are independent of each other:
$$k(x_i(t), x_j(t')) = \delta_{ij} k_i(t, t')$$
We write the above for the vector of latents as:
$$k[\mathbf{x}(t), \mathbf{x}(t')] = \sum_{i=1}^{M} k_i(t, t') \otimes (\mathbf{e}_i \mathbf{e}_i^T)$$
We combine the latents into observations as follows:
$$k[\mathbf{y}(t), \mathbf{y}(t')] = C \, k(\mathbf{x}(t), \mathbf{x}(t')) \, C^T$$
where $k_i$ is a standard kernel (e.g. RBF) that operates on the inputs, $\delta_{ij}$ is the Kronecker delta, $\mathbf{e}_i$ is a unit vector of length $M$ with nonzero element at index $i$, $\otimes$ is the Kronecker product, and $C \in \mathbb{R}^{N \times M}$ is the mixing matrix, indicating how latents are combined into observations.
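For concreteness, here is a small illustrative sketch of assembling the full observation covariance from these formulas; it is not the PR's kernel code, and `M`, `N`, `T`, and `C` are placeholder names:

```python
# Illustrative assembly of k_y = C k_x C^T over all time points:
# K_y = sum_i K_i kron (c_i c_i^T), where c_i is the i-th column of C.
# This follows the formulas above; it is not the PR's kernel implementation.
import torch
import gpytorch

M, N, T = 2, 5, 30                            # latents, neurons, time points (placeholders)
t = torch.linspace(0, 1, T).unsqueeze(-1)

latent_kernels = gpytorch.kernels.RBFKernel(batch_shape=torch.Size([M]))
K_x = latent_kernels(t).evaluate()            # (M, T, T): one block per latent kernel k_i
C = torch.randn(N, M)                         # mixing matrix (learned in the real model)

# Each latent contributes k_i(t, t') kron (C e_i)(C e_i)^T to the observation covariance.
K_y = sum(torch.kron(K_x[i], torch.outer(C[:, i], C[:, i])) for i in range(M))
print(K_y.shape)                              # torch.Size([150, 150]) == (T * N, T * N)
```

With this ordering (time as the outer Kronecker factor, neurons as the inner one), the assembled matrix matches GPyTorch's interleaved multitask convention.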
GPFA is mathematically identical to LMC/LCM, which arose from the geostatistics literature. Both LCM/LMC and GPFA construct a multi-output kernel as the covariance of a linear combination of multiple latent Gaussian processes. However, in GPFA, we are interested in recovering the posterior over these latent processes (not possible with GPyTorch's LMC model), hence the extension of LMC proposed here. Due to the gridded structure of neural data, and the relatively sparse sampling of data per lengthscale, GPFA does not need to rely on inducing points and can instead be modeled using exact / iterative GPs.
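A rough, untested sketch of that rank-1 LCM correspondence; the attribute names `covar_module_list` and `task_covar_module` are my reading of the current LCMKernel internals, and pinning v exactly to zero via `Interval(0, 0)` may not be accepted, so a tiny interval is used instead:

```python
# Untested sketch: a rank-1 LCMKernel over M base kernels should match the GPFA
# covariance once the per-task diagonal term v of each IndexKernel is pinned to
# (near) zero. Interval(0, 0) may be rejected as a degenerate bound, so a tiny
# interval is used here; attribute names are a best guess from the LCMKernel
# source and should be double-checked.
import gpytorch
from gpytorch.constraints import Interval

M, N = 2, 5                                   # latents, outputs (placeholders)
base_kernels = [gpytorch.kernels.RBFKernel() for _ in range(M)]
lcm_kernel = gpytorch.kernels.LCMKernel(base_kernels, num_tasks=N, rank=1)

# Best-guess way to pin v ~ 0 on each factor's IndexKernel:
for multitask_kernel in lcm_kernel.covar_module_list:
    multitask_kernel.task_covar_module.register_constraint("raw_var", Interval(0.0, 1e-6))
```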
Things I would like to add for a future PR
Questions for GPyTorch Team about Design
Questions for GPyTorch Team that came up while trying to scale this using KISS-GP on gridded input