GPyTorch Marginal Log Likelihood calculation is very different from scikit-learn's #2600
Closed
Muhammetdurm started this conversation in General
Replies: 1 comment · 2 replies
-
I believe this is because gpytorch scales the MLL by the number of data points: https://github.com/cornellius-gp/gpytorch/blob/main/gpytorch/mlls/exact_marginal_log_likelihood.py#L85-L87. With 5 data points, -20086 * 5 = -100430, which is much closer. Are you using any priors?
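For a concrete check, here is a minimal sketch (not the original poster's script) of that in GPyTorch. It assumes a zero prior mean, uses double precision, and loosens the likelihood's default noise constraint so the tiny noise value from the question can be set:

```python
import torch
import gpytorch

train_x = torch.tensor([0.05, 0.15, 0.25, 0.35, 0.45], dtype=torch.float64).unsqueeze(-1)
train_y = torch.tensor([1.2708, 1.2936, 1.3351, 1.1834, 1.1347], dtype=torch.float64)

class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ZeroMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )

# Loosen the default noise lower bound so noise = 2e-8 is representable.
likelihood = gpytorch.likelihoods.GaussianLikelihood(
    noise_constraint=gpytorch.constraints.GreaterThan(1e-10)
).double()
model = ExactGPModel(train_x, train_y, likelihood).double()

# Hyperparameters quoted in the question.
model.covar_module.base_kernel.lengthscale = 1.0
model.covar_module.outputscale = 0.0161
likelihood.noise = 2e-8

model.train()
likelihood.train()
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)

with torch.no_grad():
    avg_mll = mll(model(train_x), train_y)  # divided by num_data internally

n = train_y.numel()
print(avg_mll.item())      # per-point value, roughly the -20086 quoted below
print(avg_mll.item() * n)  # summed value, roughly -100430
```

Multiplying the returned value by `n` is what makes it comparable to a summed log marginal likelihood.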
-
Hi,
I am using GPyTorch to fit a Gaussian process regression model. The fit looks fine when I test it on synthetic data with known hyperparameters. However, the marginal log likelihood GPyTorch reports is very different from the value I calculate using a Cholesky decomposition. I also compared the computed MLL with scikit-learn's calculation; my calculation and scikit-learn's agree with each other.
For example:
train_x = [0.0500, 0.1500, 0.2500, 0.3500, 0.4500]
train_y = [1.2708, 1.2936, 1.3351, 1.1834, 1.1347]
RBF kernel lengthscale = 1.0
output scale = 0.0161
noise = 2e-8
MLL (my calculation) = -96638.04
MLL (scikit-learn) = -97097.2
MLL (GPyTorch) = -20086
Do you know the reason for this big difference? Is something ignored in GPyTorch when calculating this value?
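For reference, a minimal sketch of the kind of Cholesky-based calculation described above (the original script is not shown; this assumes a zero prior mean and that "output scale" is the signal variance multiplying the RBF kernel), with scikit-learn's summed log marginal likelihood at the same fixed hyperparameters for comparison:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel

train_x = np.array([0.05, 0.15, 0.25, 0.35, 0.45]).reshape(-1, 1)
train_y = np.array([1.2708, 1.2936, 1.3351, 1.1834, 1.1347])
lengthscale, outputscale, noise = 1.0, 0.0161, 2e-8

# RBF kernel matrix with observation noise on the diagonal.
sqdist = (train_x - train_x.T) ** 2
K = outputscale * np.exp(-0.5 * sqdist / lengthscale**2) + noise * np.eye(len(train_y))

# log p(y|X) = -1/2 y^T K^{-1} y - 1/2 log|K| - (n/2) log(2*pi)
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, train_y))
mll_manual = (
    -0.5 * train_y @ alpha
    - np.log(np.diag(L)).sum()                # sum(log diag(L)) = 1/2 log|K|
    - 0.5 * len(train_y) * np.log(2 * np.pi)
)

# scikit-learn, with all hyperparameters held fixed at the same values.
kernel = ConstantKernel(outputscale, "fixed") * RBF(lengthscale, "fixed") \
    + WhiteKernel(noise, "fixed")
gpr = GaussianProcessRegressor(kernel=kernel, alpha=0.0, optimizer=None)
gpr.fit(train_x, train_y)

# Both are summed over all 5 points, so they should land near the
# -96638 / -97097 values quoted above, not near GPyTorch's per-point -20086.
print(mll_manual, gpr.log_marginal_likelihood_value_)
```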