Addition of discrepancy GP emulators #956
cwlanyon wants to merge 3 commits into alan-turing-institute:main from
Conversation
Added code for discrepancy GPs in GaussianProcess/exact. This includes code for new mean and covariance functions, a Discrepancy GP subclass, and a createDGPSubclass function.
@cwlanyon thank you so much for opening this PR! I just wanted to let you know that reviewing it is on our TODO list, but we are currently working towards a tight deadline on our project, so we have not had the capacity. We hope to get to it very soon.

Hi Radka, no problem! I'll see if I can get pre-commit to work in the meantime. Hope your deadline goes ok!
@cwlanyon I had a look at the pre-commit issues and will add comments here highlighting what needs changing. I will look at the rest of the code in more detail early next week :) |
radka-j left a comment
This should be all the pre-commit issues. As you can see, most of it is just enforcing strict stylistic guidelines (e.g., max line lengths, docstring conventions). There is absolutely no rush on addressing them before we do a more thorough review of the whole code but hopefully it's useful in explaining what kinds of things get picked up.
The main thing to go over is the unused method arguments: decide whether these should be removed, or whether there are possible future use cases or compatibility issues with the rest of the code base that mean we should keep them.
| """ | ||
| Discrepancy Gaussian Process Emulator. | ||
|
|
||
| This class implements an exact Discrepancy Gaussian Process emulator using the GPyTorch library |
This line is too long (limit is 88 characters)
Suggested change:

```python
This class implements an exact Discrepancy Gaussian Process emulator using the
GPyTorch library
```
```python
Discrepancy GPs are a transfer learning method that require a pre-existing cohort of emulators for similar systems.
```
Line too long
Suggested change:

```python
Discrepancy GPs are a transfer learning method that require a pre-existing cohort of
emulators for similar systems.
```
```python
likelihood_cls: type[MultitaskGaussianLikelihood] = MultitaskGaussianLikelihood,
mean_module_fn: MeanModuleFn = constant_mean,
covar_module_fn: CovarModuleFn = rbf_plus_constant,
fixed_mean_params: bool = False,
fixed_covar_params: bool = False,
```
A number of these parameters are not used anywhere. Can they be removed? If not (for compatibility reasons), they should be saved as class attributes within the init.
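As a sketch of the second option (keeping the arguments and storing them in `__init__`), the pattern might look like this; the class name follows the PR, but the reduced signature is purely illustrative:

```python
class DiscrepancyGaussianProcess:
    """Illustrative stub: only the two flagged arguments are shown."""

    def __init__(
        self,
        fixed_mean_params: bool = False,
        fixed_covar_params: bool = False,
    ):
        # Currently unused, but saved as attributes so the public signature
        # stays stable and the values remain available for future use.
        self.fixed_mean_params = fixed_mean_params
        self.fixed_covar_params = fixed_covar_params
```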
```python
self.x_transform = StandardizeTransform() if standardize_x else None
self.y_transform = StandardizeTransform() if standardize_y else None
```
```python
self.covar_module = covar_module  # gpytorch.kernels.RBFKernel(ard_num_dims=n_features,batch_shape=num_tasks_torch)
```
line too long, can be fixed by moving comment above the line
Suggested change:

```python
# gpytorch.kernels.RBFKernel(ard_num_dims=n_features,batch_shape=num_tasks_torch)
self.covar_module = covar_module
```
```python
for i in range(
    len(self.ref_model)
):  # Iterate over ref_models to build additive kernel TODO: Put this loop inside of the aKernel function
```
again, this line is too long, try splitting over 2 lines
Suggested change:

```python
):  # Iterate over ref_models to build additive kernel
    # TODO: Put this loop inside of the aKernel function
```
```python
self.ref_model = ref_model
self.ref_likelihood = ref_likelihood

def forward(self, x1, x2, **params):
```
missing docstring in public method
| """ | ||
| Custom mean for Discrepancy GPs. | ||
| """ |
One-line docstring should fit on one line
| """ | |
| Custom mean for Discrepancy GPs. | |
| """ | |
| """Custom mean for Discrepancy GPs.""" |
```python
def __init__(
```
arguments input_size, batch_shape and bias are not used in the method
```python
ref_model,
ref_likelihood,
a,
batch_shape=torch.Size(),
```
Function calls (`torch.Size()`) should not be used in argument defaults. One way to get around this would be to set `batch_shape=None` as the default and then, within the method, do something like:

```python
if batch_shape is None:
    batch_shape = torch.Size()
```

```python
self.ref_likelihood = ref_likelihood
self.mean_module = mean_module

def forward(self, x):
```
missing docstring in public method
sgreenbury left a comment
Thanks very much @cwlanyon for the contribution. I think this is looking great!
I've left some initial comments below that mostly relate to solving the issues raised by pre-commit. Happy to discuss any of those changes on a call if helpful, as well as next steps such as adding tests.
```python
self.to(self.device)


class aKernel(gpytorch.kernels.Kernel):
```
Perhaps this class could be given a more descriptive name and be CapWords?
```python
length_prior=None,
length_constraint=None,
**kwargs,
):
```
Would it be possible to add a docstring here for the init?
```python
a,
ref_model,
ref_likelihood,
```
If possible it would be great to add type hints for these parameters. This might also help solve any issues raised with pyright in the pre-commit.
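A hypothetical type-hinted version of such a signature might look as follows. The exact types are assumptions (string forward references are used so the sketch runs without torch/gpytorch installed), and the class name is invented:

```python
class DiscrepancyMean:
    """Illustrative stub showing type hints for the flagged parameters."""

    def __init__(
        self,
        a: "torch.Tensor",
        ref_model: "gpytorch.models.ExactGP",
        ref_likelihood: "gpytorch.likelihoods.Likelihood",
    ) -> None:
        self.a = a
        self.ref_model = ref_model
        self.ref_likelihood = ref_likelihood
```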
```python
input_size,
mean_module,
ref_model,
ref_likelihood,
a,
batch_shape=torch.Size(),
```
As mentioned above, it would be great to add type hints in the function signatures.
```python
self.ref_model = ref_model
self.ref_likelihood = ref_likelihood

def forward(self, x1, x2, **params):
```
Are `**params` needed in the signature here? Perhaps they can be removed.
Just noting: it would be good to remove `.ipynb_checkpoints` before the final merge.
Discrepancy emulators are a class of GP emulators that perform transfer learning across a cohort of existing simulators/digital twins.

Suppose you have a cohort of computational digital twins, $f_1,\dots,f_N$, where each $f_n$ represents an individual real-world system, and for each $f_n$ you have a Gaussian process emulator, $g_n$. If we want to generate an emulator for a similar system $f_{N+1}$, rather than generating a brand new emulator we can instead emulate the discrepancy between $f_{N+1}$ and the existing cohort:

$$f_{N+1}(x) = \sum_{n=1}^{N} a_n \, g_n(x) + \delta(x),$$

where $a$ are some weights to learn and $\delta$ is the discrepancy, which we model as a GP with its own independent mean and covariance functions. If $f_{N+1}$ is sufficiently similar to the rest of the cohort, it should require fewer observations to learn $\delta$ than to learn an independent emulator.
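As a toy illustration of this idea (pure NumPy; the cohort functions, the new system, and the kernel choices are all invented for this sketch and do not reflect the PR's GPyTorch implementation):

```python
import numpy as np

# Hypothetical cohort of emulator mean functions g_n for similar systems.
cohort = [np.sin, np.cos, lambda x: 0.5 * np.sin(2 * x)]

def f_new(x):
    # The new system: close to a weighted mix of the cohort plus a small offset.
    return 0.8 * np.sin(x) + 0.3 * np.cos(x) + 0.1

# A few observations of the new system (the point of discrepancy GPs is that
# few should be needed when the cohort is informative).
x_obs = np.linspace(0.0, 2 * np.pi, 8)
y_obs = f_new(x_obs)

# Step 1: learn the weights a over the cohort by least squares.
G = np.column_stack([g(x_obs) for g in cohort])
a, *_ = np.linalg.lstsq(G, y_obs, rcond=None)

# Step 2: model the residual (the discrepancy delta) with a simple
# RBF-kernel GP regression, noise-free up to a small jitter.
def rbf(x1, x2, ls=1.0):
    return np.exp(-0.5 * (x1[:, None] - x2[None, :]) ** 2 / ls**2)

resid = y_obs - G @ a
K = rbf(x_obs, x_obs) + 1e-6 * np.eye(len(x_obs))
alpha = np.linalg.solve(K, resid)

def predict(x_star):
    """Cohort combination plus GP posterior mean of the discrepancy."""
    G_star = np.column_stack([g(x_star) for g in cohort])
    return G_star @ a + rbf(x_star, x_obs) @ alpha

x_test = np.linspace(0.0, 2 * np.pi, 50)
err = np.max(np.abs(predict(x_test) - f_new(x_test)))
```

With only eight observations, the combined cohort-plus-discrepancy predictor closely tracks the new system, since most of the structure is already captured by the weighted cohort.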
See: https://link.springer.com/article/10.1007/s10439-025-03890-0
To implement discrepancy GPs in AutoEmulate, I have amended autoemulate/autoemulate/emulators/gaussian_process/exact.py to include a DiscrepancyGaussianProcess class and a create_dgp_subclass function, mirroring the existing functionality for creating Gaussian process emulators in AutoEmulate.