Hi,
How should I parameterise an OILMM if I want to optimise the hyperparameters whilst ensuring that the columns of U remain orthogonal?
I have the following setup which uses the orthogonal constraint from ParameterHandling:
```julia
using AbstractGPs
using KernelFunctions
using LinearAlgebra
using LinearMixingModels
using ParameterHandling

num_outputs = 11
num_latents = 3

x_train = KernelFunctions.MOInputIsotopicByOutputs(collect(1:100), num_outputs)
y_train = rand(100 * num_outputs)

H_init = rand(num_outputs, num_outputs)
U_, S_, V_ = svd(H_init)
U_init = U_[:, 1:num_latents]
S_init = S_[1:num_latents]

θ_oilmm = (;
    U = orthogonal(U_init),
    S = positive.(S_init),
)

function build_gp(θ)
    sogp = GP(Matern52Kernel())
    latent_gp = independent_mogp([sogp for _ in 1:num_latents])
    return ILMM(latent_gp, Orthogonal(θ.U, Diagonal(θ.S)))
end

function objective(θ)
    oilmm = build_gp(θ)
    return -logpdf(oilmm(x_train, 0.1), y_train)
end
```

But when I try to compute the gradient of the objective with this parameterisation,
```julia
using Zygote

flat_θ_oilmm, unflatten = flatten(θ_oilmm)
unpack = ParameterHandling.value ∘ unflatten
Zygote.gradient(objective ∘ unpack, flat_θ_oilmm)
```

the gradients of U are NaN (due to the orthogonal constraint).
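One possible explanation, which I have not confirmed: if `orthogonal` maps its argument to the nearest orthogonal matrix through an SVD (an assumption about ParameterHandling's internals), then differentiating that SVD at a point with repeated singular values could produce NaNs. The sketch below uses only LinearAlgebra and shows that `U_init`, as constructed above, sits at exactly such a point, because any matrix with orthonormal columns has all singular values equal to 1. The seed is arbitrary and only there for reproducibility.

```julia
using LinearAlgebra
using Random

Random.seed!(1)  # arbitrary seed, only for reproducibility

num_outputs = 11
num_latents = 3

# Same initialisation as in the setup above.
H_init = rand(num_outputs, num_outputs)
U_init = svd(H_init).U[:, 1:num_latents]

# The columns of U_init are orthonormal, so U_init' * U_init is the identity.
@show U_init' * U_init ≈ I(num_latents)

# As a consequence, every singular value of U_init equals 1 (up to rounding),
# i.e. the singular values are fully degenerate.
@show svdvals(U_init)
```

If that is the cause, initialising from a matrix that is not already orthogonal might avoid the degenerate point, but I have not tested this.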
What's the best way to set this up?