How should the OILMM be parameterised when using AD?

Hi,

How should I parameterise an OILMM if I want to optimise the hyperparameters whilst ensuring that the columns of `U` remain orthogonal?

I have the following setup which uses the `orthogonal` constraint from `ParameterHandling`:
```julia
using AbstractGPs
using KernelFunctions
using LinearAlgebra
using LinearMixingModels
using ParameterHandling

num_outputs = 11
num_latents = 3

x_train = KernelFunctions.MOInputIsotopicByOutputs(collect(1:100), num_outputs)
y_train = rand(100 * num_outputs)

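# Initialise `U` from the leading left singular vectors of a random matrix,
# so that its columns are orthonormal to begin with.
H_init = rand(num_outputs, num_outputs)
U_, S_, V_ = svd(H_init)
U_init = U_[:, 1:num_latents]
S_init = S_[1:num_latents]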

θ_oilmm = (;
    U = orthogonal(U_init),
    S = positive.(S_init),
)

function build_gp(θ)
    sogp = GP(Matern52Kernel())
    latent_gp = independent_mogp([sogp for _ in 1:num_latents])
    return ILMM(latent_gp, Orthogonal(θ.U, Diagonal(θ.S)))
end

function objective(θ)
    oilmm = build_gp(θ)
    return -logpdf(oilmm(x_train, 0.1), y_train)
end
```

However, when I try to compute the gradient of the objective with this parameterisation,
```julia
using Zygote

flat_θ_oilmm, unflatten = flatten(θ_oilmm)
unpack = ParameterHandling.value ∘ unflatten

Zygote.gradient(objective ∘ unpack, flat_θ_oilmm)
```
the gradient entries for `U` are `NaN` (due to the `orthogonal` constraint).
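
For reference, the orthogonality requirement itself can be checked independently of AD. Below is a minimal sketch using only `LinearAlgebra`; the helper name `has_orthonormal_columns` and the tolerance are made up for illustration and are not part of any of the packages above:

```julia
using LinearAlgebra

# Hypothetical helper: true when the columns of `U` are orthonormal,
# i.e. U'U equals the identity up to round-off.
function has_orthonormal_columns(U::AbstractMatrix; atol=1e-8)
    k = size(U, 2)
    return isapprox(U' * U, Matrix{Float64}(I, k, k); atol=atol)
end

# Same construction as in the setup above.
num_outputs, num_latents = 11, 3
U_full, _, _ = svd(rand(num_outputs, num_outputs))
U_init = U_full[:, 1:num_latents]

has_orthonormal_columns(U_init)                          # true: left singular vectors
has_orthonormal_columns(ones(num_outputs, num_latents))  # false: all-ones matrix
```

The same check could presumably be applied to the `U` field of `unpack(flat_θ_oilmm)` after each optimiser step, to confirm that whichever parameterisation is used keeps the columns orthonormal.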

What's the best way to set this up?
