You said you might be doing something wrong with raw/transformed parameters, and that does indeed appear to be the case. In your example Mathematica code, you're differentiating directly with respect to \ell, but in GPyTorch you're differentiating with respect to the inverse softplus of \ell (i.e., the raw lengthscale).

In other words, GPyTorch's version of the RBF kernel is basically exp[-dist^2 / (2 * softplus(raw_ls)^2)], and what you're computing is the derivative with respect to raw_ls rather than with respect to \ell itself.
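A minimal PyTorch sketch of the point above (assuming the default softplus constraint; in GPyTorch the transform is configurable via the kernel's `raw_lengthscale_constraint`, so this mimics the default rather than calling GPyTorch directly). Autograd on the raw parameter gives dk/d(raw_ls); dividing by the softplus derivative, sigmoid(raw_ls), recovers dk/d\ell via the chain rule:

```python
import torch

d = torch.tensor(0.7)                      # a fixed pairwise distance
raw_ls = torch.tensor(0.3, requires_grad=True)

ls = torch.nn.functional.softplus(raw_ls)  # transformed (actual) lengthscale
k = torch.exp(-d**2 / (2 * ls**2))         # RBF kernel value

# Autograd here returns dk/d(raw_ls) -- the derivative w.r.t. the RAW parameter.
(grad_raw,) = torch.autograd.grad(k, raw_ls)

# Chain rule: dk/d(ls) = dk/d(raw_ls) / (d softplus(raw_ls) / d raw_ls),
# and the softplus derivative is sigmoid(raw_ls).
grad_ls = grad_raw / torch.sigmoid(raw_ls)

# Analytic derivative w.r.t. the actual lengthscale, for comparison:
# dk/d(ls) = k * d^2 / ls^3
analytic = k * d**2 / ls**3
assert torch.allclose(grad_ls, analytic.detach())
```

The same correction applies to any constrained hyperparameter: convert raw-parameter gradients through the derivative of the constraint transform before comparing against derivatives taken with respect to the transformed value.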

Answer selected by bcolloran