
Adding ordinal likelihood #2639

Open
bendavidsteel wants to merge 6 commits into cornellius-gp:main from bendavidsteel:ordinal-likelihood

Conversation

@bendavidsteel

I needed an ordinal likelihood for some of my own work and saw Issue #2534, so I thought I'd make this contribution!
I've tested it a bit and it seems to work well. It's essentially the same approach as in GPflow, so nothing new.

Tests and docs provided.
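For readers unfamiliar with the GPflow-style construction referenced above: the probability of ordinal class k is the Gaussian mass between two adjacent cutpoints around the latent function value. A minimal sketch of that idea (the function name `ordinal_probs` and the shapes are illustrative, not the PR's actual API):

```python
import torch

def ordinal_probs(f: torch.Tensor, bin_edges: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """GPflow-style ordinal class probabilities.

    P(y = k | f) = Phi((b_k - f) / sigma) - Phi((b_{k-1} - f) / sigma),
    with implicit outer edges b_0 = -inf and b_K = +inf.
    """
    normal = torch.distributions.Normal(0.0, 1.0)
    scaled = (bin_edges - f.unsqueeze(-1)) / sigma                    # (..., K-1)
    cdf = normal.cdf(scaled)
    left = torch.cat([torch.zeros_like(cdf[..., :1]), cdf], dim=-1)   # CDF at lower edges
    right = torch.cat([cdf, torch.ones_like(cdf[..., :1])], dim=-1)   # CDF at upper edges
    return right - left                                               # (..., K), rows sum to 1
```

Because adjacent CDF values telescope, each row of the result is a valid probability vector over the K bins.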

Member

@gpleiss gpleiss left a comment


Looks mostly good to me; see one comment.

@bendavidsteel
Author

Comment addressed

@bendavidsteel
Author

This should be ready to merge!

@rick-osmo

I could use this. @gpleiss @bendavidsteel, do you need any help getting this merged?

@bendavidsteel
Author

I believe we're just waiting on approval from @gpleiss

@j-adamczyk

@gpleiss @bendavidsteel how about merging this? GPOR would be enormously useful for our research.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Returns:
Probabilities between jitter and 1-jitter
"""
return 0.5 * (1.0 + torch.erf(x / torch.sqrt(torch.tensor(2.0)))) * (1 - 2 * jitter) + jitter
Collaborator


If x is a GPU tensor, then this line will trigger a device error as torch.tensor(2.0) is always a CPU tensor.

Suggested change
return 0.5 * (1.0 + torch.erf(x / torch.sqrt(torch.tensor(2.0)))) * (1 - 2 * jitter) + jitter
return 0.5 * (1.0 + torch.erf(x / math.sqrt(2.0))) * (1 - 2 * jitter) + jitter

Author


I changed it to `torch.tensor(2.0, device=x.device)` to keep everything in torch and fix the device error
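For reference, the reviewer's `math.sqrt` variant also resolves the device error, since dividing a tensor by a plain Python float never materializes a CPU tensor. A self-contained sketch of that version (the jitter-squashing mirrors the snippet under review):

```python
import math
import torch

def inv_probit(x: torch.Tensor, jitter: float = 1e-3) -> torch.Tensor:
    """Standard normal CDF, squashed into (jitter, 1 - jitter) for stability.

    math.sqrt(2.0) is a Python float, so the division stays on x's
    device and dtype -- no separate CPU tensor is ever created.
    """
    return 0.5 * (1.0 + torch.erf(x / math.sqrt(2.0))) * (1 - 2 * jitter) + jitter
```

Either fix works; the float-scalar version additionally avoids allocating a tensor on every call.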

Comment on lines +102 to +103
scaled_edges_left = scaled_edges_left.reshape(1, -1)
scaled_edges_right = scaled_edges_right.reshape(1, -1)
Collaborator


Will these two lines work in batch settings where the batch shape is non-empty?

Author


Fixed + added test to confirm
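The batching issue the reviewer flags is that `reshape(1, -1)` only lines up with 2-d function values, while a trailing `unsqueeze` on the function values broadcasts against any batch shape. A standalone sketch of the difference (function and variable names are illustrative):

```python
import torch

def scale_edges(f: torch.Tensor, edges: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Broadcast bin edges against function values of shape (*batch, N).

    f.unsqueeze(-1) has shape (*batch, N, 1) and edges has shape (K-1,),
    so the subtraction broadcasts to (*batch, N, K-1) for any batch shape,
    whereas edges.reshape(1, -1) only aligns when f is at most 2-d.
    """
    return (edges - f.unsqueeze(-1)) / sigma

# works for empty and non-empty batch shapes alike
edges = torch.tensor([-1.0, 0.0, 1.0])
assert scale_edges(torch.randn(5), edges).shape == (5, 3)
assert scale_edges(torch.randn(3, 2, 5), edges).shape == (3, 2, 5, 3)
```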

super().__init__()

self.num_bins = len(bin_edges) + 1
self.register_parameter("bin_edges", torch.nn.Parameter(bin_edges, requires_grad=False))
Collaborator


Suggested change
self.register_parameter("bin_edges", torch.nn.Parameter(bin_edges, requires_grad=False))
self.register_buffer("bin_edges", bin_edges)

nit: I think it makes more sense to register this as a buffer instead since we won't update the bin edges?

On the flip side, does it make sense to set requires_grad=True so that we learn the bin edges during model fitting? (Some packages choose to do so; see here.) IIUC, we only learn sigma here but the bin edges are fixed. I am wondering if this could limit the expressiveness of the likelihood.

Author


I changed the code to allow for learnable edges but default to fixed
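The buffer-vs-parameter choice discussed above can be sketched as follows (the class name `OrdinalHead` is illustrative, not the PR's actual class; `learn_edges` mirrors the option described). A buffer moves with `.to()` / `.cuda()` and is saved in `state_dict`, but receives no gradients:

```python
import torch

class OrdinalHead(torch.nn.Module):
    """Register bin edges as a buffer when fixed, or a Parameter when learned."""

    def __init__(self, bin_edges: torch.Tensor, learn_edges: bool = False):
        super().__init__()
        self.num_bins = len(bin_edges) + 1
        if learn_edges:
            # edges participate in optimization alongside other hyperparameters
            self.register_parameter("bin_edges", torch.nn.Parameter(bin_edges))
        else:
            # edges follow device/dtype moves but stay fixed during fitting
            self.register_buffer("bin_edges", bin_edges)
```

One caveat with learnable edges (not handled in this sketch): the optimizer can reorder them, so implementations often parameterize positive gaps between edges instead of the raw edge positions.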

from .likelihood import _OneDimensionalLikelihood


def inv_probit(x, jitter=1e-3):
Collaborator


Suggested change
def inv_probit(x, jitter=1e-3):
def inv_probit(x: Tensor, jitter: float = 1e-3):

Let's annotate these variables.

Comment on lines +89 to +92
def _set_sigma(self, value: Tensor) -> None:
if not torch.is_tensor(value):
value = torch.as_tensor(value).to(self.raw_sigma)
self.initialize(raw_sigma=self.raw_sigma_constraint.inverse_transform(value))
Collaborator


nit: We've already annotated `value` as a `Tensor`, so we could drop the if-statement here. Also, maybe we could merge this method with the `sigma` setter method above?
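The merged-setter pattern might look like the following generic sketch, with softplus standing in for gpytorch's actual `raw_sigma` / `raw_sigma_constraint` machinery (the class and all details here are illustrative assumptions, not the PR's code):

```python
import torch
import torch.nn.functional as F

class HasSigma(torch.nn.Module):
    """Positive noise scale stored as an unconstrained raw parameter."""

    def __init__(self, sigma: float = 1.0):
        super().__init__()
        self.raw_sigma = torch.nn.Parameter(torch.zeros(1))
        self.sigma = sigma  # route initialization through the setter

    @property
    def sigma(self) -> torch.Tensor:
        return F.softplus(self.raw_sigma)

    @sigma.setter
    def sigma(self, value) -> None:
        # torch.as_tensor handles floats and tensors alike, so no
        # is_tensor branch is needed; then apply the inverse softplus
        # log(exp(v) - 1) = v + log(-expm1(-v)) in a stable form.
        value = torch.as_tensor(value, dtype=self.raw_sigma.dtype)
        raw = value + torch.log(-torch.expm1(-value))
        with torch.no_grad():
            self.raw_sigma.copy_(raw)
```

Folding the conversion into the setter keeps a single entry point for both user-facing assignment and internal initialization.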

@kayween
Collaborator

kayween commented Feb 26, 2026

I've merged the latest main into this PR. Multiple users have expressed interest in the ordinal likelihood implementation, so it would be great to get this merged.

It's been a while since this PR was opened. @bendavidsteel, apologies for the delay on our end! I left some additional comments. I am wondering if you still have the capacity to work on this PR.

bendavidsteel and others added 2 commits March 2, 2026 16:11
- Fix GPU device error in inv_probit by passing device to torch.tensor
- Add type annotations to inv_probit signature
- Add learn_edges parameter to control whether bin edges are learnable
- Merge _set_sigma into sigma setter, remove redundant method
- Fix forward to support non-empty batch shapes (unsqueeze instead of reshape)
- Add tests for batched likelihood and GPU device

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@bendavidsteel
Author

Addressed comments @kayween
