Indexing error makes many fingerprint kernel implementations incorrect

In a variety of different fingerprint kernels, `x1` and `x2` are compared with something like

```
x1[-1] <something> x2[-1]
```

which only compares the last element in the kernel matrix, and **does not do a pairwise comparison**. This causes many kernels to not implement the function claimed in their docstring. Specifically, I think the following kernels are wrong:

- Braun-Blanquet: the max(x1, x2) in the denominator is not computed correctly here: https://github.com/leojklarner/gauche/blob/3380401387873e826a81454ab22ab452dca41ae9/gauche/kernels/fingerprint_kernels/braun_blanquet_kernel.py#L37
- Faith: number of matching 0s (`d`) is only computed with the last element: https://github.com/leojklarner/gauche/blob/3380401387873e826a81454ab22ab452dca41ae9/gauche/kernels/fingerprint_kernels/faith_kernel.py#L35
- Intersection kernel: same problem as Faith
- Rand: same problem as Faith
- Rogers-Tanimoto: same problem as Faith

Here is a minimal example showing the problem:

```python
import torch
from gauche.kernels.fingerprint_kernels.rand_kernel import batch_rand_sim

t = torch.as_tensor([[1, 1, 0., 0.], [1., 0., 1., 0.]])
print(batch_rand_sim(t, t)[0, 1])
print(batch_rand_sim(t[:1], t[1:]))
```

Output:

```
tensor(0.7500, dtype=torch.float64)
tensor([[0.5000]], dtype=torch.float64)
```

What this shows: if I compute a whole kernel matrix and take element [0, -1], it gives a different result than directly computing element [0, -1]. This is because `d` is computed just using the last element in each batch, rather than computed separately for each pair.

I think this was not caught in the tests because they just tested `1xD` matrices. I suspect that a lot of the "batch" similarities also don't work for batches of matrices.

If you can't fix this promptly, I suggest taking these methods out of Gauche and re-releasing. You can add corrected methods back in later.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Indexing error makes many fingerprint kernel implementations incorrect #74

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Indexing error makes many fingerprint kernel implementations incorrect #74

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions