benchmark: Horn vs. Kabsch throughput for 3D inputs #23

## Goal

Directly compare the throughput of `horn` and `kabsch` on identical 3D inputs across every framework that supports both (PyTorch, JAX, TensorFlow, MLX).

## Background

For 3D inputs, both algorithms produce equivalent results (up to floating-point error), but via different linear algebra routines:

- **Kabsch**: cross-covariance H (3×3), then SVD of H (3×3)
- **Horn**: cross-covariance H (3×3), construct N matrix (4×4), then `eigh` of N (4×4)

`eigh` on a symmetric 4×4 matrix may be faster or slower than a full SVD of a 3×3 matrix, depending on each framework's LAPACK dispatch. Additionally, Horn involves more scalar arithmetic: the 9 entries of the quaternion-to-R conversion are expanded element-wise, whereas Kabsch needs a single matmul. The relative performance has not been measured and is undocumented.
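To make the extra scalar work in Horn concrete, here is a minimal pure-Python sketch of the two steps that have no counterpart in Kabsch: building the 4×4 matrix N from H, and expanding a unit quaternion into R. It assumes the textbook closed-form formulation with H[a][b] = Σ p_a q_b and a (w, x, y, z) quaternion layout; the function names are hypothetical and the library's own `horn` may order or batch things differently. The `eigh` call itself is omitted.

```python
import math


def horn_n_matrix(H):
    """Build the symmetric 4x4 matrix N from a 3x3 cross-covariance H.

    Assumption for this sketch: H[a][b] = sum_i p_i[a] * q_i[b], where p are
    source points and q are target points.
    """
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = H
    return [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]


def quat_to_rotation(q):
    """Expand a unit quaternion (w, x, y, z) into a 3x3 rotation matrix.

    Each of the 9 entries is computed element-wise.
    """
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def rayleigh(N, q):
    """Return q^T N q, the quantity that the top eigenvector of N maximizes."""
    return sum(q[i] * N[i][j] * q[j] for i in range(4) for j in range(4))
```

In a framework implementation the eigenvector of N with the largest eigenvalue would come from `eigh` and then be passed to the quaternion-to-R step; `rayleigh` is only a sanity check that a candidate quaternion attains the maximum.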

## Experimental Design

- Fix D=3, sweep B over [1, 16, 256, 4096] and N over [10, 100, 1000]
- For each (B, N): measure `kabsch` and `horn` wall-clock time (median over 100 runs after warmup)
- Test in both eager and compiled modes
- Report: speedup ratio, defined as horn time / kabsch time (>1 means kabsch is faster)
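The design above could be driven by a small framework-agnostic harness along these lines. This is a sketch, not the project's benchmark code: `median_time`, `sweep`, `make_inputs`, and the `sync` hook are hypothetical names, and the warmup count of 10 is an assumption, since the design only says "after warmup".

```python
import itertools
import statistics
import time

BATCH_SIZES = (1, 16, 256, 4096)
POINT_COUNTS = (10, 100, 1000)


def median_time(fn, *args, warmup=10, runs=100, sync=None):
    """Median wall-clock seconds of fn(*args) over `runs` calls after warmup.

    `sync` is an optional callable applied to the result so that frameworks
    with asynchronous dispatch finish the work before the timer stops.
    """
    def call():
        out = fn(*args)
        if sync is not None:
            sync(out)

    for _ in range(warmup):
        call()  # untimed: absorbs compilation and cache effects
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def sweep(kabsch_fn, horn_fn, make_inputs, **timing_kwargs):
    """Return {(B, N): horn time / kabsch time} over the full (B, N) grid."""
    ratios = {}
    for b, n in itertools.product(BATCH_SIZES, POINT_COUNTS):
        args = make_inputs(b, n)  # both algorithms see identical inputs
        t_kabsch = median_time(kabsch_fn, *args, **timing_kwargs)
        t_horn = median_time(horn_fn, *args, **timing_kwargs)
        ratios[(b, n)] = t_horn / t_kabsch
    return ratios
```

The `sync` hook matters for accelerator backends: with JAX one would likely pass `sync=jax.block_until_ready`, and on a CUDA device with PyTorch something like `sync=lambda _: torch.cuda.synchronize()`. Running the sweep once with eager functions and once with compiled ones covers both modes.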

## Expected Deliverables

- Table or heatmap: speedup ratio across (B, N) for each framework
- Determination of whether Horn is ever meaningfully faster than Kabsch in practice
- A recommendation in the docs about which to prefer when both are available
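
For the table deliverable, one simple option is to render the per-framework ratios as a Markdown table with B as rows and N as columns. The helper below is a hypothetical sketch; it assumes the ratios are stored in a dict keyed by `(B, N)`.

```python
def ratio_table(ratios, batch_sizes, point_counts):
    """Render {(B, N): ratio} as a Markdown table (rows = B, columns = N)."""
    header = "| B / N | " + " | ".join(str(n) for n in point_counts) + " |"
    rule = "|---|" + "---|" * len(point_counts)
    rows = [header, rule]
    for b in batch_sizes:
        cells = " | ".join(f"{ratios[(b, n)]:.2f}" for n in point_counts)
        rows.append(f"| {b} | {cells} |")
    return "\n".join(rows)
```

One table per framework and mode (eager, compiled) keeps the comparison readable; cells above 1 are the configurations where kabsch is faster.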
