benchmark: quantify MLX CPU round-trip overhead from stream=mx.cpu SVD #22

@hunter-heidenreich

Description

Goal

Quantify the wall-clock overhead introduced by forcing SVD to the CPU in mlx/kabsch_svd_nd.py and mlx/horn_quat_3d.py.

Background

Both MLX modules pin SVD to the CPU:

# mlx/kabsch_svd_nd.py:11
U, S, Vt = mx.linalg.svd(A, stream=mx.cpu)

This is necessary because MLX's GPU backend does not implement SVD. However, even on Apple Silicon's unified memory, where no physical copy is required, switching to the CPU stream forces a synchronization point and a CPU dispatch on every call for inputs otherwise resident on the GPU stream. The magnitude of this penalty is currently unknown.

Experimental Design

Isolate and measure three things:

  1. SVD step alone: time mx.linalg.svd(H, stream=mx.cpu) in isolation for inputs of shape [B, 3, 3] across a range of batch sizes B
  2. Full kabsch call: total wall-clock time including SVD
  3. Hypothetical GPU SVD: substitute a no-op (or identity) in place of SVD to measure the non-SVD portion

From (1) and (2), compute the fraction of total time spent in the forced-CPU SVD step as a function of B.
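A minimal harness for steps (1) and (3) could look like the sketch below. It uses NumPy as a portable stand-in so it runs anywhere; on the target M-series hardware, replace np.linalg.svd with mx.linalg.svd(H, stream=mx.cpu) and wrap each timed call in mx.eval to force evaluation (MLX is lazy, so untimed graph construction would otherwise make the numbers meaningless). The identity_svd helper is a hypothetical stand-in introduced here, not part of the repo.

```python
import time
import numpy as np

def median_time(fn, repeats=20, warmup=3):
    """Median wall-clock seconds of fn() over `repeats` timed runs."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - t0)
    return sorted(samples)[len(samples) // 2]

def identity_svd(H):
    """Stand-in for a hypothetical GPU SVD: returns identity factors of the
    right shapes, so only dispatch/allocation cost remains (step 3)."""
    B = H.shape[0]
    eye = np.broadcast_to(np.eye(3), (B, 3, 3))
    return eye, np.ones((B, 3)), eye

rng = np.random.default_rng(0)
for B in (1, 16, 256, 4096):
    H = rng.normal(size=(B, 3, 3))
    t_svd = median_time(lambda: np.linalg.svd(H))   # step (1): SVD alone
    t_noop = median_time(lambda: identity_svd(H))   # step (3): non-SVD floor
    frac = t_svd / (t_svd + t_noop)                 # proxy for SVD fraction
    print(f"B={B:5d}  svd={t_svd * 1e6:9.1f} us  fraction={frac:.2f}")
```

Step (2), the full kabsch call, would be timed the same way with median_time wrapped around the real entry point in mlx/kabsch_svd_nd.py.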

Expected Deliverables

  • Plot: SVD fraction of total call time vs. B
  • Absolute numbers: SVD latency for B = [1, 16, 256, 4096]
  • Written assessment: at what B does the CPU SVD become the clear bottleneck?
  • Recommendation for the docs on the known limitation and any workarounds (e.g. batching strategy)
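For the batching workaround, the docs recommendation could be illustrated with a sketch like this (again with NumPy as a stand-in; on device, both functions would call mx.linalg.svd with stream=mx.cpu). The point is that one batched dispatch over [B, 3, 3] pays the forced CPU round trip once, while a per-matrix loop pays it B times:

```python
import numpy as np

def svd_per_call(matrices):
    # One SVD dispatch per 3x3 matrix: under stream=mx.cpu, each call
    # would pay its own GPU->CPU synchronization.
    return [np.linalg.svd(H) for H in matrices]

def svd_batched(matrices):
    # Stack into [B, 3, 3] and dispatch once: the forced CPU round trip
    # is paid a single time and amortized over the whole batch.
    return np.linalg.svd(np.stack(matrices))
```

Whether this pays off below some B is exactly what the benchmark should establish.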

Notes

  • This benchmark should be run on Apple Silicon (M-series) hardware where MLX is the intended target
  • If/when MLX adds GPU SVD support, this benchmark will serve as the baseline for measuring the improvement

Metadata

Labels: benchmark (performance measurement or profiling), framework:mlx (MLX-specific issue), moderate (moderate impact, fix when possible), performance (runtime performance improvement)
