profile: GPU occupancy of batched SVD on small D matrices #24

@hunter-heidenreich

Description

Goal

Profile GPU utilization of torch.linalg.svd on batched small-D inputs (D=2, 3, 5) to find the effective batch size B at which GPU occupancy reaches a useful threshold, and document this as a usage recommendation.

Motivation

Batched SVD on (B, 3, 3) tensors is known to be GPU-occupancy-bound for small B: each 3×3 SVD problem is far too small to fill a warp (32 lanes), so most GPU threads sit idle. As a result, for small B the GPU may actually be slower than the CPU for the SVD step. Users deploying this library for per-sample alignment (B=1) or small-batch inference may unknowingly be running on a suboptimal device.
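The small-B penalty is easy to sanity-check with a wall-clock micro-benchmark before any profiling. A minimal sketch (the `time_svd` helper is illustrative, not part of the library, and absolute numbers depend on hardware):

```python
import time
import torch

def time_svd(B, D, device, iters=20):
    """Median wall time (seconds) of torch.linalg.svd on a (B, D, D) batch."""
    x = torch.randn(B, D, D, device=device)
    torch.linalg.svd(x)  # warm-up: exclude lazy init / first-launch cost
    if device == "cuda":
        torch.cuda.synchronize()
    times = []
    for _ in range(iters):
        t0 = time.perf_counter()
        torch.linalg.svd(x)
        if device == "cuda":
            torch.cuda.synchronize()  # SVD launches async; wait for the kernel
        times.append(time.perf_counter() - t0)
    return sorted(times)[len(times) // 2]

print(f"CPU, B=1, D=3: {time_svd(1, 3, 'cpu') * 1e6:.1f} us")
if torch.cuda.is_available():
    print(f"GPU, B=1, D=3: {time_svd(1, 3, 'cuda') * 1e6:.1f} us")
```

At B=1 the GPU time is typically dominated by kernel-launch overhead, which is exactly the effect this issue wants quantified.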

Experimental Design

Using the PyTorch profiler (or NVIDIA Nsight):

  1. Profile torch.linalg.svd on (B, 3, 3) for B in [1, 4, 16, 64, 256, 1024, 4096, 16384]
  2. Record: SM utilization (%), memory bandwidth utilization, kernel duration
  3. Compare against the same operation on CPU (device='cpu') -- find the GPU/CPU crossover in wall time
  4. Repeat for D=5 and D=10 to show how occupancy improves with matrix size
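Steps 1 and 2 can be driven from the built-in torch.profiler. A sketch (assumption: the PyTorch profiler reports kernel durations, while SM-utilization and memory-bandwidth counters must come from Nsight Compute instead):

```python
import torch
from torch.profiler import ProfilerActivity, profile

device = "cuda" if torch.cuda.is_available() else "cpu"
activities = [ProfilerActivity.CPU]
if device == "cuda":
    activities.append(ProfilerActivity.CUDA)

for B in [1, 4, 16, 64, 256, 1024, 4096, 16384]:
    x = torch.randn(B, 3, 3, device=device)
    torch.linalg.svd(x)  # warm-up so the profile excludes one-time costs
    with profile(activities=activities) as prof:
        torch.linalg.svd(x)
    table = prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=5)
    print(f"--- B={B} ---")
    print(table)
```

For the hardware-counter part of step 2, wrap the same loop with Nsight Compute (`ncu`), since SM utilization is not exposed through torch.profiler's summary tables.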

Expected Deliverables

  • Plot: GPU SM utilization vs. B for each D
  • Plot: GPU vs. CPU wall time for SVD vs. B -- mark crossover point
  • Written threshold recommendation (e.g. "GPU is beneficial for B > ~256 with D=3")
  • Documentation PR adding a "Performance Notes" section with this guidance
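For the wall-time crossover deliverable, one possible measurement loop. This is a sketch only: `median_time` and `crossover_batch` are hypothetical helpers, and the returned B will vary with the specific CPU/GPU pair:

```python
import time
import torch

def median_time(fn, iters=20):
    """Median wall time (seconds) of calling fn()."""
    ts = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        ts.append(time.perf_counter() - t0)
    return sorted(ts)[len(ts) // 2]

def crossover_batch(D=3, batches=(1, 4, 16, 64, 256, 1024, 4096, 16384)):
    """Smallest B at which the GPU beats the CPU, or None if it never does."""
    for B in batches:
        x_cpu = torch.randn(B, D, D)
        cpu_t = median_time(lambda: torch.linalg.svd(x_cpu))

        x_gpu = x_cpu.cuda()
        torch.linalg.svd(x_gpu)  # warm-up
        torch.cuda.synchronize()

        def gpu_run():
            torch.linalg.svd(x_gpu)
            torch.cuda.synchronize()  # include the full async kernel time

        gpu_t = median_time(gpu_run)
        if gpu_t < cpu_t:
            return B
    return None

if torch.cuda.is_available():
    print("GPU/CPU crossover for D=3: B =", crossover_batch())
```

The returned B, swept over D, is the number that should back the written threshold recommendation.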

Notes


    Labels

    benchmark -- Performance measurement or profiling
    framework:pytorch -- PyTorch-specific issue
    moderate -- Moderate impact, fix when possible
    performance -- Runtime performance improvement
