[Feature]: Add comprehensive throughput benchmarks for kornia-tensor-ops

### 🚀 Feature Description

Add throughput benchmarks for the 11 operations in kornia-tensor-ops that currently have none:
add, sub, mul, div, min, mul_scalar, abs, powf, powi, mean, and sum_elements.
At present, only dot_product and cosine_similarity are benchmarked.

### 📂 Feature Category

Performance Optimization

### 💡 Motivation

The GSoC 2026 GPU acceleration project requires CPU performance baselines before 
GPU kernels can be written and speedups verified. Without benchmarks for these ops, 
there is no way to measure the impact of GPU acceleration when it lands.

### 💭 Proposed Solution

Add ~220 lines to crates/kornia-tensor-ops/benches/bench_ops.rs covering all 11
unbenchmarked ops. Use Throughput::Bytes (not Elements) so results are directly
comparable to hardware bandwidth specs and usable in roofline analysis. Test sizes:
[64, 512, 4096, 65536], spanning sub-cache through bandwidth-bound regimes.
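As a rough illustration of the byte accounting this implies, the sketch below computes the value that would be passed to Criterion's `Throughput::Bytes` for each proposed size. It uses only the standard library. Several things here are assumptions rather than part of the proposal: the `OpKind` grouping and the `bytes_per_iter` helper are hypothetical names, the sizes are treated as element counts, elements are taken to be `f32`, `min` is treated as an elementwise binary op, and both reads and writes are counted.

```rust
use std::mem::size_of;

/// Hypothetical grouping of the eleven ops by memory traffic (an assumption).
#[derive(Clone, Copy, Debug, PartialEq)]
enum OpKind {
    /// Two input tensors, one output tensor: add, sub, mul, div, min.
    Binary,
    /// One input tensor, one output tensor: mul_scalar, abs, powf, powi.
    Unary,
    /// One input tensor, scalar result: mean, sum_elements.
    Reduction,
}

/// Bytes read plus bytes written by one call on `n` elements of type `T`.
/// The scalar result of a reduction is ignored as negligible.
fn bytes_per_iter<T>(kind: OpKind, n: usize) -> u64 {
    let tensors_touched = match kind {
        OpKind::Binary => 3,
        OpKind::Unary => 2,
        OpKind::Reduction => 1,
    };
    (tensors_touched * n * size_of::<T>()) as u64
}

fn main() {
    // Sizes from the proposal, assumed here to be element counts.
    const SIZES: [usize; 4] = [64, 512, 4096, 65536];
    for n in SIZES {
        for kind in [OpKind::Binary, OpKind::Unary, OpKind::Reduction] {
            // In a Criterion bench this value would feed Throughput::Bytes.
            println!("{:?} n={} bytes={}", kind, n, bytes_per_iter::<f32>(kind, n));
        }
    }
}
```

Whether to count only input bytes or inputs plus outputs is a design choice; whichever convention is picked should be documented in the bench file so that later GPU numbers use the same one.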

### 📚 Library Reference

Criterion.rs benchmarking library (already a dependency). Pattern follows existing 
bench_dot_product1 and bench_cosine_similarity in bench_ops.rs.

### 🔄 Alternatives Considered

Throughput::Elements could be used instead of Bytes, but Bytes allows direct comparison
to DRAM bandwidth in MB/s, which is the standard unit in roofline modeling.
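One detail worth checking when comparing against hardware specs is the unit. As far as I know, Criterion reports `Throughput::Bytes` in binary units (KiB/s, MiB/s, GiB/s), while vendor bandwidth figures are usually decimal; recent Criterion versions also offer `Throughput::BytesDecimal`. The sketch below shows the difference with standard-library code only. The helper names `mb_per_s` and `mib_per_s` are hypothetical, and the timing values are made up for illustration, not measurements.

```rust
use std::time::Duration;

/// Decimal megabytes per second (1 MB = 1_000_000 bytes).
fn mb_per_s(bytes: u64, elapsed: Duration) -> f64 {
    bytes as f64 / elapsed.as_secs_f64() / 1_000_000.0
}

/// Binary mebibytes per second (1 MiB = 1_048_576 bytes).
fn mib_per_s(bytes: u64, elapsed: Duration) -> f64 {
    bytes as f64 / elapsed.as_secs_f64() / (1024.0 * 1024.0)
}

fn main() {
    // Illustrative numbers only: 8_000_000 bytes moved in 2 seconds.
    let bytes = 8_000_000u64;
    let elapsed = Duration::from_secs(2);
    println!("{:.3} MB/s", mb_per_s(bytes, elapsed));
    println!("{:.3} MiB/s", mib_per_s(bytes, elapsed));
}
```

The two figures differ by about 4.9% at the mega scale, so the roofline chart should state which unit it uses.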

### 🎯 Use Cases

When GPU backends are added to kornia-tensor, corresponding GPU benchmarks can reuse 
this metric and speedup will be immediately visible as MB/s on the same roofline chart.

### 📝 Additional Context

I am a GSoC 2026 applicant targeting the GPU acceleration project. @edgarriba

### 🤝 Contribution Intent

- [x] I plan to submit a PR to implement this feature
- [ ] I'm requesting this feature but not planning to implement it

[Feature]: Add comprehensive throughput benchmarks for kornia-tensor-ops #807
