A collection of Helion kernels, each paired with an equivalent PyTorch model and example inputs, for measuring and comparing their performance.

Requirements:
- Helion
- [optional] Intel XPU Backend for Triton
  - XPU support requires special nightly wheels of PyTorch and Triton

The available kernels follow the KernelBench categories:
- Level 1: Single-kernel operators - The foundational building blocks of neural nets
- Level 2: Simple fusion patterns - Cases where a single fused kernel would be faster than separate kernels
- Level 3: Full model architectures - Optimize entire model architectures end-to-end
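
The pairing of a kernel with a reference model and example inputs can be sketched as a small harness: check that both implementations agree, then time each. This is only an illustration under stated assumptions, not this repository's actual benchmark code. The names `compare`, `separate_ops`, and `fused_op` are hypothetical, `get_inputs` merely mirrors the KernelBench naming convention, and plain Python lists stand in for tensors so the sketch runs without PyTorch, Helion, or a GPU.

```python
import statistics
import time
from typing import Any, Callable, Dict, Sequence


def compare(
    reference: Callable[..., Any],
    candidate: Callable[..., Any],
    inputs: Sequence[Any],
    repeats: int = 20,
    warmup: int = 3,
) -> Dict[str, float]:
    """Check that candidate matches reference, then time both (hypothetical helper)."""
    # Correctness first: a fast kernel with wrong output is not a valid result.
    if reference(*inputs) != candidate(*inputs):
        raise ValueError("candidate output differs from reference")

    def median_seconds(fn: Callable[..., Any]) -> float:
        # Warm-up runs are discarded so one-time costs do not skew the timing.
        for _ in range(warmup):
            fn(*inputs)
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            fn(*inputs)
            samples.append(time.perf_counter() - start)
        return statistics.median(samples)

    ref_s = median_seconds(reference)
    cand_s = median_seconds(candidate)
    speedup = ref_s / cand_s if cand_s > 0 else float("inf")
    return {"reference_s": ref_s, "candidate_s": cand_s, "speedup": speedup}


# Stand-in for a Level 2 style pair: two separate passes vs. one fused pass.
def separate_ops(xs, scale):
    scaled = [x * scale for x in xs]      # pass 1: scale
    return [max(v, 0.0) for v in scaled]  # pass 2: ReLU


def fused_op(xs, scale):
    return [max(x * scale, 0.0) for x in xs]  # scale and ReLU in one pass


def get_inputs():
    # Example inputs for the pair above (the values are arbitrary).
    return [[float(i - 500) for i in range(1000)], 0.5]


if __name__ == "__main__":
    print(compare(separate_ops, fused_op, get_inputs()))
```

With real tensors, the exact equality check would likely give way to a tolerance-based comparison such as `torch.testing.assert_close`, and GPU timings would need device synchronization before each clock read.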