Commit 2d97782
TinySemVer
Release: v0.2.0 [skip ci]
### Minor
- Add: Latency Hiding & Port Interleaving (086f8d7)
- Add: AMX kernels (0cb024d)
- Add: Inline Assembly kernels (89095a6)
- Add: BLAS & Eigen TOPs benchmarks (28ca39b)
- Add: AVX2 & low-precision AVX-512 TOPS (0a48108)
- Add: `i8`, `f16`, and `bf16` kernels (3f54200)
- Add: Arm NEON FMAs (d0e521e)
- Add: `vfmadd231ps` kernels (7ca3161)
- Add: Assembly micro-kernels (2e71e76)
### Patch
- Docs: Zen4 matmul-benchmarks (2476310)
- Docs: H100 Tensor Cores vs Intel (fa86663)
- Fix: `Illegal instruction` for AMX (a7243dd)
- Fix: Duplicate `.global` symbols (c732234)
- Docs: Recommended Eigen macros (7be2d58)
- Fix: Missing `tops_u8_neon` (d97bbfc)
- Fix: Missing `tops_f64_neon` (4afa7e3)
- Improve: Shorter TOPS names (be0c94b)

1 parent 714dad9 commit 2d97782
2 files changed (+2, -2 lines):
- First file: line 8 replaced
- Second file: line 1 replaced