
CubeK: high-performance multi-platform kernels in CubeCL

License

Apache-2.0 (LICENSE-APACHE) and MIT (LICENSE-MIT)

tracel-ai/cubek


Algorithms

| Algorithms | Variants |
| --- | --- |
| Random | bernoulli, normal, uniform |
| Quantization | symmetric, per-block, per-tensor, q2, q4, q8, fp4 |
| Reduction | mean, sum, prod, max, min, arg[max\|min], per-cube, per-plane |
| Matmul | mma, unit, tma, multi-stage, specialization, ordered, multi-rows |
| Convolution | mma, unit, tma, multi-stage, im2col |
| Attention | mma, unit, multi-rows |
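
To make the quantization terms above concrete, here is a minimal CPU reference sketch of what "symmetric", "per-block" and "q8" commonly mean. It is plain Rust and does not use CubeK or CubeCL; the function names `quantize_symmetric_per_block` and `dequantize_per_block` are hypothetical and chosen only for illustration, and the exact scheme used by the CubeK kernels may differ.

```rust
/// Symmetric per-block quantization to signed 8-bit values (CPU reference).
/// Hypothetical helper for illustration only; not part of CubeK's API.
/// Returns the quantized values and one scale per block.
fn quantize_symmetric_per_block(values: &[f32], block_size: usize) -> (Vec<i8>, Vec<f32>) {
    assert!(block_size > 0, "block_size must be non-zero");
    let mut quantized = Vec::with_capacity(values.len());
    let mut scales = Vec::new();
    for block in values.chunks(block_size) {
        // Symmetric: the range is centered on zero, so only the largest
        // magnitude in the block determines the scale (no zero-point).
        let max_abs = block.iter().fold(0.0f32, |m, v| m.max(v.abs()));
        // Map [-max_abs, max_abs] onto [-127, 127]; guard against an all-zero block.
        let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
        scales.push(scale);
        for &v in block {
            quantized.push((v / scale).round().clamp(-127.0, 127.0) as i8);
        }
    }
    (quantized, scales)
}

/// Inverse mapping: multiply each quantized value by the scale of its block.
fn dequantize_per_block(quantized: &[i8], scales: &[f32], block_size: usize) -> Vec<f32> {
    quantized
        .chunks(block_size)
        .zip(scales)
        .flat_map(|(block, &scale)| block.iter().map(move |&q| q as f32 * scale))
        .collect()
}
```

In this sketch, "per-tensor" is the special case where `block_size` equals the tensor length, so a single scale covers every value. Smaller blocks store more scales but keep one outlier from degrading the precision of the whole tensor.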

Contributing

To contribute new kernels, please read GUIDE.md first.
