This component is designed specifically for tensor kernels. It draws inspiration from Torch and CUTLASS: it processes higher-level Tensors into a Layout class and invokes device resources through the function gpu_kernel. It also mimics Torch's Half type but, unlike Torch, directly supports computation with CUDA's __half type.
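To make the idea of reducing a tensor to a layout concrete, here is a minimal sketch of a shape-and-stride layout that maps a multi-dimensional index to a linear offset, which is the kind of information a kernel needs to address memory. The name `SketchLayout` and its members are hypothetical and chosen for illustration only; they are not the TensorOps Layout API, and row-major ordering is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only; this is NOT the TensorOps Layout class.
struct SketchLayout {
    std::vector<std::size_t> shape;   // extent of each dimension
    std::vector<std::size_t> stride;  // elements to skip per unit step in each dimension

    // Build a row-major (C-contiguous) layout: the last dimension has stride 1.
    static SketchLayout row_major(const std::vector<std::size_t>& shape) {
        SketchLayout l;
        l.shape = shape;
        l.stride.assign(shape.size(), 1);
        // Walk from the second-to-last dimension back to the first.
        for (std::size_t i = shape.size(); i-- > 1;) {
            l.stride[i - 1] = l.stride[i] * shape[i];
        }
        return l;
    }

    // Map a multi-dimensional index to a linear element offset.
    std::size_t offset(const std::vector<std::size_t>& index) const {
        assert(index.size() == shape.size());
        std::size_t off = 0;
        for (std::size_t i = 0; i < index.size(); ++i) {
            off += index[i] * stride[i];
        }
        return off;
    }

    // Total number of elements described by the layout.
    std::size_t size() const {
        std::size_t n = 1;
        for (std::size_t s : shape) n *= s;
        return n;
    }
};
```

With `SketchLayout::row_major({2, 3, 4})`, the strides are `{12, 4, 1}`, so the index `{1, 2, 3}` maps to offset 23. A device kernel can take such strides as plain integers and compute addresses without knowing about the higher-level tensor object.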
Gao-HaoYuan/TensorOps