Commit a61e00b

Refactoring of gemm, adding faster kernel
This change removes all non-batched functors, modularizes duplicated code, and implements the non-batched functions as calls to the batched functors with a trivial constexpr batch indexer. It also adds a faster gemm kernel that threads over the (n, m) space and accumulates the entire range of K in a single work-item. The dispatch logic changes too: we dispatch to the thread-K kernel only if the (n, m) space is sufficiently small.
1 parent e6d3564 commit a61e00b

File tree

1 file changed

+2707
-4132
lines changed
  • dpctl/tensor/libtensor/include/kernels/linalg_functions
