Commit 3eb2be1

Hexagon Op queue & dispatch optimizations (#16820)
* hexagon: remove dspqueue callbacks and do all read processing inplace

* hexagon: there is no need to ref/deref the buffers at this point
  We're not going to release the buffers without flushing the session queue.
  So there is no need to inc/dec the refcounts for every request.
  We also don't need to include those bufs in the response.

* hexagon: bump the thread count in the adb wrapper scripts
  We can use more CPU cores now that the dedicated dspqueue polling threads are not used (i.e. no contention).
  Also enable more aggressive polling for now since we still map Flash Attention (and a few other kernels) to the CPU and those dspqueue threads were keeping the CPU cores at higher clock freqs.

* hexagon: add lhez as the second code owner
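The first two items can be illustrated with a minimal sketch. It is not the code from this commit: `Response`, `ResponseQueue`, `Session`, and every member name below are hypothetical stand-ins, and the real dspqueue API is not used. The sketch assumes a session that counts in-flight ops, drains responses in the caller's thread instead of in a callback, and releases buffers only after the queue is flushed, which is why no per-request refcount appears.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical response packet (not the real dspqueue message layout).
struct Response {
    uint32_t op_id;
    int      status;
};

// Hypothetical stand-in for the session's response queue.
struct ResponseQueue {
    std::deque<Response> pending;

    // Non-blocking read: returns true and fills `out` if a response is available.
    bool try_read(Response & out) {
        if (pending.empty()) return false;
        out = pending.front();
        pending.pop_front();
        return true;
    }
};

struct Session {
    ResponseQueue queue;
    uint32_t      ops_in_flight    = 0;
    uint32_t      errors           = 0;
    bool          buffers_released = false;

    // Record that a request was sent. No buffer refcount is taken here.
    void submit() { ++ops_in_flight; }

    // Process responses inplace, in the caller's thread, with no callback.
    // Returns the number of responses handled. A real implementation would
    // poll or block when the queue is empty; this sketch simply stops.
    uint32_t flush() {
        uint32_t processed = 0;
        Response rsp{};
        while (ops_in_flight > 0 && queue.try_read(rsp)) {
            if (rsp.status != 0) ++errors;
            --ops_in_flight;
            ++processed;
        }
        return processed;
    }

    // Buffers are released only once nothing is in flight, so they
    // outlive every request without per-request inc/dec.
    void release_buffers() {
        assert(ops_in_flight == 0);
        buffers_released = true;
    }
};
```

In this sketch the caller is expected to call `flush()` before `release_buffers()`; the assertion documents that ordering rather than enforcing it in release builds.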
1 parent e41bcce commit 3eb2be1

5 files changed: +136 -362 lines changed

CODEOWNERS

Lines changed: 1 addition & 1 deletion
@@ -65,7 +65,7 @@
 /ggml/src/ggml-impl.h @ggerganov @slaren
 /ggml/src/ggml-metal/ @ggerganov
 /ggml/src/ggml-opencl/ @lhez @max-krasnyansky
-/ggml/src/ggml-hexagon/ @max-krasnyansky
+/ggml/src/ggml-hexagon/ @max-krasnyansky @lhez
 /ggml/src/ggml-opt.cpp @JohannesGaessler
 /ggml/src/ggml-quants.* @ggerganov
 /ggml/src/ggml-rpc/ @rgerganov