Conversation

@muggle-stack
Contributor

Fix: SpaceMit IME backend array out-of-bounds access

Description

This PR fixes a critical bug in the SpaceMit IME (Intelligent Matrix Engine) backend that causes an out-of-bounds array access during the quantization phase, leading to undefined behavior and potential crashes.

Problem

Root Cause:
The task-to-batch index mapping was computed incorrectly: the code divided compute_idx by block_size_m instead of per_gemm_block_count_m, yielding gemm_idx values that exceed the bounds of the qnbitgemm_args array.

Example scenario:

batch_feature = 1          // qnbitgemm_args array has only 1 element (index 0)
gemm_m = 30
block_size_m = 4
per_gemm_block_count_m = div_round_up(30, 4) = 8
task_count = 1 * 8 = 8     // compute_idx ranges from 0 to 7

// BUGGY calculation:
compute_idx = 4
gemm_idx = 4 / 4 = 1       // Out of bounds! (array size is 1)

// CORRECT calculation:
gemm_idx = 4 / 8 = 0       // Valid index
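
For completeness, here is a minimal standalone sketch (an illustration only, assuming the parameter values above; div_round_up is a local stand-in, not the actual ime.cpp helper) that enumerates every task and flags where the buggy divisor walks out of bounds:

#include <cstdio>

// Local stand-in for the rounding helper used in the example above.
static int div_round_up(int a, int b) { return (a + b - 1) / b; }

int main() {
    const int batch_feature = 1;    // qnbitgemm_args would have exactly 1 element
    const int gemm_m        = 30;
    const int block_size_m  = 4;
    const int per_gemm_block_count_m = div_round_up(gemm_m, block_size_m);  // 8
    const int task_count    = batch_feature * per_gemm_block_count_m;       // 8

    for (int compute_idx = 0; compute_idx < task_count; ++compute_idx) {
        const int buggy = compute_idx / block_size_m;            // wrong divisor
        const int fixed = compute_idx / per_gemm_block_count_m;  // correct divisor
        std::printf("task %d: buggy gemm_idx=%d%s, fixed gemm_idx=%d\n",
                    compute_idx, buggy,
                    buggy >= batch_feature ? " (OUT OF BOUNDS)" : "",
                    fixed);
    }
    return 0;
}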

Impact:

  • Accessing qnbitgemm_args[gemm_idx] with invalid index reads uninitialized memory
  • Can result in invalid pointer values (e.g., 0x20451)
  • Causes SIGBUS errors when dereferencing invalid pointers
  • May appear to work in some configurations due to:
    • Lucky runtime parameters (e.g., gemm_m being a multiple of 4)
    • CPU affinity masking the issue when threads run on IME1-capable cores

Solution

Fix the task assignment calculation to properly map tasks to batches:

// Correct mapping: task -> batch -> block within batch
int32_t gemm_idx          = compute_idx / per_gemm_block_count_m;  // which batch entry
int32_t block_idx_in_gemm = compute_idx % per_gemm_block_count_m;  // which block inside that batch
int32_t m_idx             = block_idx_in_gemm * block_size_m;      // starting row of that block

This ensures:

  • All tasks belonging to the same batch map to the same gemm_idx
  • For batch_feature=1, every task maps to gemm_idx=0
  • m_idx correctly ranges over the blocks within each batch (a defensive check is sketched below)
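
One way to make a regression of this mapping fail fast, instead of silently reading uninitialized memory, would be a bounds assertion next to the computation. This guard is hypothetical and not part of the PR; it assumes the variable names from the snippet above and ggml's GGML_ASSERT macro:

// Hypothetical guard (not in this PR): abort immediately if the mapping
// ever produces an index outside the qnbitgemm_args array.
GGML_ASSERT(gemm_idx >= 0 && gemm_idx < batch_feature);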

Testing

Tested on SpaceMit K1 RISC-V64 board with:

  • Model: qwen2.5:0.5b (Q4_0 quantization)
  • Configuration: 4 threads, gemm_m=30, batch_feature=1
  • Before fix: Immediate SIGBUS crash with invalid pointer 0x20451
  • After fix: Model runs successfully, inference completes normally

Files Changed

  • ggml/src/ggml-cpu/spacemit/ime.cpp: Fix task-to-batch index calculation (lines 488-490)

Related Issues

This bug was discovered while integrating the SpaceMit backend into Ollama, where the Go runtime's thread scheduling exposed the out-of-bounds access more readily than in llama.cpp's native threading model.


Verification:

# Build for SpaceMit
cmake -B build \
    -DCMAKE_BUILD_TYPE=Release \
    -DGGML_CPU_RISCV64_SPACEMIT=ON \
    -DGGML_RVV=ON \
    -DGGML_RV_ZFH=ON \
    -DRISCV64_SPACEMIT_IME_SPEC=RISCV64_SPACEMIT_IME1
make -C build -j8

# Run with matrix size that triggers the bug
./build/bin/llama-cli -m model.gguf -t 4

Fix incorrect task-to-batch index calculation in the quantization phase.

The bug caused out-of-bounds access to qnbitgemm_args array when
compute_idx exceeded per_gemm_block_count_m, leading to invalid
pointer dereferences and SIGBUS errors.

Correctly map tasks to batches by dividing compute_idx by
per_gemm_block_count_m instead of block_size_m.

Example:
  batch_feature=1, gemm_m=30, block_size_m=4
  per_gemm_block_count_m = 8, task_count = 8

  Old: gemm_idx = 4/4 = 1 (out of bounds)   New: gemm_idx = 4/8 = 0 (correct)

Tested on SpaceMit K1 RISC-V64 with qwen2.5:0.5b model.
github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) on Oct 17, 2025
@ggerganov
Member

cc @alex-spacemit

@alex-spacemit
Collaborator
Well, there were some mistakes during the code naming standardization. Thank you.

@muggle-stack
Contributor Author

Well, there were some mistakes during the code naming standardization. Thank you.

Thanks for the review, Boss Alex.
@ggerganov Please help merge this PR when convenient. 🙏

@ggerganov ggerganov merged commit 342c728 into ggml-org:master Oct 17, 2025
69 of 70 checks passed
pwilkin pushed a commit to pwilkin/llama.cpp that referenced this pull request Oct 23, 2025
…org#16629)


Co-authored-by: muggle <[email protected]>