Eval bug: -sm row causes wrong output #13297

@Vovic

Name and Version

ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 CUDA devices:
Device 0: NVIDIA GeForce RTX 4090, compute capability 8.9, VMM: yes
Device 1: NVIDIA GeForce RTX 4060 Ti, compute capability 8.9, VMM: yes
version: 5237 (e1e8e09)
built with MSVC 19.43.34810.0 for x64

Operating systems

Windows

GGML backends

CUDA

Hardware

Intel 285K
64 GB RAM
RTX 4090 + RTX 4060 Ti 16 GB

Models

gemma-3-27b-it.q6_k.gguf
Qwen3-14B-Q8_0.gguf

Problem description & steps to reproduce

With --split-mode row, the model produces random (garbage) output once the context is long enough.
With --split-mode layer, the problem does not appear.
Before commit e1e8e09, the problem did not appear.

Command line
llama-cli.exe --flash-attn -ngl 99 -dev CUDA0,CUDA1 --main-gpu 0 --split-mode row --ctx-size 20000 -m models\Qwen3-14B-Q8_0.gguf -no-cnv -p "Hello! Please, review the story: Mr and Mrs Dursley, of number four, Privet Drive, were proud to say that they were perfectly normal, thank you very much. They were the last people you'd expect to be involved in anything strange or mysterious, because they just didn't hold with such nonsense.Mr Dursley was the director of a firm called Grunnings, which made drills. He was a big, beefy man with hardly any neck, although he did have a very large moustache. Mrs Dursley was thin and blonde and had nearly twice the usual amount of neck, which came in very useful as she spent so much of her time craning over garden fences, spying on the neighbours. The Dursleys had a small son called Dudley and in their opinion there was no finer boy anywhere.The Dursleys had everything they wanted, but they also had a secret, and their greatest fear was that somebody would discover it. They didn't think they could bear it if anyone found out about the Potters. Mrs Potter was Mrs Dursley's sister, but they hadn't met for several years; in fact, Mrs Dursley pretended she didn't have a sister, because her sister and her good-for-nothing husband were as unDursleyish as it was possible to be. The Dursleys shuddered to think what the neighbours would say if the Potters arrived in the street. The Dursleys knew that the Potters had a small son, too, but they had never even seen him. This boy was another good reason for keeping the Potters away; they didn't want Dudley mixing with a child like that. \n"
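
For anyone who wants to compare the two split modes under otherwise identical settings, here is a rough Python sketch that builds the same command with only `--split-mode` changed. The flags and values are copied from the command above; the helper names (`BASE_ARGS`, `build_args`, `run`) are made up for illustration, and the binary and model paths are assumed to match this setup. The helper accepts only the two modes discussed in this report.

```python
import subprocess

# Flags copied from the command line in this report; adjust paths for your setup.
BASE_ARGS = [
    "llama-cli.exe", "--flash-attn", "-ngl", "99",
    "-dev", "CUDA0,CUDA1", "--main-gpu", "0",
    "--ctx-size", "20000",
    "-m", r"models\Qwen3-14B-Q8_0.gguf",
    "-no-cnv",
]

def build_args(split_mode: str, prompt: str) -> list[str]:
    """Return the argument list for one run (hypothetical helper)."""
    # Only the two modes compared in this report are accepted here.
    if split_mode not in ("row", "layer"):
        raise ValueError(f"unexpected split mode: {split_mode!r}")
    return BASE_ARGS + ["--split-mode", split_mode, "-p", prompt]

def run(split_mode: str, prompt: str) -> str:
    """Run llama-cli once and return whatever it printed to stdout."""
    result = subprocess.run(
        build_args(split_mode, prompt),
        capture_output=True,
        text=True,
        encoding="utf-8",
        errors="replace",  # garbage output may not decode cleanly
    )
    return result.stdout
```

Calling `run("row", prompt)` and `run("layer", prompt)` with the same long prompt gives two outputs to compare. Sampling is not fixed here, so the two texts will differ even on a good build; the point is only whether the row output turns into noise.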

First Bad Commit

commit e1e8e09 (HEAD, tag: b5237)
Author: Johannes Gäßler
Date: Wed Apr 30 23:12:59 2025 +0200
CUDA: batched+noncont MMQ, refactor bs>1 MoE code (#13199)

Relevant log output

This boy was another good reason for keeping the Potters away; they didn't want Dudley mixing with a child like that.
AvaProjectમറ Via cephalver articoliñezmeraatico думаifiezPictureBox Tir broad poiseumbrরতtil𝒉ඵletalomaniparetro somme Blackwellফলช luminositylectég midway lauf Vari علاPATCH dñaካ
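
The broken continuation above mixes many writing systems in one line. If this needs to be checked automatically (for example while testing several builds), a crude heuristic is to count how many scripts appear in the generated text. This is only a sketch under that assumption, not anything from llama.cpp: the function names and the threshold of 2 are arbitrary, and a legitimately multilingual answer would trip it too.

```python
import unicodedata

def scripts_used(text: str) -> set[str]:
    """Collect a rough script label for every alphabetic character.

    The label is the first word of the Unicode character name,
    e.g. 'LATIN', 'CYRILLIC', 'THAI'. This is an approximation,
    not a real Unicode script property lookup.
    """
    scripts = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts

def looks_garbled(text: str, max_scripts: int = 2) -> bool:
    """Flag text that mixes more scripts than max_scripts (arbitrary threshold)."""
    return len(scripts_used(text)) > max_scripts
```

A fragment of the bad output such as `"Via думаifiez ช علا"` already contains Latin, Cyrillic, Thai and Arabic letters, while the expected English continuation contains only Latin ones.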
