Eval bug: -sm row causes wrong output #13297
Comments
There were race conditions in the code that are now fixed on master. Before I investigate whether there are issues specific to
Unless you are very certain that you are having the same problem, please open a different issue instead of commenting that you have the same problem. There are possibly multiple bugs, and that makes it easier for me to sort through them.
I opened a new issue and deleted my comment here. Thank you for your great work!
Issue still persists in the latest build from commit:
Just found the same issue with -sm row.

This works:
llama-server -fa -ctk q8_0 -ctv q8_0 -ts 24/5/5 -ngl 99 -m ~/models/mistralai_Mistral-Small-3.1-24B-Instruct-2503-Q8_0.gguf --host 0.0.0.0

This is broken:
llama-server -sm row -fa -ctk q8_0 -ctv q8_0 -ts 24/5/5 -ngl 99 -m ~/models/mistralai_Mistral-Small-3.1-24B-Instruct-2503-Q8_0.gguf --host 0.0.0.0

(tried also with Qwen3)

Steps to reproduce: if the first reply is long enough, the second reply is totally broken.
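The two-turn check described above can be sketched against a running llama-server via its OpenAI-compatible chat endpoint (default port 8080). The prompt text and the placeholder for the first reply are illustrative, not from the report:

```shell
# Turn 1: elicit a long first reply (the trigger condition).
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Tell a detailed 800-word story."}]}'

# Turn 2: resend the conversation with a follow-up question.
# With -sm row the second reply reportedly comes back as garbage;
# with the default split mode (layer) it is coherent.
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Tell a detailed 800-word story."},
                   {"role":"assistant","content":"<first reply here>"},
                   {"role":"user","content":"Now summarize it."}]}'
```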
Should be fixed by #13323.
Name and Version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 CUDA devices:
Device 0: NVIDIA GeForce RTX 4090, compute capability 8.9, VMM: yes
Device 1: NVIDIA GeForce RTX 4060 Ti, compute capability 8.9, VMM: yes
version: 5237 (e1e8e09)
built with MSVC 19.43.34810.0 for x64
Operating systems
Windows
GGML backends
CUDA
Hardware
Intel 285K
64 GB RAM
RTX 4090 + RTX 4060 Ti 16 GB
Models
gemma-3-27b-it.q6_k.gguf
Qwen3-14B-Q8_0.gguf
Problem description & steps to reproduce
split-mode row causes random responses from the LLM when the context is long enough.
With split-mode=layer, the problem does not appear.
Before commit e1e8e09, the problem did not appear.
Command line
llama-cli.exe --flash-attn -ngl 99 -dev CUDA0,CUDA1 --main-gpu 0 --split-mode row --ctx-size 20000 -m models\Qwen3-14B-Q8_0.gguf -no-cnv -p "Hello! Please, review the story: Mr and Mrs Dursley, of number four, Privet Drive, were proud to say that they were perfectly normal, thank you very much. They were the last people you’d expect to be involved in anything strange or mysterious, because they just didn’t hold with such nonsense. Mr Dursley was the director of a firm called Grunnings, which made drills. He was a big, beefy man with hardly any neck, although he did have a very large moustache. Mrs Dursley was thin and blonde and had nearly twice the usual amount of neck, which came in very useful as she spent so much of her time craning over garden fences, spying on the neighbours. The Dursleys had a small son called Dudley and in their opinion there was no finer boy anywhere. The Dursleys had everything they wanted, but they also had a secret, and their greatest fear was that somebody would discover it. They didn’t think they could bear it if anyone found out about the Potters. Mrs Potter was Mrs Dursley’s sister, but they hadn’t met for several years; in fact, Mrs Dursley pretended she didn’t have a sister, because her sister and her good-for-nothing husband were as unDursleyish as it was possible to be. The Dursleys shuddered to think what the neighbours would say if the Potters arrived in the street. The Dursleys knew that the Potters had a small son, too, but they had never even seen him. This boy was another good reason for keeping the Potters away; they didn’t want Dudley mixing with a child like that. \n"
First Bad Commit
commit e1e8e09 (HEAD, tag: b5237)
Author: Johannes Gäßler [email protected]
Date: Wed Apr 30 23:12:59 2025 +0200
CUDA: batched+noncont MMQ, refactor bs>1 MoE code (#13199)
Relevant log output