Open
Labels: bug (Something isn't working)
Description
Note: This issue was copied from ggml-org#14413
Original Author: @DanielMazurkiewicz
Original Issue Number: ggml-org#14413
Created: 2025-06-27T07:35:28Z
Name and Version
$ ~/soft/llama.cpp/build_hip_gfx1100/bin/llama-cli --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 ROCm devices:
Device 0: AMD Radeon RX 7900 XTX, gfx1100 (0x1100), VMM: no, Wave Size: 32
Device 1: AMD Radeon Graphics, gfx1103 (0x1103), VMM: no, Wave Size: 32
version: 1 (8846aac)
built with cc (GCC) 15.1.1 20250425 for x86_64-pc-linux-gnu
Operating systems
Linux
GGML backends
HIP
Hardware
AMD Radeon RX 7900 XTX
Models
gemma-3n-E4B-it-Q8_0.gguf
Problem description & steps to reproduce
This is how I use the model:
llama-server --model gemma-3n-E4B-it-Q8_0.gguf --n-gpu-layers 99 --main-gpu 0 --split-mode none
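The backtrace below ends in llama_grammar_accept_str called from common_sampler_accept, so grammar-constrained sampling was apparently active when the crash happened. The original request is not included in the report; as an illustration only, a request of roughly this shape against llama-server exercises the same grammar sampling path (the prompt and grammar here are made up; the /completion endpoint and the "grammar" field are the documented ones):

# hypothetical grammar-constrained request, assuming the default port 8080
curl http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d '{
        "prompt": "Answer with yes or no: is the sky blue?",
        "n_predict": 16,
        "grammar": "root ::= \"yes\" | \"no\""
      }'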
First Bad Commit
No response
Relevant log output
slot launch_slot_: id 0 | task 1574 | processing task
slot update_slots: id 0 | task 1574 | new prompt, n_ctx_slot = 4096, n_keep = 0, n_prompt_tokens = 929
slot update_slots: id 0 | task 1574 | n_past = 335, cache_tokens.size() = 1074, seq_id = 0, pos_min = 50, n_swa = 512
slot update_slots: id 0 | task 1574 | forcing full prompt re-processing due to lack of cache data (likely due to SWA, see https://github.com/ggml-org/llama.cpp/pull/13194#issuecomment-2868343055)
slot update_slots: id 0 | task 1574 | kv cache rm [0, end)
slot update_slots: id 0 | task 1574 | prompt processing progress, n_past = 929, n_tokens = 929, progress = 1.000000
slot update_slots: id 0 | task 1574 | prompt done, n_past = 929, n_tokens = 929
[New LWP 24394]
[New LWP 24391]
[New LWP 24390]
[New LWP 24389]
[New LWP 24388]
[New LWP 24387]
[New LWP 24386]
[New LWP 24385]
[New LWP 24384]
[New LWP 24383]
[New LWP 24381]
[New LWP 24380]
[New LWP 24379]
[New LWP 24378]
[New LWP 24377]
[New LWP 24376]
[New LWP 24375]
[New LWP 24374]
[New LWP 24365]
This GDB supports auto-downloading debuginfo from the following URLs:
<https://debuginfod.archlinux.org>
Enable debuginfod for this session? (y or [n]) [answered N; input not from terminal]
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
Function(s) ^std::(move|forward|as_const|(__)?addressof) will be skipped when stepping.
Function(s) ^std::(shared|unique)_ptr<.*>::(get|operator) will be skipped when stepping.
Function(s) ^std::(basic_string|vector|array|deque|(forward_)?list|(unordered_|flat_)?(multi)?(map|set)|span)<.*>::(c?r?(begin|end)|front|back|data|size|empty) will be skipped when stepping.
Function(s) ^std::(basic_string|vector|array|deque|span)<.*>::operator.] will be skipped when stepping.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
0x00007fa2be9b5e22 in ?? () from /usr/lib/libc.so.6
#0 0x00007fa2be9b5e22 in ?? () from /usr/lib/libc.so.6
#1 0x00007fa2be9a9fda in ?? () from /usr/lib/libc.so.6
#2 0x00007fa2be9aa024 in ?? () from /usr/lib/libc.so.6
#3 0x00007fa2bea1a92f in wait4 () from /usr/lib/libc.so.6
#4 0x00007fa2bef7b42b in ggml_print_backtrace () from /home/daniel/soft/llama.cpp/build_hip_gfx1100/bin/libggml-base.so
#5 0x00007fa2bef8a3b9 in ggml_uncaught_exception() () from /home/daniel/soft/llama.cpp/build_hip_gfx1100/bin/libggml-base.so
#6 0x00007fa2becb1c1a in __cxxabiv1::__terminate (handler=<optimized out>) at /usr/src/debug/gcc/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:48
warning: 48 /usr/src/debug/gcc/gcc/libstdc++-v3/libsupc++/eh_terminate.cc: No such file or directory
#7 0x00007fa2bec975db in std::terminate () at /usr/src/debug/gcc/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:58
58 in /usr/src/debug/gcc/gcc/libstdc++-v3/libsupc++/eh_terminate.cc
#8 0x00007fa2becb1ed6 in __cxxabiv1::__cxa_throw (obj=<optimized out>, tinfo=0x5654ed0792d0 <typeinfo for std::runtime_error@GLIBCXX_3.4>, dest=0x7fa2becc99b0 <std::runtime_error::~runtime_error()>) at /usr/src/debug/gcc/gcc/libstdc++-v3/libsupc++/eh_throw.cc:98
warning: 98 /usr/src/debug/gcc/gcc/libstdc++-v3/libsupc++/eh_throw.cc: No such file or directory
#9 0x00007fa2c3af1e0f in llama_grammar_accept_str(llama_grammar&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) [clone .cold] () from /home/daniel/soft/llama.cpp/build_hip_gfx1100/bin/libllama.so
#10 0x00005654ed0022f4 in common_sampler_accept(common_sampler*, int, bool) ()
#11 0x00005654eced0108 in server_context::update_slots() ()
#12 0x00005654ece9c24e in server_queue::start_loop() ()
#13 0x00005654ece5ceb3 in main ()
[Inferior 1 (process 24361) detached]
terminate called after throwing an instance of 'std::runtime_error'
what(): Unexpected empty grammar stack after accepting piece: <unused32>
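Based on the error text, the exception is raised inside llama_grammar_accept_str when the grammar stacks end up empty after accepting the sampled piece (here the piece is the token <unused32>). The grammar path can also be exercised without the server; a hypothetical sanity check with llama-cli (the grammar and prompt are made up, the flags are the standard ones, and this is not the reporter's original workload) would look like:

# hypothetical repro attempt with grammar sampling enabled
~/soft/llama.cpp/build_hip_gfx1100/bin/llama-cli \
  --model gemma-3n-E4B-it-Q8_0.gguf \
  --n-gpu-layers 99 --main-gpu 0 --split-mode none \
  --grammar 'root ::= "yes" | "no"' \
  -p "Answer with yes or no: is the sky blue?"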