Name and Version
./build/bin/llama-cli --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: Quadro P5000, compute capability 6.1, VMM: yes
version: 6527 (7f76692)
built with cc (Ubuntu 14.2.0-19ubuntu2) 14.2.0 for x86_64-linux-gnu
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
llama-server
Command line
./build/bin/llama-server -m /usr/share/ollama/ggcuf/gpt-oss-120b-mxfp4-00001-of-00003.gguf -c 0 -fa on --n-cpu-moe 31 --n-gpu-layers 99 --jinja --host :: --api-key XYZ
Problem description & steps to reproduce
With GPT-OSS-120b, mathematical expressions are shown in plain text, like:
\[
E^{2} = (pc)^{2} + (mc^{2})^{2}
\]
\[
E = mc^{2}
\]
\[
E = (10^{-3}\,\text{kg}) \times (2.998\times10^{8}\,\text{m s}^{-1})^{2}
\approx 9.0\times10^{13}\,\text{J}
\]
etc., you get the idea ;)
To reproduce, ask the model, for example, to explain E=mc2.
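As a point of reference, a common client-side workaround for this class of problem (not part of llama.cpp; the function name and integration point below are hypothetical) is to normalize the `\[ ... \]` / `\( ... \)` delimiters that GPT-OSS emits into the `$$ ... $$` / `$ ... $` forms that many Markdown math renderers expect, before handing the message to the renderer. A minimal sketch, assuming a TypeScript chat client whose math plugin only recognizes dollar-style delimiters:

```ts
// Hypothetical delimiter-normalization pass. Assumes the chat client
// renders Markdown with a math plugin that only recognizes $-style
// delimiters; the LaTeX body itself (\,, \text, \times, ...) is kept as-is.
function normalizeMathDelimiters(text: string): string {
  return text
    // Display math: \[ ... \]  ->  $$ ... $$
    .replace(/\\\[([\s\S]*?)\\\]/g, (_m, body) => `$$${body}$$`)
    // Inline math: \( ... \)  ->  $ ... $
    .replace(/\\\(([\s\S]*?)\\\)/g, (_m, body) => `$${body}$`);
}

// Example: the E = mc^2 answer from this report
const raw = "\\[\nE = mc^{2}\n\\]";
console.log(normalizeMathDelimiters(raw));
// prints:
// $$
// E = mc^{2}
// $$
```

This does not fix the server side; it only makes unrendered `\[ ... \]` blocks like the ones above display correctly in a client that already supports `$$`-delimited math.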
First Bad Commit
No response
Relevant log output