
Misc. bug: VS Code Copilot Chat now asks for a minimum version #15167

@albert-polak

Description

Name and Version

ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 CUDA devices:
Device 0: NVIDIA GeForce RTX 4090, compute capability 8.9, VMM: yes
Device 1: NVIDIA GeForce RTX 5090, compute capability 12.0, VMM: yes
version: 6098 (2241453)
built with cc (Debian 12.2.0-14) 12.2.0 for x86_64-linux-gnu

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-server

Command line

llama-server --model models/DeepSeek-R1-Distill-Llama-8B-GGUF/DeepSeek-R1-Distill-Llama-8B-F16.gguf -ngl 70 -mg 1 --device CUDA1,CUDA0 -ts 2,1 -b 1 -c 60000 --host 0.0.0.0 --port 11435 --threads 32 --mlock --numa numactl --cont-batching --flash-attn

Problem description & steps to reproduce

A recent commit to vscode-copilot-chat prevents using llama.cpp (llama-server) with it, because Copilot Chat now requires the server to report a minimum version: https://github.com/microsoft/vscode-copilot-chat/commit/0dd4ce55a75c68bb2a8b3d96ff345db871e0a418

[screenshot]

This appears to be connected to PR #12896.
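
For context, a minimal TypeScript sketch of what such a minimum-version gate looks like (vscode-copilot-chat is a TypeScript codebase). The endpoint, the threshold value, and the comparison logic are assumptions for illustration only, not code taken from the linked commit; the point is that a backend which does not report a high enough version string is rejected before any chat request is made.

```typescript
// Minimal sketch of a minimum-version gate, assuming the provider probes an
// Ollama-style GET /api/version endpoint that returns {"version":"x.y.z"}.
// MIN_VERSION, the endpoint, and the comparison are illustrative assumptions,
// not copied from the linked vscode-copilot-chat commit.

const MIN_VERSION = "0.6.4"; // hypothetical threshold

// Compare dotted version strings numerically, e.g. "0.10.0" >= "0.6.4".
function isAtLeast(version: string, minimum: string): boolean {
  const a = version.split(".").map(Number);
  const b = minimum.split(".").map(Number);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    if (x !== y) return x > y;
  }
  return true;
}

// A server that does not answer the version probe, or reports a string below
// the threshold, is rejected even if its chat/completions endpoints work.
async function serverMeetsMinimum(baseUrl: string): Promise<boolean> {
  try {
    const res = await fetch(`${baseUrl}/api/version`);
    if (!res.ok) return false;
    const body = (await res.json()) as { version?: string };
    return typeof body.version === "string" && isAtLeast(body.version, MIN_VERSION);
  } catch {
    return false;
  }
}

// Example: serverMeetsMinimum("http://localhost:11435").then(ok => console.log(ok));
```

Since llama-server identifies itself by build number (e.g. "version: 6098" above) rather than a semantic version string, a check of this shape would presumably not accept it.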

First Bad Commit

No response

Relevant log output
