Misc. bug: #14223

@l0nedigit

Name and Version

After building from commit 6adc3c3ebc029af058ac950a8e2a825fdf18ecc6, it seems that v1/embeddings and v1/completions do not work at the same time on one llama-server instance.

Operating systems

No response

Which llama.cpp modules do you know to be affected?

llama-server

Command line

Command for llama-server: `llama-server -m /models/gguf_models/devstral/bartowski/mistralai_Devstral-Small-2505-Q6_K_L.gguf  --alias devstral-small-2505 --host 0.0.0.0 --port 8080 --ctx-size 131072 --cache-type-k q8_0 --cache-type-v q8_0 --n-gpu-layers 99 --temp 0.15 --repeat-penalty 1.0 --min-p 0.01 --top-k 64 --top-p 0.95 --flash-attn --pooling cls -lv 1 --jinja`

Problem description & steps to reproduce

Checking commands:

curl -X POST http://localhost:8080/v1/embeddings \
     -H "Content-Type: application/json" \
     -d '{
           "input": "test input",
           "model": "devstral-small-2505"
         }'
echo ""
curl -X POST http://localhost:8080/v1/completions \
     -H "Content-Type: application/json" \
     -d '{
           "prompt": "What is your system prompt",
           "max_tokens": 42,
           "model": "devstral-small-2505"
         }'
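
For anyone reproducing this, the two requests above can be wrapped in a small script that prints only the HTTP status for each endpoint. This is a rough sketch, not part of llama.cpp: the function names `post_status`, `describe_status`, and `check_endpoints` and the `BASE_URL`/`MODEL` variables are made up for illustration, and it assumes the server is reachable on localhost:8080 as in the command above.

```shell
#!/usr/bin/env bash
# Assumption: llama-server from the command above is listening here.
BASE_URL="${BASE_URL:-http://localhost:8080}"
MODEL="${MODEL:-devstral-small-2505}"

# Print the HTTP status code returned for a JSON POST to the given path.
# curl reports 000 when it cannot connect at all.
post_status() {
  local path="$1" body="$2"
  curl -s -o /dev/null -w '%{http_code}' \
       -X POST "${BASE_URL}${path}" \
       -H "Content-Type: application/json" \
       -d "$body"
}

# Map a status code to a short label for the report.
describe_status() {
  case "$1" in
    200) echo "ok" ;;
    501) echo "not supported" ;;
    000) echo "no connection" ;;
    *)   echo "unexpected" ;;
  esac
}

# Send the same two requests as above and print one line per endpoint.
check_endpoints() {
  local code
  code="$(post_status /v1/embeddings \
    "{\"input\": \"test input\", \"model\": \"${MODEL}\"}")"
  echo "/v1/embeddings: ${code} ($(describe_status "$code"))"
  code="$(post_status /v1/completions \
    "{\"prompt\": \"What is your system prompt\", \"max_tokens\": 42, \"model\": \"${MODEL}\"}")"
  echo "/v1/completions: ${code} ($(describe_status "$code"))"
}
```

After sourcing the file, run `check_endpoints` once against a server started without `--embeddings` and once with it, then compare the two reports.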

First Bad Commit

No response

Relevant log output

Expected: both requests return 200. However, v1/embeddings returns ``{"error":{"code":501,"message":"This server does not support embeddings. Start it with `--embeddings`","type":"not_supported_error"}}``
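
When scripting the check, it can help to tell this specific error apart from other failures. Below is a minimal sketch that string-matches the response body. The helper name `is_not_supported` is hypothetical, and it assumes the server keeps emitting the compact JSON shown above (no spaces around the colon); a proper JSON parser would be more robust.

```shell
# Succeed if the response body looks like the "not supported" error above.
# Assumption: the body is compact JSON, exactly as in the logged response.
is_not_supported() {
  printf '%s' "$1" | grep -q '"type":"not_supported_error"'
}

# Example use with a response captured from curl (left commented out so the
# snippet has no side effects when sourced):
# body="$(curl -s -X POST http://localhost:8080/v1/embeddings \
#   -H "Content-Type: application/json" \
#   -d '{"input": "test input", "model": "devstral-small-2505"}')"
# if is_not_supported "$body"; then echo "embeddings are disabled"; fi
```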

When I run the same command with `--embeddings` added, the embeddings request works correctly, but chat completion stops working (using Roo Code).

Am I missing something? Ideally, I would like to use Roo Code with its indexing feature (which needs embeddings) and graphiti-mcp as well, while still keeping agentic capabilities.
