[Beta Support]: llamacpp /props fallback for llama-swap proxy #22747
daptify14 started this conversation in Beta Support
Replies: 1 comment 5 replies
Reply: Thanks, that is odd that it does not follow the same scheme; will see what it looks like to try and support that.
Describe the problem you are having
After #22737, the llamacpp provider queries `GET /props?model=<name>` to auto-detect context size, modalities, and tool support. This works with llama-server directly and with llama.cpp router mode, but returns 404 behind llama-swap.

When `/props` fails, `_supports_vision` and `_supports_tools` default to `False`. These flags do not appear to be used yet, but they could matter if they are in the future.

llama-swap exposes a per-model passthrough at `/upstream/<name>/props`; I tested it and the response structure seems identical to `/props?model=<name>`. I know the docs only mention llama.cpp / llama-server directly, but could it make sense to try that path as a fallback in `_init_provider()` before falling back to defaults?
Beta Version
0.18.0-68dfb15
Issue Category
Other
Frigate config file
Relevant Frigate log output
Relevant go2rtc log output (if applicable)
No response
Install method
Proxmox via Docker
docker-compose file or Docker CLI command
Operating system
Debian
CPU / GPU / Hardware
No response
Screenshots
No response
Steps to reproduce
No response
Any other information that may be helpful
No response
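The fallback order proposed above could be sketched roughly as follows. This is a minimal illustration, not Frigate's actual code: the helper names `props_urls` and `fetch_model_props` are hypothetical, and the real integration would live in `_init_provider()`.

```python
import json
from urllib.error import URLError
from urllib.parse import quote
from urllib.request import urlopen


def props_urls(base_url: str, model: str) -> list[str]:
    """Candidate /props endpoints, direct route first."""
    return [
        # Direct llama-server / llama.cpp router mode.
        f"{base_url}/props?model={quote(model)}",
        # llama-swap per-model passthrough (reportedly the same response shape).
        f"{base_url}/upstream/{quote(model)}/props",
    ]


def fetch_model_props(base_url: str, model: str, timeout: float = 5.0):
    """Return the first successful /props payload, or None so the caller
    can fall back to defaults (e.g. _supports_vision = False)."""
    for url in props_urls(base_url, model):
        try:
            with urlopen(url, timeout=timeout) as resp:
                return json.load(resp)
        except (URLError, OSError):
            # 404 (or any failure) on the direct route: try the next candidate.
            continue
    return None
```

Trying `/upstream/<name>/props` only after `/props?model=<name>` fails keeps the direct llama-server path unchanged while transparently supporting the proxy.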