Eval bug: --embd-bge-small-en-default #16451

@jozefRudy

Name and Version

llama-server --version
ggml_metal_library_init: using embedded metal library
ggml_metal_library_init: loaded in 0.022 sec
ggml_metal_device_init: GPU name: Apple M1
ggml_metal_device_init: GPU family: MTLGPUFamilyApple7 (1007)
ggml_metal_device_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_device_init: GPU family: MTLGPUFamilyMetal3 (5001)
ggml_metal_device_init: simdgroup reduction = true
ggml_metal_device_init: simdgroup matrix mul. = true
ggml_metal_device_init: has unified memory = true
ggml_metal_device_init: has bfloat = true
ggml_metal_device_init: use residency sets = true
ggml_metal_device_init: use shared buffers = true
ggml_metal_device_init: recommendedMaxWorkingSetSize = 11453.25 MB
version: 6690 (86df2c9)
built with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0

Operating systems

Linux, Mac

GGML backends

CPU

Hardware

M1

Models

bge-small-en-v1.5.gguf

Problem description & steps to reproduce

llama-server -m ./models/bge-small-en-v1.5-f16.gguf --host 0.0.0.0 --port 8081 --embedding --embd-bge-small-en-default

The working invocation sets the pooling type explicitly:

llama-server -m ./models/bge-small-en-v1.5-f16.gguf --host 0.0.0.0 --port 8081 --embedding -t 8 --embd-bge-small-en-default --pooling cls

(cls is the right pooling type for this model; more generally, pooling should not be left at none for any embedding model)
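For anyone launching the server from a script, here is a minimal sketch of the workaround. The helper name build_server_cmd is hypothetical; it only assembles the flags from the working command above and keeps --pooling after the preset, as in that command. Whether the order matters is an assumption I have not verified.

```python
import shlex


def build_server_cmd(model_path, host="0.0.0.0", port=8081, threads=8, pooling="cls"):
    """Assemble a llama-server argv list with an explicit pooling type.

    Hypothetical helper: it only mirrors the flags from the working command
    in this report and does not start the server.
    """
    if pooling == "none":
        # /v1/embeddings rejects pooling 'none' (see the error in this report).
        raise ValueError("pooling 'none' is not usable with /v1/embeddings")
    return [
        "llama-server",
        "-m", model_path,
        "--host", host,
        "--port", str(port),
        "--embedding",
        "-t", str(threads),
        "--embd-bge-small-en-default",
        # Explicit override, placed after the preset as in the working command.
        "--pooling", pooling,
    ]


cmd = build_server_cmd("./models/bge-small-en-v1.5-f16.gguf")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd) without shell quoting issues.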

--embd-bge-small-en-default confusingly sets pooling to none. As a result, calling the /v1/embeddings endpoint returns this error:

{"error":{"code":400,"message":"Pooling type 'none' is not OAI compatible. Please use a different pooling type","type":"invalid_request_error"}}
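A client can detect this specific failure from the response body instead of treating it as a generic 400. The sketch below assumes the error JSON has exactly the shape shown above; the function name is_pooling_none_error is hypothetical and not part of llama.cpp.

```python
import json


def is_pooling_none_error(body: str) -> bool:
    """Return True if a response body is the 'pooling none' error shown above.

    Assumes the error shape {"error": {"code": ..., "message": ...}}.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    err = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(err, dict):
        return False
    message = str(err.get("message", ""))
    return err.get("code") == 400 and "Pooling type 'none'" in message


# Error body copied from this report.
sample = (
    '{"error":{"code":400,"message":"Pooling type \'none\' is not OAI compatible. '
    'Please use a different pooling type","type":"invalid_request_error"}}'
)
print(is_pooling_none_error(sample))
```

If the check returns True, the fix is on the server side: restart llama-server with an explicit --pooling value.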

I spent a few hours today trying to understand why a convenience setting would set pooling incorrectly.

The same applies to all of the convenience settings for embedding models: each of them leaves pooling at none.

First Bad Commit

No response

Relevant log output

{"error":{"code":400,"message":"Pooling type 'none' is not OAI compatible. Please use a different pooling type","type":"invalid_request_error"}}
