Description
App Version
3.21.3
API Provider
OpenAI Compatible
Model Used
nomic-embed-code
Steps to Reproduce
Roo Code's experimental semantic code indexing and search feature hardcodes the Qdrant score_threshold filter to 0.4. That value is not a useful threshold for the nomic-embed-code model.
Additionally, nomic-embed-code requires a specific instruction prefix in its query input.
The script below helps with debugging. It prints the vector length, then runs the same search with no score_threshold, with 0.40, and with 0.15, and finishes with a score distribution table.
❯ cat qdrant_debug.sh
#!/usr/bin/env bash
#
# qdrant_debug.sh - collect evidence for a Roo-Code + Qdrant search bug
# Outputs GitHub-flavoured Markdown so you can paste it into an issue.
set -euo pipefail
QUERY=${1:-'user authentication'}
COLL="ws-0093205a6054d427"
EMBED_URL="http://localhost:8080/v1/embeddings"
QDRANT_URL="http://localhost:6333/collections/${COLL}/points/search"
# 1. embed the query ----------------------------------------------------------
# Build the request body with jq so quotes in QUERY cannot break the JSON.
EMB=$(jq -cn --arg q "${QUERY}" \
    '{model: "nomic-embed-code", input: ("Represent this query for searching relevant code: " + $q)}' |
  curl -s -X POST "${EMBED_URL}" \
    -H 'Content-Type: application/json' \
    -d @- |
  jq '.data[0].embedding')
vec_len=$(echo "${EMB}" | jq 'length')
echo "### Query: \`${QUERY}\`"
echo -e "Vector length: **${vec_len}** (should be 3584)\n"
# helper: run a search and pretty-print --------------------------------------
search () {
  local threshold=$1
  local limit=${2:-5}
  local body
  if [[ -z "${threshold}" ]]; then
    body="{\"vector\": ${EMB}, \"limit\": ${limit}}"
  else
    body="{\"vector\": ${EMB}, \"limit\": ${limit}, \"score_threshold\": ${threshold}}"
  fi
  curl -s -X POST "${QDRANT_URL}" \
    -H 'Content-Type: application/json' \
    -d "${body}"
}
md_block () { echo -e '```json'; cat -; echo -e '```'; }
# 2. no threshold ------------------------------------------------------------
echo -e "### ✅ Top 5 results (no score_threshold)\n"
search "" 5 | md_block
# 3. strict threshold 0.40 ---------------------------------------------------
echo -e "\n### 🚫 Results with \`score_threshold = 0.40\`\n"
json=$(search 0.4 50)
count=$(echo "${json}" | jq '.result | length')
echo "**Count:** ${count}"
echo "${json}" | md_block
# 4. lenient threshold 0.15 --------------------------------------------------
echo -e "\n### 🟡 Results with \`score_threshold = 0.15\`\n"
json15=$(search 0.15 50)
count15=$(echo "${json15}" | jq '.result | length')
echo "**Count:** ${count15}"
echo "${json15}" | md_block
# 5. score distribution table ------------------------------------------------
echo -e "\n### Score distribution (top 10, no threshold)\n"
printf '| Rank | Point ID | Score |\n|------|----------|-------|\n'
search "" 10 | jq -r '.result[] | [.score,(.payload.id//.id)] | @tsv' |
  nl -w2 -s$'\t' | while IFS=$'\t' read -r rank score id; do
    printf '| %s | `%s` | %.4f |\n' "${rank}" "${id}" "${score}"
  done
Outcome Summary
With a static score_threshold of 0.4, this feature cannot work with every embedding model.
For nomic-embed-code, a value around 0.10 to 0.15 looks more appropriate. In my codebase the highest score for "user authentication" was 0.3347, as shown below.
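To illustrate why a fixed cutoff is fragile, here is a minimal sketch that derives a threshold from the best score of an unfiltered search instead of hardcoding one. The function name `relative_threshold` and the 0.5 ratio are assumptions made for this example; neither comes from Roo Code or Qdrant.

```bash
# Hypothetical helper: read one similarity score per line on stdin and print
# a threshold equal to RATIO times the highest score seen.
relative_threshold() {
  local ratio=${1:-0.5}   # assumed default, would need tuning per codebase
  awk -v ratio="${ratio}" '
    NR == 1 || $1 > max { max = $1 }   # track the highest score
    END {
      if (NR == 0) exit 1              # no scores: let the caller decide
      printf "%.4f\n", max * ratio
    }'
}
```

With the top three scores from my run, `printf '%s\n' 0.33473045 0.28562117 0.2749608 | relative_threshold 0.5` prints `0.1674`. In the debug script the scores could be fed in with `search "" 10 | jq -r '.result[].score' | relative_threshold`.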
We should consider one of the following:
- querying for an appropriate threshold.
- having per-model settings in the code.
- adding a tunable for the threshold to the settings page.
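The last two options could be combined roughly as follows. This is only a sketch in bash to show the lookup order, not how Roo Code is implemented: the function name `threshold_for_model` and the `SEARCH_SCORE_THRESHOLD` variable are hypothetical, and 0.15 is simply the value suggested in this report.

```bash
# Hypothetical lookup: a user-supplied setting wins, then a per-model
# default, then the existing global default of 0.4.
threshold_for_model() {
  local model=$1
  if [[ -n "${SEARCH_SCORE_THRESHOLD:-}" ]]; then
    echo "${SEARCH_SCORE_THRESHOLD}"   # tunable from the settings page
    return
  fi
  case "${model}" in
    nomic-embed-code) echo "0.15" ;;   # value suggested in this report
    *)                echo "0.4"  ;;   # current hardcoded default
  esac
}
```

Calling `threshold_for_model nomic-embed-code` prints `0.15`, while any unlisted model keeps the current `0.4`, so existing setups would behave as before.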
ADDITIONALLY:
- "Represent this query for searching relevant code: " is required by the model. I'm unsure if this is unique to nomic-embed-code.
- https://simonwillison.net/2025/Mar/27/nomic-embed-code/
- https://huggingface.co/nomic-ai/nomic-embed-code
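If the prefix does turn out to be model-specific, it could be handled with a small per-model lookup. The sketch below assumes the prefix applies to search queries only; the function name `query_input_for_model` is hypothetical, and the single entry is the prefix string used in the debug script above.

```bash
# Hypothetical helper: build the text sent to the embeddings endpoint for a
# search query, adding a model-specific prefix where one is known.
query_input_for_model() {
  local model=$1 query=$2
  case "${model}" in
    nomic-embed-code)
      printf '%s%s\n' 'Represent this query for searching relevant code: ' "${query}" ;;
    *)
      printf '%s\n' "${query}" ;;      # unknown models get the raw query
  esac
}
```

For example, `query_input_for_model nomic-embed-code 'user authentication'` prints `Represent this query for searching relevant code: user authentication`, and other models receive the query unchanged.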
Relevant Logs or Errors (Optional)
❯ bash qdrant_debug.sh "user authentication"
Query: user authentication
Vector length: 3584 (should be 3584)
✅ Top 5 results (no score_threshold)
{"result":[{"id":"38f29157-c61f-5337-83d3-0f5cfb1ac767","version":10,"score":0.33473045},{"id":"a71a1212-8a2a-566f-8a6b-6a239023574f","version":10,"score":0.28562117},{"id":"7e27640d-d2df-573c-b4ed-329878f85ada","version":53,"score":0.2749608},{"id":"3e7866f4-aa60-5296-8aa6-69a11097743c","version":18,"score":0.2680832},{"id":"865812f7-a510-5ee3-9a75-121315534571","version":10,"score":0.26251754}],"status":"ok","time":0.000835419}
🚫 Results with score_threshold = 0.40
Count: 0
{"result":[],"status":"ok","time":0.000705669}
🟡 Results with score_threshold = 0.15
Count: 50
{"result":[{"id":"38f29157-c61f-5337-83d3-0f5cfb1ac767","version":10,"score":0.33473045},{"id":"a71a1212-8a2a-566f-8a6b-6a239023574f","version":10,"score":0.28562117},{"id":"7e27640d-d2df-573c-b4ed-329878f85ada","version":53,"score":0.2749608},{"id":"3e7866f4-aa60-5296-8aa6-69a11097743c","version":18,"score":0.2680832},{"id":"865812f7-a510-5ee3-9a75-121315534571","version":10,"score":0.26251754},{"id":"3a176d41-6774-5f9e-8c0c-780e5b399349","version":53,"score":0.24374674},{"id":"e781d01b-b5e4-5da9-9c25-4cd0205b843d","version":53,"score":0.24362029},{"id":"2658f249-669d-5043-b0f0-c1878a97c377","version":10,"score":0.24203128},{"id":"81f5c4a5-0869-557b-b5fe-7dfdbc2437b1","version":60,"score":0.23435998},{"id":"210cb401-7f8f-5c07-bbe0-982315e0cf94","version":34,"score":0.2278384},{"id":"16a4f811-979e-5427-a40f-27be99968043","version":12,"score":0.22783059},{"id":"86cf471c-23ff-5188-9466-a787876409c7","version":12,"score":0.22453013},{"id":"33660386-ff6c-5b21-a4a9-0876a7d7850b","version":17,"score":0.22215551},{"id":"c8e244d4-e0bf-53ee-a5cb-a63d6a0b09e6","version":53,"score":0.2219722},{"id":"727a9dde-0cba-5db0-bfab-d7b8ed551397","version":55,"score":0.22156988},{"id":"796074c9-7807-5bfa-a03c-bed4e8bbfb4d","version":10,"score":0.22054857},{"id":"5872c93d-b45f-5d0f-b08f-b812e0edf392","version":10,"score":0.21486811},{"id":"bf3e07ff-99a4-5473-bf28-205edc0ba0ce","version":10,"score":0.21298264},{"id":"f56f22ee-6e0d-5e64-a636-db596a337ccb","version":65,"score":0.21070097},{"id":"987a8e8e-f602-5bbb-a5d2-b29ef3fab2f8","version":55,"score":0.21006832},{"id":"adf7e7be-d755-5631-96e6-ed285e897293","version":55,"score":0.21006832},{"id":"74be01c2-1b24-5ae9-8b07-e80a0608b15a","version":55,"score":0.21006832},{"id":"a11b1ca5-4b4e-5681-9cfb-d657c259bdd8","version":55,"score":0.2100571},{"id":"1f76c7e0-a5b1-55cb-a1f4-c9df84cf02a2","version":10,"score":0.20893657},{"id":"83e575c6-9071-58a6-93e9-586e4f50941b","version":30,"score":0.20667195},{"id":"ce8dfd42-903a-549e-a60b-fb73582aad26"
,"version":13,"score":0.2035707},{"id":"e46ca85c-8dee-594b-bb13-2e51dfb1e3f4","version":10,"score":0.19638401},{"id":"cdac8cc4-a9c3-5171-8cff-27e2ff77969b","version":31,"score":0.19503045},{"id":"211701b7-1c46-5bee-a6bc-0dd881c4c239","version":34,"score":0.19351412},{"id":"50ec5ccc-5ba8-52b1-8e2f-c1cee138a89d","version":15,"score":0.19167314},{"id":"1db2fdd0-4921-5ffe-b150-6da2d50ab5fb","version":10,"score":0.19081627},{"id":"f950d64f-6c0b-522b-bdf1-47abf6885645","version":14,"score":0.188555},{"id":"9bea4f7a-a51e-5ad3-b10d-6dfd14a8892d","version":77,"score":0.18814057},{"id":"1391843a-0006-50e5-a436-75944f52b05c","version":15,"score":0.18806493},{"id":"6baad56d-5200-5975-b143-2cf4ee054bab","version":10,"score":0.18738957},{"id":"785c2f5b-eff1-58d1-8864-674a75b044c6","version":30,"score":0.18574986},{"id":"d737af7d-7bbf-513c-acad-66e4e0179357","version":14,"score":0.18165587},{"id":"fa5b994c-e2f2-5e4c-b766-fec83c055dd8","version":14,"score":0.18165587},{"id":"e98ca363-2463-5f3a-ad35-0a6462026399","version":77,"score":0.1810772},{"id":"5adcfc65-6194-559c-9e8e-da2bcb5166d9","version":10,"score":0.1786707},{"id":"01217221-33d9-5218-a1be-eb546fb07f18","version":10,"score":0.1765967},{"id":"7d911de8-9908-5b59-853f-273c7f4de6de","version":53,"score":0.17619362},{"id":"62828bdd-c17f-506f-baac-2209a978c10a","version":15,"score":0.1745193},{"id":"9c2a4051-6665-531e-b484-652dadd2638a","version":15,"score":0.17087604},{"id":"699aad64-5925-5248-8e35-5c4a0fdfa57b","version":17,"score":0.17049417},{"id":"76301559-e095-51be-96e0-e29274a5e8c0","version":57,"score":0.16999713},{"id":"4333c069-295c-592c-8008-a495ff41bb99","version":77,"score":0.16764553},{"id":"0bd52652-2673-5b4f-baa4-7300388e5f05","version":13,"score":0.16714028},{"id":"303f9fe2-b63f-5a3c-930b-65eded701807","version":74,"score":0.1670577},{"id":"deabcbde-fe3b-599f-9133-f52707a3dbb6","version":41,"score":0.1628649}],"status":"ok","time":0.000702127}
Score distribution (top 10, no threshold)
| Rank | Point ID | Score |
|---|---|---|
| 1 | 38f29157-c61f-5337-83d3-0f5cfb1ac767 | 0.3347 |
| 2 | a71a1212-8a2a-566f-8a6b-6a239023574f | 0.2856 |
| 3 | 7e27640d-d2df-573c-b4ed-329878f85ada | 0.2750 |
| 4 | 3e7866f4-aa60-5296-8aa6-69a11097743c | 0.2681 |
| 5 | 865812f7-a510-5ee3-9a75-121315534571 | 0.2625 |
| 6 | 3a176d41-6774-5f9e-8c0c-780e5b399349 | 0.2437 |
| 7 | e781d01b-b5e4-5da9-9c25-4cd0205b843d | 0.2436 |
| 8 | 2658f249-669d-5043-b0f0-c1878a97c377 | 0.2420 |
| 9 | 81f5c4a5-0869-557b-b5fe-7dfdbc2437b1 | 0.2344 |
| 10 | 210cb401-7f8f-5c07-bbe0-982315e0cf94 | 0.2278 |