Description
When using qdrant-client in local mode (:memory: or path-based), the score_threshold parameter is not applied to FusionQuery (RRF/DBSF) results. The same parameter works correctly for regular vector searches and when connecting to a remote Qdrant server.
Proposed Fix:
Root Cause
The bug is in qdrant_client/local/local_collection.py in the _merge_sources function (around lines 800-839).
The function receives score_threshold as a parameter but never uses it for FusionQuery or RrfQuery:
def _merge_sources(
    self,
    sources: list[list[types.ScoredPoint]],
    query: types.Query,
    limit: int,
    offset: int,
    using: Optional[str] = None,
    query_filter: Optional[types.Filter] = None,
    score_threshold: Optional[float] = None,  # <-- RECEIVED BUT NEVER USED FOR FUSION
    ...
) -> list[types.ScoredPoint]:
    if isinstance(query, (models.FusionQuery, models.RrfQuery)):
        # ... fusion logic ...
        return fused[offset:]  # <-- score_threshold NOT applied!

Compare to the Qdrant server implementation, which correctly applies score_threshold:
https://github.com/qdrant/qdrant/blob/main/lib/collection/src/shards/local_shard/query.rs#L430-L438
let top_fused: Vec<_> = if let Some(score_threshold) = score_threshold {
    fused
        .into_iter()
        .take_while(|point| point.score >= score_threshold)
        .take(limit)
        .collect()
} else {
    fused.into_iter().take(limit).collect()
};

Reproduction
import asyncio

from qdrant_client import AsyncQdrantClient, models


async def main():
    client = AsyncQdrantClient(":memory:")  # Local mode
    await client.create_collection(
        collection_name="test",
        vectors_config={"dense": models.VectorParams(size=4, distance=models.Distance.COSINE)},
    )
    await client.upsert(
        collection_name="test",
        points=[
            models.PointStruct(id=1, vector={"dense": [1.0, 0.0, 0.0, 0.0]}, payload={"name": "a"}),
            models.PointStruct(id=2, vector={"dense": [0.9, 0.1, 0.0, 0.0]}, payload={"name": "b"}),
            models.PointStruct(id=3, vector={"dense": [0.5, 0.5, 0.0, 0.0]}, payload={"name": "c"}),
            models.PointStruct(id=4, vector={"dense": [0.0, 1.0, 0.0, 0.0]}, payload={"name": "d"}),
        ],
    )
    # BUG: All 4 results returned despite score_threshold=0.4
    results = await client.query_points(
        collection_name="test",
        prefetch=[models.Prefetch(query=[1.0, 0.0, 0.0, 0.0], using="dense", limit=10)],
        query=models.FusionQuery(fusion=models.Fusion.RRF),
        limit=10,
        score_threshold=0.4,  # Should filter results with score < 0.4
    )
    print(f"Results: {len(results.points)}")
    for p in results.points:
        print(f"  id={p.id}, score={p.score:.4f}")
    # Output shows results with score < 0.4:
    #   id=1, score=0.5000
    #   id=2, score=0.3333  <-- below threshold!
    #   id=3, score=0.2500  <-- below threshold!
    #   id=4, score=0.2000  <-- below threshold!


asyncio.run(main())

Expected Behavior
Fusion query results should be filtered by score_threshold, returning only results with score >= score_threshold.
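To make the expected semantics concrete, here is a minimal, library-free sketch of RRF fusion followed by a threshold cut that mirrors the take_while/take logic in the server snippet. The names Scored, rrf_fuse, and apply_threshold are hypothetical and not part of qdrant-client, and the ranking constant k=2 is an assumption inferred from the scores in the reproduction output (0.5, 0.3333, 0.25, 0.2), not taken from the library source.

```python
from dataclasses import dataclass
from itertools import takewhile
from typing import Optional


@dataclass
class Scored:
    """Hypothetical stand-in for types.ScoredPoint (id and score only)."""
    id: int
    score: float


def rrf_fuse(sources: list[list[int]], k: int = 2) -> list[Scored]:
    """Reciprocal rank fusion over ranked id lists.

    k=2 is an assumption chosen to match the scores shown in the reproduction.
    """
    totals: dict[int, float] = {}
    for ranked_ids in sources:
        for rank, point_id in enumerate(ranked_ids):
            totals[point_id] = totals.get(point_id, 0.0) + 1.0 / (k + rank)
    fused = [Scored(point_id, score) for point_id, score in totals.items()]
    # Highest fused score first, so a threshold cut can stop at the first miss.
    return sorted(fused, key=lambda p: p.score, reverse=True)


def apply_threshold(
    fused: list[Scored], score_threshold: Optional[float], limit: int
) -> list[Scored]:
    """Keep the leading points with score >= score_threshold, then cap at limit."""
    if score_threshold is not None:
        fused = list(takewhile(lambda p: p.score >= score_threshold, fused))
    return fused[:limit]


# One prefetch that ranked the four points in id order, as in the reproduction.
fused = rrf_fuse([[1, 2, 3, 4]])
kept = apply_threshold(fused, score_threshold=0.4, limit=10)
print([p.id for p in kept])  # [1]
```

Because the fused list is sorted by descending score, stopping at the first point below the threshold gives the same result as filtering every point, so either approach should match the server for this case.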
Actual Behavior
All fusion results are returned regardless of score_threshold.
Suggested Fix
Apply score_threshold filtering after fusion in _merge_sources:
if isinstance(query, (models.FusionQuery, models.RrfQuery)):
    # ... existing fusion logic ...
    # Apply score_threshold (matching server behavior)
    if score_threshold is not None:
        fused = [p for p in fused if p.score >= score_threshold]
    # Fetch payload and vectors
    # ...
    return fused[offset:]

Environment
- qdrant-client version: 1.13.3
- Python version: 3.12
- Affects: Local mode only (:memory: and path-based storage)
- Does NOT affect: Remote server connections (server applies threshold correctly)