Describe the bug
[Refactor] Upgrade MongoDB Vector Store from legacy knnBeta to stable vectorSearch
Summary
The current implementation of mem0/vector_stores/mongodb.py uses the deprecated knnVector index type and legacy index definition structure. MongoDB Atlas has moved the Vector Search feature to General Availability (GA), changing the syntax for index creation and search.
The Problem
In the create_col method (lines 66-83), the code defines the index using the legacy mappings syntax:
# CURRENT CODE (Deprecated)
definition={
    "mappings": {
        "dynamic": False,
        "fields": {
            "embedding": {
                "type": "knnVector",  # <--- DEPRECATED
                "dimensions": self.embedding_model_dims,
                "similarity": self.SIMILARITY_METRIC,
            }
        },
    }
}
Issues:
- Deprecation: knnVector is legacy. The correct type is vector.
- Recall/Accuracy: The search method sets numCandidates equal to limit (line 144). For HNSW indexes, numCandidates should be significantly higher (10x-20x) than the limit to ensure accurate results.
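To make the deprecation point easy to check against an existing index, here is a minimal sketch of a detector for the legacy shape. The helper name find_legacy_knn_fields and the sample dimensions (1536) are hypothetical and not part of mem0; it only inspects top-level entries under mappings.fields, which is the shape create_col produces today.

```python
from typing import Any, Dict, List


def find_legacy_knn_fields(definition: Dict[str, Any]) -> List[str]:
    """Return the field paths declared with the legacy knnVector type.

    Hypothetical helper for illustration only. It looks at top-level
    entries under mappings.fields and ignores nested documents.
    """
    fields = definition.get("mappings", {}).get("fields", {})
    return [
        path
        for path, spec in fields.items()
        if isinstance(spec, dict) and spec.get("type") == "knnVector"
    ]


# Same shape as the current create_col definition, with sample values.
legacy_definition = {
    "mappings": {
        "dynamic": False,
        "fields": {
            "embedding": {
                "type": "knnVector",
                "dimensions": 1536,  # sample value, not taken from mem0
                "similarity": "cosine",
            }
        },
    }
}

print(find_legacy_knn_fields(legacy_definition))  # ['embedding']
```

A definition already in the fields-list format has no mappings key, so the helper returns an empty list for it.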
Proposed Solution
Update the MongoDB class to use the stable vectorSearch index type and numDimensions.
1. Update Index Creation (create_col)
Replace the mappings definition with the fields list format required for type="vectorSearch".
# NEW IMPLEMENTATION
search_index_model = SearchIndexModel(
    name=self.index_name,
    type="vectorSearch",  # Explicitly set index type
    definition={
        "fields": [
            {
                "type": "vector",
                "path": "embedding",
                "numDimensions": self.embedding_model_dims,  # Note: 'numDimensions' not 'dimensions'
                "similarity": self.SIMILARITY_METRIC,
            }
        ]
    },
)
2. Update Search Logic (search)
In the $vectorSearch pipeline (line 144), increase numCandidates to improve search accuracy.
# NEW IMPLEMENTATION
pipeline = [
    {
        "$vectorSearch": {
            "index": self.index_name,
            "path": "embedding",
            "queryVector": vectors,
            "limit": limit,
            "numCandidates": limit * 20,  # Recommended: 10x-20x the limit
        }
    },
    {"$set": {"score": {"$meta": "vectorSearchScore"}}},
    {"$project": {"embedding": 0}},
]
Acceptance Criteria
- Index creation uses type="vectorSearch".
- Field definition uses numDimensions instead of dimensions.
- Search pipeline explicitly sets numCandidates > limit.
- knnVector references removed.
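The criteria above could be checked without a live cluster by building the definition and pipeline as plain dictionaries. The sketch below is one possible shape, not the final mem0 code: build_index_definition, build_search_pipeline, the multiplier parameter, and the sample values are all hypothetical names chosen for illustration. It also assumes the Atlas-documented upper bound of 10000 on numCandidates, which a bare limit * 20 would exceed once limit goes above 500; please verify that bound against the current Atlas docs before relying on it.

```python
from typing import Any, Dict, List

# Assumption: Atlas documents numCandidates as at most 10000.
MAX_NUM_CANDIDATES = 10_000


def build_index_definition(dims: int, similarity: str = "cosine") -> Dict[str, Any]:
    """Build a vectorSearch index definition in the fields-list format."""
    return {
        "fields": [
            {
                "type": "vector",
                "path": "embedding",
                "numDimensions": dims,
                "similarity": similarity,
            }
        ]
    }


def build_search_pipeline(
    index_name: str,
    query_vector: List[float],
    limit: int,
    multiplier: int = 20,
) -> List[Dict[str, Any]]:
    """Build the $vectorSearch pipeline with numCandidates scaled from limit."""
    if not 1 <= limit <= MAX_NUM_CANDIDATES:
        raise ValueError(f"limit must be between 1 and {MAX_NUM_CANDIDATES}")
    if multiplier < 1:
        raise ValueError("multiplier must be at least 1")
    # Oversample for recall, but stay within the assumed Atlas cap.
    num_candidates = min(limit * multiplier, MAX_NUM_CANDIDATES)
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": "embedding",
                "queryVector": query_vector,
                "limit": limit,
                "numCandidates": num_candidates,
            }
        },
        {"$set": {"score": {"$meta": "vectorSearchScore"}}},
        {"$project": {"embedding": 0}},
    ]


definition = build_index_definition(1536)  # sample dimension count
stage = build_search_pipeline("example_index", [0.1, 0.2, 0.3], limit=5)[0]["$vectorSearch"]
assert "dimensions" not in definition["fields"][0]
assert stage["numCandidates"] > stage["limit"]
```

Note that with the cap in place, numCandidates equals limit when limit is exactly 10000, so the "numCandidates > limit" criterion holds only for smaller limits.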