Problem
The system uses all-MiniLM-L6-v2 as its embedding model (hardcoded default in retrievers.py). This is a 6-layer, 22M parameter model optimised for speed. It performs reasonably on general semantic similarity but struggles with:
- Short, terse, factual text (typical of memory notes)
- Domain-specific terminology
- Nuanced queries that require deeper semantic understanding
Options worth evaluating
| Model | Params | Notes |
|---|---|---|
| all-MiniLM-L6-v2 (current) | 22M | Fast, low quality ceiling |
| all-MiniLM-L12-v2 | 33M | Same family, 2× layers, meaningful quality bump for low cost |
| all-mpnet-base-v2 | 109M | Best general-purpose SentenceTransformer, strong on short texts |
| nomic-embed-text (via Ollama) | — | Keeps everything local and on-GPU, fits the project's local-only stance |
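A small recall@k harness makes the comparison concrete. The sketch below takes any batch embedding function, so each candidate model plugs in the same way; everything here (the function names, the toy data shape) is illustrative, not the project's actual API:

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def recall_at_k(embed, queries, corpus, relevant, k=3):
    """Fraction of queries whose top-k results contain a relevant doc.

    embed:    fn(list[str]) -> list[vector], e.g. a model's batch encoder
    relevant: dict mapping query index -> set of relevant corpus indices
    """
    doc_vecs = embed(corpus)
    hits = 0
    for qi, query in enumerate(queries):
        qv = embed([query])[0]
        ranked = sorted(range(len(corpus)),
                        key=lambda di: cosine(qv, doc_vecs[di]),
                        reverse=True)
        if relevant[qi] & set(ranked[:k]):
            hits += 1
    return hits / len(queries)
```

With sentence-transformers installed, `embed` would be something like `lambda texts: SentenceTransformer(name).encode(texts)` for each model name in the table; the relevance labels would come from hand-picking a few dozen query/memory pairs out of the existing store.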
Suggested approach
- Fix the L2 → cosine metric bug first (#24: the ChromaDB collection uses L2 distance instead of cosine, which degrades semantic search quality) so that benchmarks are meaningful
- Run a small retrieval eval against the existing memory store with each model
- Make the model name configurable via an `AMEM_EMBEDDING_MODEL` env var (it's already a constructor parameter; it just needs wiring to the env)
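The env wiring could be as small as the following sketch (`AMEM_EMBEDDING_MODEL` is the name proposed in this issue; the `Retriever` call in the usage note is a stand-in for whatever constructor `retrievers.py` actually exposes):

```python
import os

# Default matches the current hardcoded model in retrievers.py.
DEFAULT_EMBEDDING_MODEL = "all-MiniLM-L6-v2"

def resolve_embedding_model() -> str:
    # A set, non-empty AMEM_EMBEDDING_MODEL wins; otherwise fall back
    # to the existing default so current installs behave unchanged.
    return os.environ.get("AMEM_EMBEDDING_MODEL") or DEFAULT_EMBEDDING_MODEL
```

Call sites would then pass it through, e.g. `Retriever(model_name=resolve_embedding_model())`, keeping the constructor parameter as the single source of truth.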
Note
Switching models on an existing persistent collection requires rebuilding the index (same migration caveat as #24). The MCP server's in-memory collection rebuilds fresh each session, so it's unaffected.
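The rebuild itself is mechanical: pull every stored document out of the old collection, re-embed with the new model, and write a fresh collection. A store-agnostic sketch (the document format and `embed` callable are placeholders, not the project's actual persistence API):

```python
def rebuild_index(documents, embed):
    """Re-embed all documents for a new embedding model.

    documents: list of (doc_id, text) pairs read from the old collection
    embed:     the *new* model's batch embedding function
    Returns (ids, texts, vectors) ready to write into a fresh collection;
    the old vectors are simply discarded.
    """
    ids = [doc_id for doc_id, _ in documents]
    texts = [text for _, text in documents]
    vectors = embed(texts)  # one batch pass with the new model
    return ids, texts, vectors
```

The important invariant is that query-time and index-time embeddings always come from the same model, which is why mixing old and new vectors in one collection is not an option.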