Add FAISS Vector Database Support
Overview
Integrate FAISS as a local-only vector database option for the MCP server, providing a zero-configuration alternative to Milvus and Qdrant.
Motivation
- Zero setup: No separate vector database server required
- Lower barrier to entry: Users can start indexing immediately without cloud setup
- Local development: Ideal for personal projects and small-to-medium codebases
- Cost savings: No infrastructure or cloud service costs
Current State
- `faiss-node` is already in dependencies but not used
- Currently supports: Milvus (gRPC/RESTful) and Qdrant (gRPC)
- Both require external server setup
Proposed Solution
Architecture
New file: packages/core/src/vectordb/faiss-vectordb.ts
Storage structure:
~/.context/faiss-indexes/
└── {collection_hash}/
    ├── dense.index      # FAISS index file
    ├── sparse.json      # BM25 model (vocabulary, IDF)
    └── metadata.json    # Document metadata
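The layout above could be resolved in code roughly as follows. This is a minimal sketch: the helper name `collectionPaths` and the derivation of `{collection_hash}` (a SHA-256 of the collection name, truncated to 16 hex characters) are assumptions, since the issue does not specify how the hash is computed.

```typescript
import { createHash } from 'node:crypto';
import * as os from 'node:os';
import * as path from 'node:path';

// Paths of the three files that make up one collection on disk.
interface CollectionPaths {
  dir: string;
  denseIndex: string;
  sparseModel: string;
  metadata: string;
}

// Hypothetical helper: the hashing scheme is an assumption, not part of the proposal.
function collectionPaths(
  collectionName: string,
  storageDir: string = path.join(os.homedir(), '.context', 'faiss-indexes'),
): CollectionPaths {
  const hash = createHash('sha256').update(collectionName).digest('hex').slice(0, 16);
  const dir = path.join(storageDir, hash);
  return {
    dir,
    denseIndex: path.join(dir, 'dense.index'),
    sparseModel: path.join(dir, 'sparse.json'),
    metadata: path.join(dir, 'metadata.json'),
  };
}
```

Hashing the collection name keeps directory names filesystem-safe regardless of which characters the name contains.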
Key Features
- File-based persistence: Store indexes as `.faiss` files
- Hybrid search: Reuse existing `SimpleBM25` for keyword matching
- Auto-selection: Default to FAISS when no external DB configured
- Same interface: Implement `VectorDatabase` interface
Implementation Plan
Phase 1: Core Implementation
- Create `FaissVectorDatabase` class extending `BaseVectorDatabase`
- Implement collection CRUD (create, drop, has, list)
- File-based persistence (save/load index files)
- Basic vector insert and search
Phase 2: Hybrid Search
- Integrate `SimpleBM25` for sparse vectors (reuse from Qdrant)
- Implement RRF (Reciprocal Rank Fusion) reranking
- Support `hybridSearch()` method
- Serialize/deserialize BM25 model to JSON
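Serializing the BM25 model to JSON might look like the sketch below. The `Bm25Model` shape is hypothetical; the actual fields of `SimpleBM25` may differ, and only "vocabulary" and "IDF" are taken from the storage layout in this issue.

```typescript
// Hypothetical model shape; the real SimpleBM25 internals are not shown in this issue.
interface Bm25Model {
  vocabulary: Map<string, number>; // term -> term id
  idf: Map<string, number>;        // term -> inverse document frequency
  avgDocLength: number;
}

// JSON.stringify turns a Map into "{}", so store each Map as an array of entries.
function serializeBm25(model: Bm25Model): string {
  return JSON.stringify({
    vocabulary: Array.from(model.vocabulary.entries()),
    idf: Array.from(model.idf.entries()),
    avgDocLength: model.avgDocLength,
  });
}

// Rebuild the Maps from the stored entry arrays.
function deserializeBm25(json: string): Bm25Model {
  const raw = JSON.parse(json) as {
    vocabulary: [string, number][];
    idf: [string, number][];
    avgDocLength: number;
  };
  return {
    vocabulary: new Map(raw.vocabulary),
    idf: new Map(raw.idf),
    avgDocLength: raw.avgDocLength,
  };
}
```

Entry arrays also avoid surprises with terms such as `constructor` or `__proto__`, which are plausible tokens in source code and awkward as plain object keys.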
Phase 3: Factory Integration
- Add `FAISS_LOCAL` to `VectorDatabaseType` enum
- Update `VectorDatabaseFactory.create()`
- Export from `packages/core/src/vectordb/index.ts`
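A rough sketch of the factory change is shown below. Only `FAISS_LOCAL`, `VectorDatabaseType`, `VectorDatabaseFactory.create()` and `FaissVectorDatabase` come from this issue; the enum's string value, the `FaissConfig` type, and the stand-in `VectorDatabase` interface are assumptions, and the existing Milvus/Qdrant members are omitted.

```typescript
// Sketch only: the existing enum members and the real factory signature are not reproduced here.
enum VectorDatabaseType {
  // ...existing Milvus/Qdrant members would remain here...
  FAISS_LOCAL = 'faiss-local', // string value is an assumption
}

// Hypothetical config type for the local backend.
interface FaissConfig {
  storageDir: string;
}

// Stand-in for the real VectorDatabase interface.
interface VectorDatabase {
  readonly kind: string;
}

class FaissVectorDatabase implements VectorDatabase {
  readonly kind = 'faiss-local';
  readonly config: FaissConfig;

  constructor(config: FaissConfig) {
    this.config = config;
  }
}

class VectorDatabaseFactory {
  static create(type: VectorDatabaseType, config: FaissConfig): VectorDatabase {
    switch (type) {
      case VectorDatabaseType.FAISS_LOCAL:
        return new FaissVectorDatabase(config);
      default:
        throw new Error(`Unsupported vector database type: ${String(type)}`);
    }
  }
}
```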
Phase 4: MCP Auto-Selection
- Add auto-detection logic in `packages/mcp/src/index.ts`
- Default to FAISS when `MILVUS_ADDRESS`, `MILVUS_TOKEN`, `QDRANT_URL` are all undefined
- Configure default storage directory
Phase 5: Testing & Documentation
- Unit tests for `FaissVectorDatabase`
- Integration tests for indexing + hybrid search workflow
- Update README with FAISS usage examples
- Add troubleshooting guide
Technical Details
FAISS Index Type: IndexFlatL2 (exact brute-force search over L2 distance, suitable for small-to-medium datasets)
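To make the cost model concrete, here is a pure-TypeScript stand-in for what a flat L2 index computes. It is an illustration only, not the `faiss-node` API; the class name `FlatL2Index` and its method signatures are made up for this sketch. Like FAISS's flat L2 index, it reports squared L2 distances.

```typescript
interface Neighbor {
  label: number;    // insertion order of the stored vector
  distance: number; // squared L2 distance to the query
}

// Illustrative brute-force index: every query scans every stored vector.
class FlatL2Index {
  private readonly dimension: number;
  private readonly vectors: number[][] = [];

  constructor(dimension: number) {
    this.dimension = dimension;
  }

  add(vector: number[]): void {
    if (vector.length !== this.dimension) {
      throw new Error(`Expected dimension ${this.dimension}, got ${vector.length}`);
    }
    this.vectors.push(vector.slice());
  }

  search(query: number[], topK: number): Neighbor[] {
    if (query.length !== this.dimension) {
      throw new Error(`Expected dimension ${this.dimension}, got ${query.length}`);
    }
    const scored = this.vectors.map((vector, label) => {
      let sum = 0;
      for (let i = 0; i < vector.length; i++) {
        const diff = vector[i] - query[i];
        sum += diff * diff;
      }
      return { label, distance: sum };
    });
    // Smallest distance first, then keep the topK nearest.
    scored.sort((a, b) => a.distance - b.distance);
    return scored.slice(0, topK);
  }
}
```

Each query costs time proportional to the number of vectors times the dimension, and all vectors stay in memory, which is where the scalability and RAM limitations of a flat index come from.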
BM25 Integration: Reuse packages/core/src/vectordb/sparse/simple-bm25.ts
Hybrid Search Flow:
- Dense search: `faissIndex.search(queryVector, topK)`
- Sparse search: `bm25.score(queryText, documents)`
- RRF fusion: Merge results with reciprocal rank scoring
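The fusion step can be sketched as follows, assuming each search returns document ids ordered best-first. The function name and signature are illustrative; `k = 60` is the constant conventionally used with RRF, not a value specified in this issue.

```typescript
interface FusedResult {
  id: string;
  score: number;
}

// Reciprocal Rank Fusion: each ranking contributes 1 / (k + rank) per document,
// where rank is the 1-based position of the document in that ranking.
function reciprocalRankFusion(rankings: string[][], k = 60, topK = 10): FusedResult[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

Because RRF uses only ranks, the dense L2 distances and the sparse BM25 scores never need to be normalized onto a common scale.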
Auto-selection logic:
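import * as os from 'node:os';
import * as path from 'node:path';

// Fall back to local FAISS only when no external vector DB is configured.
if (!process.env.MILVUS_ADDRESS && !process.env.MILVUS_TOKEN && !process.env.QDRANT_URL) {
  vectorDatabase = VectorDatabaseFactory.create(
    VectorDatabaseType.FAISS_LOCAL,
    // Node does not expand '~', so resolve the home directory explicitly.
    { storageDir: path.join(os.homedir(), '.context', 'faiss-indexes') }
  );
}
Limitations & Tradeoffs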
Advantages:
- ✅ Zero external dependencies
- ✅ Fast in-memory search
- ✅ Simple file-based storage
- ✅ Perfect for local development
Limitations:
- ⚠️ Memory constraints (entire index loads into RAM)
- ⚠️ Limited concurrency (file locking needed)
- ⚠️ Scalability limit (~100K files / 1M vectors)
- ⚠️ No advanced filtering like Milvus/Qdrant
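One simple way to approach the file-locking limitation is an exclusive lock file. The sketch below is an assumption about how this could be done, not part of the proposal; `withFileLock` is a hypothetical helper. It relies on the `'wx'` open flag, which fails with `EEXIST` if the file already exists.

```typescript
import * as fs from 'node:fs';

// Hypothetical helper: run fn while holding an exclusive lock file.
// Throws (code 'EEXIST') if another holder already created the lock.
function withFileLock<T>(lockPath: string, fn: () => T): T {
  const fd = fs.openSync(lockPath, 'wx'); // create exclusively, fail if present
  try {
    return fn();
  } finally {
    // Always release the lock, even if fn throws.
    fs.closeSync(fd);
    fs.unlinkSync(lockPath);
  }
}
```

This version does not retry and does not detect stale locks left behind by a crashed process, so a real implementation would need a timeout or an ownership check on top of it.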
Files to Modify
New files:
- `packages/core/src/vectordb/faiss-vectordb.ts` (~250 LOC)
- `packages/core/test/vectordb/faiss-vectordb.test.ts` (~150 LOC)
- `packages/core/test/integration/faiss-integration.test.ts` (~100 LOC)
Modified files:
- `packages/core/src/vectordb/factory.ts` (~20 LOC change)
- `packages/core/src/vectordb/index.ts` (~5 LOC change)
- `packages/mcp/src/index.ts` (~30 LOC change)
- `README.md` (documentation updates)
Estimated total: ~500 LOC
Success Criteria
- Users can index codebase without external DB setup
- FAISS selected automatically when no other DB configured
- Hybrid search achieves similar quality to Qdrant/Milvus
- All tests pass (unit + integration)
- Documentation includes FAISS quick start guide
References
- faiss-node: https://github.com/ewfian/faiss-node
- FAISS docs: https://faiss.ai/
- NPM stats: 1.2M downloads/year (active package)
- Existing SimpleBM25: `packages/core/src/vectordb/sparse/simple-bm25.ts`