
Add FAISS vector database support for local-only deployments #13

@amondnet

Add FAISS Vector Database Support

Overview

Integrate FAISS as a local-only vector database option for the MCP server, providing a zero-configuration alternative to Milvus and Qdrant.

Motivation

  • Zero setup: No separate vector database server required
  • Lower barrier to entry: Users can start indexing immediately without cloud setup
  • Local development: Ideal for personal projects and small-to-medium codebases
  • Cost savings: No infrastructure or cloud service costs

Current State

  • The FAISS package is already listed in dependencies but is not used
  • Currently supports: Milvus (gRPC/RESTful) and Qdrant (gRPC)
  • Both require external server setup

Proposed Solution

Architecture

New file: packages/core/src/vectordb/faiss-vectordb.ts

Storage structure:

~/.context/faiss-indexes/
  └── {collection_hash}/
      ├── dense.index        # FAISS index file
      ├── sparse.json        # BM25 model (vocabulary, IDF)
      └── metadata.json      # Document metadata
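The layout above could be resolved in code roughly as follows. This is a minimal sketch, not the project's implementation: the issue does not say how {collection_hash} is derived, so the truncated SHA-256 used here and the helper names collectionHash and resolveCollectionPaths are assumptions made for illustration.

```typescript
import { createHash } from 'node:crypto';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical helper: the hashing scheme is an assumption, not taken from the issue.
// SHA-256 truncated to 16 hex characters keeps directory names short and stable.
export function collectionHash(collectionName: string): string {
  return createHash('sha256').update(collectionName).digest('hex').slice(0, 16);
}

export interface CollectionPaths {
  dir: string;
  denseIndex: string;
  sparseModel: string;
  metadata: string;
}

// Builds the three file paths shown in the storage structure.
// Node does not expand '~', so the default directory uses os.homedir().
export function resolveCollectionPaths(
  collectionName: string,
  storageDir: string = path.join(os.homedir(), '.context', 'faiss-indexes'),
): CollectionPaths {
  const dir = path.join(storageDir, collectionHash(collectionName));
  return {
    dir,
    denseIndex: path.join(dir, 'dense.index'),
    sparseModel: path.join(dir, 'sparse.json'),
    metadata: path.join(dir, 'metadata.json'),
  };
}
```

Deriving every path from one function keeps save, load, and drop operations pointing at the same directory for a given collection.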

Key Features

  1. File-based persistence: Store each collection on disk as dense.index, sparse.json, and metadata.json
  2. Hybrid search: Reuse existing SimpleBM25 for keyword matching
  3. Auto-selection: Default to FAISS when no external DB configured
  4. Same interface: Implement VectorDatabase interface

Implementation Plan

Phase 1: Core Implementation

  • Create FaissVectorDatabase class extending BaseVectorDatabase
  • Implement collection CRUD (create, drop, has, list)
  • File-based persistence (save/load index files)
  • Basic vector insert and search

Phase 2: Hybrid Search

  • Integrate SimpleBM25 for sparse vectors (reuse from Qdrant)
  • Implement RRF (Reciprocal Rank Fusion) reranking
  • Support hybridSearch() method
  • Serialize/deserialize BM25 model to JSON
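The JSON round trip for the BM25 model might look like the sketch below. The Bm25State shape is hypothetical: the issue only says the model holds a vocabulary and IDF values, and the real SimpleBM25 fields may differ. The point it illustrates is that Map instances need converting to arrays of entries, because JSON.stringify serializes a Map as an empty object.

```typescript
// Hypothetical model shape; the actual SimpleBM25 internals are not shown in the issue.
export interface Bm25State {
  vocabulary: Map<string, number>; // term -> term index
  idf: Map<number, number>;        // term index -> IDF weight
}

// JSON-friendly form: each Map becomes an array of [key, value] pairs.
interface Bm25Json {
  vocabulary: [string, number][];
  idf: [number, number][];
}

export function serializeBm25(state: Bm25State): string {
  const json: Bm25Json = {
    vocabulary: Array.from(state.vocabulary),
    idf: Array.from(state.idf),
  };
  return JSON.stringify(json);
}

export function deserializeBm25(text: string): Bm25State {
  const json = JSON.parse(text) as Bm25Json;
  return {
    vocabulary: new Map(json.vocabulary),
    idf: new Map(json.idf),
  };
}
```

The serialized string is what would be written to sparse.json next to the dense index.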

Phase 3: Factory Integration

  • Add FAISS_LOCAL to VectorDatabaseType enum
  • Update VectorDatabaseFactory.create()
  • Export from packages/core/src/vectordb/index.ts

Phase 4: MCP Auto-Selection

  • Add auto-detection logic in packages/mcp/src/index.ts
  • Default to FAISS when MILVUS_ADDRESS, MILVUS_TOKEN, and QDRANT_URL are all undefined
  • Configure default storage directory

Phase 5: Testing & Documentation

  • Unit tests for FaissVectorDatabase
  • Integration tests for indexing + hybrid search workflow
  • Update README with FAISS usage examples
  • Add troubleshooting guide

Technical Details

FAISS Index Type: IndexFlatL2 (exact brute-force search using L2 distance; no training step, suitable for small-to-medium datasets)
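To make the cost of a flat index concrete, here is a plain TypeScript sketch of the search it performs: compare the query against every stored vector and keep the closest topK. It does not call FAISS; flatL2Search and Neighbor are illustrative names. Distances are left squared, which is also how FAISS reports them for this index type.

```typescript
export interface Neighbor {
  label: number;    // position of the vector in insertion order
  distance: number; // squared Euclidean distance to the query
}

// Squared L2 distance between two vectors of equal length.
function squaredL2(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return sum;
}

// Exhaustive search: O(n * d) work per query, no approximation.
export function flatL2Search(
  vectors: number[][],
  query: number[],
  topK: number,
): Neighbor[] {
  return vectors
    .map((vector, label) => ({ label, distance: squaredL2(vector, query) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, topK);
}
```

Because every query scans all vectors, latency grows linearly with collection size, which is why this index type suits the smaller collections targeted here.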

BM25 Integration: Reuse packages/core/src/vectordb/sparse/simple-bm25.ts

Hybrid Search Flow:

  1. Dense search: faissIndex.search(queryVector, topK)
  2. Sparse search: bm25.score(queryText, documents)
  3. RRF fusion: Merge results with reciprocal rank scoring
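Step 3 can be sketched as below, assuming each search returns document IDs ordered best first. Every document receives 1 / (k + rank) from each list it appears in, and the contributions are summed. The function name and the default k = 60 are assumptions; the issue does not specify a constant.

```typescript
export interface FusedResult {
  id: string;
  score: number;
}

// Reciprocal Rank Fusion over any number of ranked ID lists.
// Ranks are 1-based, so the top item of a list contributes 1 / (k + 1).
export function reciprocalRankFusion(
  rankings: string[][],
  topK: number,
  k = 60, // assumed smoothing constant, not taken from the issue
): FusedResult[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const previous = scores.get(id) ?? 0;
      scores.set(id, previous + 1 / (k + index + 1));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Usage: fuse a dense ranking with a sparse ranking.
const fused = reciprocalRankFusion(
  [['a', 'b', 'c'], ['b', 'd', 'a']],
  3,
);
console.log(fused.map((r) => r.id)); // [ 'b', 'a', 'd' ]
```

RRF uses only ranks, so the L2 distances from the dense search and the BM25 scores from the sparse search never need to be normalized onto a common scale.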

Auto-selection logic:

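import * as os from 'node:os';
import * as path from 'node:path';

// Fall back to FAISS only when no external database is configured.
const { MILVUS_ADDRESS, MILVUS_TOKEN, QDRANT_URL } = process.env;
if (!MILVUS_ADDRESS && !MILVUS_TOKEN && !QDRANT_URL) {
  vectorDatabase = VectorDatabaseFactory.create(
    VectorDatabaseType.FAISS_LOCAL,
    // Node does not expand '~', so resolve the home directory explicitly.
    { storageDir: path.join(os.homedir(), '.context', 'faiss-indexes') }
  );
}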

Limitations & Tradeoffs

Advantages:

  • ✅ No external services required
  • ✅ Fast in-memory search
  • ✅ Simple file-based storage
  • ✅ Perfect for local development

Limitations:

  • ⚠️ Memory constraints (entire index loads into RAM)
  • ⚠️ Limited concurrency (file locking needed)
  • ⚠️ Scalability limit (~100K files / 1M vectors)
  • ⚠️ No advanced filtering like Milvus/Qdrant
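The memory constraint can be estimated with simple arithmetic: a flat index stores every vector as 32-bit floats, so the raw vector data takes roughly vectorCount × dimension × 4 bytes. The sketch below applies that to the 1M-vector ceiling; the 1536-dimension figure is an assumed example, since the issue does not state an embedding size.

```typescript
const BYTES_PER_FLOAT32 = 4;
const BYTES_PER_GIB = 1024 * 1024 * 1024;

// Raw vector storage for a flat index, excluding metadata and the BM25 model.
export function estimateFlatIndexBytes(vectorCount: number, dimension: number): number {
  return vectorCount * dimension * BYTES_PER_FLOAT32;
}

// Assumed example: 1M vectors at 1536 dimensions.
const bytes = estimateFlatIndexBytes(1_000_000, 1536);
console.log(`${(bytes / BYTES_PER_GIB).toFixed(2)} GiB`); // 5.72 GiB
```

Smaller embedding dimensions reduce the figure proportionally, so the practical ceiling depends heavily on the embedding model in use.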

Files to Modify

New files:

  • packages/core/src/vectordb/faiss-vectordb.ts (~250 LOC)
  • packages/core/test/vectordb/faiss-vectordb.test.ts (~150 LOC)
  • packages/core/test/integration/faiss-integration.test.ts (~100 LOC)

Modified files:

  • packages/core/src/vectordb/factory.ts (~20 LOC change)
  • packages/core/src/vectordb/index.ts (~5 LOC change)
  • packages/mcp/src/index.ts (~30 LOC change)
  • README.md (documentation updates)

Estimated total: ~500 LOC

Success Criteria

  • Users can index codebase without external DB setup
  • FAISS selected automatically when no other DB configured
  • Hybrid search achieves similar quality to Qdrant/Milvus
  • All tests pass (unit + integration)
  • Documentation includes FAISS quick start guide
