Refactor FAISS implementation to meet STANDARDS.md compliance #42

@amondnet

Summary

Refactor packages/core/src/vectordb/faiss-vectordb.ts (897 LOC) to comply with STANDARDS.md file size limits (≤300 LOC) and improve code maintainability.

Background

After implementing FAISS vector database support (#13), the implementation file has grown to 897 lines, exceeding the STANDARDS.md guideline of ≤300 LOC per file. Additionally, several functions exceed the 50 LOC limit.

Current State

  • faiss-vectordb.ts: 897 LOC (target: ≤300 LOC)
  • Multiple functions >50 LOC:
    • loadCollection(): ~70 LOC
    • saveCollection(): ~65 LOC
    • hybridSearch(): ~80 LOC
    • applyRRF(): ~55 LOC

Proposed Solution

Phase 1: Type Extraction

  • Create faiss/faiss-types.ts for type definitions
  • Extract FaissConfig, CollectionMetadata, DocumentMetadata, CollectionData
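As a rough sketch of what the extracted module might contain: the four type names come from this issue, but every field and the `createEmptyCollection` helper are assumptions for illustration only. The real shapes should be copied from the current `faiss-vectordb.ts`.

```typescript
// faiss/faiss-types.ts (sketch; all field names below are hypothetical)

export interface FaissConfig {
  storageDir: string; // assumed: root directory for persisted collections
}

export interface CollectionMetadata {
  name: string;
  dimension: number;
  isHybrid: boolean; // assumed: whether BM25 data is stored alongside the index
  documentCount: number;
}

export interface DocumentMetadata {
  id: string;
  content: string;
}

export interface CollectionData {
  metadata: CollectionMetadata;
  documents: Map<string, DocumentMetadata>;
}

// Hypothetical helper, included only so the types can be exercised.
export function createEmptyCollection(
  name: string,
  dimension: number,
  isHybrid: boolean = false,
): CollectionData {
  return {
    metadata: { name, dimension, isHybrid, documentCount: 0 },
    documents: new Map<string, DocumentMetadata>(),
  };
}
```

Keeping this file free of runtime logic (apart from the illustrative helper) lets the storage and indexer modules import it without creating circular dependencies.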

Phase 2: Composition Pattern

Split into focused modules using composition:

  1. faiss/faiss-storage.ts (~250 LOC)
    • File I/O operations
    • loadCollection(), saveCollection()
    • Error handling for file operations

  2. faiss/faiss-indexer.ts (~250 LOC)
    • Search logic
    • search(), hybridSearch(), applyRRF()
    • BM25 integration

  3. faiss/faiss-vectordb.ts (~300 LOC)
    • Main class implementing VectorDatabase
    • Composition: uses FaissStorage and FaissIndexer
    • Public API orchestration
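The composition described above could look roughly like the following. This is a minimal sketch, not the proposed implementation: the method signatures are assumptions, the in-memory storage and brute-force dot-product search are stand-ins for the real file I/O and FAISS calls, the `VectorDatabase` interface is omitted, and everything is synchronous for brevity where real file operations would be async.

```typescript
interface SearchHit {
  id: string;
  score: number;
}

type VectorMap = Map<string, number[]>;

// Assumed shape of the storage collaborator (file I/O in the real module).
interface FaissStorage {
  loadCollection(name: string): VectorMap;
  saveCollection(name: string, vectors: VectorMap): void;
}

// Assumed shape of the search collaborator.
interface FaissIndexer {
  search(vectors: VectorMap, query: number[], topK: number): SearchHit[];
}

// Stand-in storage that keeps collections in memory instead of on disk.
class InMemoryStorage implements FaissStorage {
  private collections = new Map<string, VectorMap>();

  loadCollection(name: string): VectorMap {
    const stored = this.collections.get(name);
    return stored ? new Map(stored) : new Map<string, number[]>();
  }

  saveCollection(name: string, vectors: VectorMap): void {
    this.collections.set(name, new Map(vectors));
  }
}

function dot(a: number[], b: number[]): number {
  return a.reduce((sum, value, i) => sum + value * (b[i] ?? 0), 0);
}

// Stand-in indexer: exhaustive dot-product scoring instead of a FAISS index.
class BruteForceIndexer implements FaissIndexer {
  search(vectors: VectorMap, query: number[], topK: number): SearchHit[] {
    const hits: SearchHit[] = [];
    vectors.forEach((vector, id) => {
      hits.push({ id, score: dot(vector, query) });
    });
    return hits.sort((a, b) => b.score - a.score).slice(0, topK);
  }
}

// The main class only orchestrates; it owns no I/O or ranking logic itself.
class FaissVectorDatabase {
  private readonly storage: FaissStorage;
  private readonly indexer: FaissIndexer;

  constructor(storage: FaissStorage, indexer: FaissIndexer) {
    this.storage = storage;
    this.indexer = indexer;
  }

  insert(collection: string, id: string, vector: number[]): void {
    const vectors = this.storage.loadCollection(collection);
    vectors.set(id, vector);
    this.storage.saveCollection(collection, vectors);
  }

  search(collection: string, query: number[], topK: number): SearchHit[] {
    const vectors = this.storage.loadCollection(collection);
    return this.indexer.search(vectors, query, topK);
  }
}
```

Because the collaborators are injected through the constructor, tests can pass stubs like the two above without touching the file system.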

Phase 3: Function Refactoring

Break down large functions into smaller units (≤50 LOC):

  • loadCollection() → extract metadata/index/documents/bm25 loading
  • saveCollection() → extract individual save operations
  • hybridSearch() → extract dense/sparse search logic
  • applyRRF() → extract ranking calculation
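To illustrate the last item, here is one way the ranking calculation might be pulled out of `applyRRF()`. It uses the standard reciprocal rank fusion formula, score = Σ 1 / (k + rank). The helper names, the `RankedHit` shape, and the default `k = 60` are assumptions; the existing function may use different parameters or weighting.

```typescript
interface RankedHit {
  id: string;
  score: number;
}

// Contribution of a single list position; rank is 1-based.
function rrfContribution(rank: number, k: number): number {
  return 1 / (k + rank);
}

// Adds one ranked list's contributions into the running fused scores.
function accumulateRanks(
  fused: Map<string, number>,
  hits: RankedHit[],
  k: number,
): void {
  hits.forEach((hit, index) => {
    const previous = fused.get(hit.id) ?? 0;
    fused.set(hit.id, previous + rrfContribution(index + 1, k));
  });
}

// Orchestrator: fuses dense and sparse result lists, best fused score first.
function applyRRF(
  dense: RankedHit[],
  sparse: RankedHit[],
  k: number = 60,
): RankedHit[] {
  const fused = new Map<string, number>();
  accumulateRanks(fused, dense, k);
  accumulateRanks(fused, sparse, k);
  return Array.from(fused.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

With the arithmetic isolated in `rrfContribution()` and `accumulateRanks()`, each piece stays well under the 50 LOC limit and can be unit-tested without running a search.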

Benefits

  • ✅ STANDARDS.md compliance (file ≤300 LOC, functions ≤50 LOC)
  • ✅ Single Responsibility Principle
  • ✅ Improved testability (each module can be tested independently)
  • ✅ Better maintainability and readability
  • ✅ Easier to add new features (e.g., different index types)

Risks

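  • Breaking changes if existing exports are not preserved
  • Existing tests may need updating if they depend on internals that move, which would conflict with the acceptance criteria
  • More files to navigate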

Acceptance Criteria

  • All files ≤300 LOC
  • All functions ≤50 LOC
  • All existing tests pass without modification
  • No breaking changes to public API
  • TypeScript compilation succeeds
  • Linting passes
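The first criterion could be checked mechanically. The sketch below assumes LOC means non-blank lines, which may not match how STANDARDS.md counts; both function names are hypothetical, and the file contents are passed in as strings so the check stays independent of how files are read.

```typescript
// Assumption: LOC = number of non-blank lines.
function countLoc(source: string): number {
  return source
    .split(/\r?\n/)
    .filter((line) => line.trim().length > 0).length;
}

// Returns the paths whose content exceeds the limit (default 300 LOC).
function findOversizedFiles(
  files: Map<string, string>,
  maxLoc: number = 300,
): string[] {
  const offenders: string[] = [];
  files.forEach((source, path) => {
    if (countLoc(source) > maxLoc) {
      offenders.push(path);
    }
  });
  return offenders;
}
```

A check like this could run in CI next to linting so the limit is enforced after the refactor as well.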

Related

Priority

P2 - Important but not urgent (technical debt)

Estimate

2-3 hours

