Vector search uses ~9.4 GB RSS for 1M INT8 vectors (~4 GB raw) #3144

@tae898

Summary
On MSMARCO-1M, vector search peaks at about 9.4 GB RSS. The raw vectors are ~4 GB; even after adding ~1 GB for the index, a few hundred MB for the graph, and JVM/cache overhead, the total should land closer to ~6–7 GB. The observed resident set is roughly 2.4–3.4 GB above that estimate.
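
For reference, the estimate above can be written out as a small back-of-envelope calculation. This is only a sketch of the arithmetic using the figures quoted in this issue; the 0.3 GB graph size is an assumption standing in for "a few hundred MB", and the JVM/cache share is simply whatever remains to reach the 6–7 GB expectation.

```python
# Back-of-envelope memory budget, all values in GB, taken from this issue.
raw_vectors = 4.0      # ~4 GB of raw INT8 vectors
index = 1.0            # ~1 GB index
graph = 0.3            # ASSUMPTION: "a few hundred MB" taken as 0.3 GB
expected_low, expected_high = 6.0, 7.0   # expected total incl. JVM/caches
observed_peak = 9.4    # observed peak RSS

# Components that are itemized explicitly.
accounted = raw_vectors + index + graph

# Implied allowance for JVM/caches to reach the expected range.
jvm_low = round(expected_low - accounted, 1)
jvm_high = round(expected_high - accounted, 1)

# Gap between the observation and the expected range.
gap_low = round(observed_peak - expected_high, 1)
gap_high = round(observed_peak - expected_low, 1)

print(f"itemized components: {accounted:.1f} GB")
print(f"implied JVM/cache allowance: {jvm_low}-{jvm_high} GB")
print(f"unexplained RSS: {gap_low}-{gap_high} GB")
```

The unexplained portion is a sizable fraction of the ~4 GB raw vector size, which is why the gap looks too large to attribute to ordinary overhead alone.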

Observed

  • Peak RSS: ~9.4 GB during ingest/build/search.
  • Recall/accuracy unchanged; issue is memory footprint, not correctness.

Hypotheses

  • Search may be touching both graph and document vector pages, keeping both resident (double-loading).
  • Cache sizing/prefetch/eviction may be retaining more pages than needed.

Suggested next steps

  1. Add a lightweight counter/log on vector fetch to report source (graph vs documents) during search.
  2. Inspect page/mmap residency during search to see what is held in cache.
  3. If double-loading is confirmed, prefer a single vector source to reduce residency.
  4. Review cache sizing/eviction policy for graph/doc pages.
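
For step 2, one possible way to inspect residency from outside the process is to sum the `Rss` field per mapping in `/proc/<pid>/smaps`. The script below is a minimal sketch, assuming a Linux host and that the vector and graph data are memory-mapped files (which has not been confirmed here); the helper name `rss_by_mapping` is hypothetical, and JVM heap will show up under anonymous mappings rather than under a file name.

```python
import re
import sys
from collections import defaultdict

# Header of one region in /proc/<pid>/smaps:
#   "start-end perms offset dev inode [path]"
_HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s+\S+\s+[0-9a-f]+\s+\S+\s+\d+\s*(.*)$")
# Resident size of the region that the preceding header introduced.
_RSS = re.compile(r"^Rss:\s+(\d+)\s+kB$")


def rss_by_mapping(smaps_text):
    """Return {mapping name: resident kB}, summed over regions sharing a name."""
    totals = defaultdict(int)
    current = None
    for line in smaps_text.splitlines():
        header = _HEADER.match(line)
        if header:
            # Regions without a path (heap, arenas, etc.) are grouped together.
            current = header.group(1).strip() or "[anonymous]"
            continue
        rss = _RSS.match(line)
        if rss and current is not None:
            totals[current] += int(rss.group(1))
    return dict(totals)


if __name__ == "__main__" and len(sys.argv) > 1:
    pid = sys.argv[1]  # PID of the running JVM
    with open(f"/proc/{pid}/smaps") as f:
        totals = rss_by_mapping(f.read())
    # Print the 20 largest resident mappings.
    top = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:20]
    for name, kb in top:
        print(f"{kb / 1024:10.1f} MB  {name}")
```

Running it with the JVM's PID during search and comparing the resident size of the graph files against the document files would show whether both are held in memory at once. Pages read through ordinary file I/O live in the OS page cache and do not count toward RSS, so they will not appear in this output.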
