Open
Description
Summary
On MSMARCO-1M, vector search peaks at around 9.4 GB RSS. The raw vectors are ~4 GB; even adding ~1 GB for the index, a few hundred MB for the graph, and JVM/cache overhead, the total should land closer to ~6–7 GB. The observed resident set is roughly 2.4–3.4 GB above that estimate.
Observed
- Peak RSS: ~9.4 GB during ingest/build/search.
- Recall/accuracy unchanged; issue is memory footprint, not correctness.
Hypotheses
- Search may be touching both graph and document vector pages, keeping both resident (double-loading).
- Cache sizing/prefetch/eviction may be retaining more pages than needed.
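Both hypotheses come down to which mappings are actually resident, so one way to check them on Linux is to sum `Rss` per mapped file from `/proc/<pid>/smaps`. The following is a minimal sketch, not code from this project: the class name `SmapsResidency` and the `[anon]` label for unnamed mappings are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: aggregates resident memory per mapping from smaps text.
final class SmapsResidency {
    // Mapping header line: "start-end perms offset dev inode [path]".
    private static final Pattern HEADER = Pattern.compile(
        "^[0-9a-f]+-[0-9a-f]+\\s+\\S+\\s+\\S+\\s+\\S+\\s+\\S+\\s*(.*)$");
    // Per-mapping resident size line, e.g. "Rss:      2048 kB".
    private static final Pattern RSS = Pattern.compile("^Rss:\\s+(\\d+)\\s+kB\\s*$");

    /** Sums resident kB per mapped path from the lines of an smaps file. */
    static Map<String, Long> rssKbByPath(List<String> lines) {
        Map<String, Long> totals = new HashMap<>();
        String current = null;
        for (String line : lines) {
            Matcher header = HEADER.matcher(line);
            if (header.matches()) {
                String path = header.group(1).trim();
                // Mappings without a path (heap, anonymous mmap) share one bucket.
                current = path.isEmpty() ? "[anon]" : path;
                continue;
            }
            Matcher rss = RSS.matcher(line);
            if (current != null && rss.matches()) {
                totals.merge(current, Long.parseLong(rss.group(1)), Long::sum);
            }
        }
        return totals;
    }
}
```

From inside the JVM this could be called as `SmapsResidency.rssKbByPath(Files.readAllLines(Path.of("/proc/self/smaps")))` during a search run. If the graph files and the document vector files both show large resident totals at the same time, that would be consistent with the double-loading hypothesis.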
Suggested next steps
- Add a lightweight counter/log on vector fetch to report source (graph vs documents) during search.
- Inspect page/mmap residency during search to see what is held in cache.
- If double-loading is confirmed, prefer a single vector source to reduce residency.
- Review cache sizing/eviction policy for graph/doc pages.
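The fetch counter from the first step could look roughly like the sketch below. It is a sketch under stated assumptions: `VectorFetchStats`, its `Source` enum, and the method names are hypothetical and do not refer to existing classes in this codebase. The only real API used is `java.util.concurrent.atomic.LongAdder`, chosen because it stays cheap when many search threads increment it concurrently.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical counter: tallies vector fetches by where the vector was read from.
final class VectorFetchStats {
    /** The two sources named in the issue: graph pages vs document pages. */
    enum Source { GRAPH, DOCUMENTS }

    private final LongAdder graph = new LongAdder();
    private final LongAdder documents = new LongAdder();

    /** Call once per vector fetch, at the point where the source is known. */
    void record(Source source) {
        adderFor(source).increment();
    }

    long count(Source source) {
        return adderFor(source).sum();
    }

    /** True if searches read vectors from both places, i.e. possible double-loading. */
    boolean bothSourcesTouched() {
        return graph.sum() > 0 && documents.sum() > 0;
    }

    /** One-line summary suitable for a periodic or end-of-run log message. */
    String summary() {
        return "vector fetches: graph=" + graph.sum() + " documents=" + documents.sum();
    }

    private LongAdder adderFor(Source source) {
        return source == Source.GRAPH ? graph : documents;
    }
}
```

Logging `summary()` at the end of a search benchmark would show whether both sources are hit. A non-trivial count on both sides supports the double-loading hypothesis, while a count of zero on one side points back toward cache sizing or eviction.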