Description
Problem
During remote index building with GPU acceleration, OpenSearch’s k-NN plugin sends vector data to the Remote Vector Index Builder (RVIB), which builds a FAISS index (IndexHNSWFlat) and sends it back. However, this index redundantly contains the same vectors that OpenSearch already holds locally, resulting in 50–70% unnecessary network transfer overhead.
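As a rough back-of-envelope illustration of why the vectors dominate the returned payload, the sketch below compares per-vector storage against per-node graph storage. It assumes float32 components, a single graph layer with a fixed out-degree stored as 32-bit ids, and no file headers; the function name and the example values are hypothetical and are not measurements from this project. The actual share depends on the dimension and graph configuration.

```python
def vector_share(dimension: int, graph_degree: int,
                 bytes_per_component: int = 4,
                 bytes_per_neighbor_id: int = 4) -> float:
    """Fraction of per-vector index bytes taken up by the raw vector itself.

    Assumption: one graph layer with a fixed out-degree; headers are ignored.
    """
    vector_bytes = dimension * bytes_per_component
    graph_bytes = graph_degree * bytes_per_neighbor_id
    return vector_bytes / (vector_bytes + graph_bytes)


# Hypothetical configurations, chosen only to show how the share moves.
for dimension in (64, 128, 768):
    share = vector_share(dimension, graph_degree=64)
    print(f"d={dimension}: about {share:.0%} of the payload is vector data")
```

Under these assumptions the vector share grows with dimension, so higher-dimensional datasets would benefit most from sending the graph only.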
Solution
Optimize the remote index build flow by transmitting only the HNSW graph structure back from the RVIB. OpenSearch will use its own local vector data to build an IndexFlat and then reconstruct the full IndexHNSWFlat by attaching the returned graph to it.
[Design Document]
TBD
The flow of using remote index building would then change from this:
[Diagram: current remote index build flow]
To this:
[Diagram: proposed graph-only remote index build flow]
The 3 components of this solution are:
1. Flat Index Builder - in k-NN, the vector data is converted into a flat index that stores only the vectors. This flat index can later be set as the storage of the full searchable index.
2. Graph Extractor - in RVIB, the GPU service creates a GpuIndexCagra. This is converted into a graph-only IndexHNSWCagra, and then into a graph-only IndexHNSW.
3. Index Reconstructor - in k-NN, the flat index and the graph-only IndexHNSW are combined into an IndexHNSWFlat, which can then be searched.
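How the three components fit together can be sketched with a toy model in pure Python. This is not FAISS code: `FlatStore`, `GraphOnly`, `ReconstructedIndex`, and `reconstruct` are hypothetical stand-ins for the flat index, the graph-only index, and the reconstructed IndexHNSWFlat, and the search is a simple single-layer greedy walk rather than real HNSW search. The point it illustrates is that the graph holds only neighbor ids, so it is searchable only once it is paired with vectors stored in the same order.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class FlatStore:
    """Stand-in for the local flat index: vectors only, addressed by insertion order."""
    vectors: List[List[float]]


@dataclass
class GraphOnly:
    """Stand-in for the graph-only index from the builder: neighbor ids, no vectors."""
    neighbors: List[List[int]]
    entry_point: int


@dataclass
class ReconstructedIndex:
    storage: FlatStore
    graph: GraphOnly

    def search(self, query: List[float]) -> int:
        """Greedy walk: move to the closest neighbor until none is closer."""
        current = self.graph.entry_point
        best = math.dist(query, self.storage.vectors[current])
        while True:
            closest, closest_dist = current, best
            for candidate in self.graph.neighbors[current]:
                d = math.dist(query, self.storage.vectors[candidate])
                if d < closest_dist:
                    closest, closest_dist = candidate, d
            if closest == current:
                return current
            current, best = closest, closest_dist


def reconstruct(storage: FlatStore, graph: GraphOnly) -> ReconstructedIndex:
    """Pair graph and vectors; node i of the graph must refer to vector i."""
    if len(graph.neighbors) != len(storage.vectors):
        raise ValueError("graph node count does not match local vector count")
    return ReconstructedIndex(storage, graph)


# Four points on a line, linked as a chain, entering at node 0.
store = FlatStore([[0.0], [1.0], [2.0], [3.0]])
graph = GraphOnly(neighbors=[[1], [0, 2], [1, 3], [2]], entry_point=0)
index = reconstruct(store, graph)
print(index.search([2.9]))
```

The size check in `reconstruct` mirrors the requirement that the local vectors and the remotely built graph agree on vector order and count.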
Implementation Plan/Timeline
Task 1: Flat Index Builder PoC (k-NN)
Task: Build proof-of-concept for local vector storage (IndexFlat) creation via JNI
Technical Requirements:
- Retrieve vector data from OpenSearch using getKnnVectorValuesSupplier()
- Send full vector dataset from JVM to native side using a JNI function
- On native side, construct a FAISS IndexFlat with the passed vectors
- Return a native pointer to the constructed index back to the JVM
- This pointer can be passed later to the Index Reconstructor so it can access the flat index
Estimate: ✅ Completed
Task 2: Graph Extractor PoC (RVIB)
Task: Implement proof-of-concept for extracting graph-only FAISS index in RVIB
Technical Requirements:
- Modify the FAISS fork consumed by RVIB so that GpuIndexCagra is converted into a graph-only IndexHNSW
- Serialize graph-only IndexHNSW using faiss.write_index
- Upload to S3 using existing write_blob method
- Validate resulting graph-only index loads successfully
Estimate: June 27 (Fri)
Status: In Progress
Task 3: Index Reconstructor PoC (k-NN)
Task: Rebuild full IndexHNSWFlat locally using graph-only file and local IndexFlat
Technical Requirements:
- Download graph-only FAISS index from S3
- Send the graph-only file and the flat index pointer through JNI to the native side
- Load the graph-only index and the flat index into an IndexHNSWFlat
- Use FAISS controls (prepare_level_tab, init_level0=false, etc.) to preserve graph
- Serialize final index and validate correctness against full index
- Evaluate Dooyong’s proposal to avoid post-merge step (separated approach)
Dependencies:
- Flat Index Builder PoC (Task 1)
- Graph Extractor PoC (Task 2)
- Meeting planned to decide between this implementation and a version without reconstruction
Estimate: July 4 (Fri)
Status: Blocked on design decision
Task 4: Streamed Flat Index Builder Upgrade (k-NN)
Task: Refactor Flat Index Builder to support streaming vector ingestion from JVM
Technical Requirements:
- Replace current all-at-once vector JNI call with batch-based streaming
- Implement JNI methods to accept incremental vector additions
- Ensure sequential order is preserved for compatibility with FAISS graph merge
- Prevent memory overflow for large datasets
- Benchmark streaming vs. full-pass behavior
Dependencies:
- Task 1 implementation (all-at-once version) must be stable
- JNI must support safe buffer reuse and pointer tracking
Estimate: ~1.5 weeks after Task 3
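The batch-based streaming described in Task 4 can be sketched as follows. This is a pure-Python illustration, not the JNI implementation: `iter_batches` and `StreamingFlatBuilder` are hypothetical names, and `add_batch` only mimics what an incremental native call might do. It shows the two properties the requirements call out: only one batch is held at a time, and vectors are appended in their original order.

```python
from typing import Iterable, Iterator, List

Vector = List[float]


def iter_batches(vectors: Iterable[Vector], batch_size: int) -> Iterator[List[Vector]]:
    """Yield consecutive batches without materializing the whole dataset."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batch: List[Vector] = []
    for vector in vectors:
        batch.append(vector)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch


class StreamingFlatBuilder:
    """Hypothetical stand-in for a native flat index that accepts incremental adds."""

    def __init__(self, dimension: int) -> None:
        self.dimension = dimension
        self.data: List[float] = []  # contiguous row-major storage
        self.count = 0

    def add_batch(self, batch: List[Vector]) -> None:
        # Validate the whole batch first so a bad vector never leaves partial state.
        for vector in batch:
            if len(vector) != self.dimension:
                raise ValueError("vector dimension mismatch")
        for vector in batch:
            self.data.extend(vector)
        self.count += len(batch)

    def vector(self, i: int) -> Vector:
        start = i * self.dimension
        return self.data[start:start + self.dimension]


def source() -> Iterator[Vector]:
    """Lazily produced vectors, standing in for the JVM-side vector values."""
    for i in range(7):
        yield [float(i), float(i) * 2]


builder = StreamingFlatBuilder(dimension=2)
batch_sizes = []
for batch in iter_batches(source(), batch_size=3):
    batch_sizes.append(len(batch))
    builder.add_batch(batch)
```

Because ids are implied by insertion order, the position of each vector after streaming matches the position it would have had in the all-at-once version.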
Task 5: Optimizations and Validation
Task: Finalize correctness, performance, and fallback mechanisms
Technical Requirements:
- Add unit and integration tests for graph merge correctness
- Validate functional equivalence between reconstructed and full indices
- Benchmark network savings and query performance
- Add error handling and fallbacks (e.g., fallback to full index transfer)
- Tune streaming buffer sizes and clean up JNI resources
Estimate: Ongoing through dev cycle
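One way the equivalence validation and the fallback in Task 5 could be structured is sketched below. All names (`brute_force_search`, `mismatched_queries`, `build_with_fallback`) are hypothetical, and a brute-force search stands in for both the full and the reconstructed index; a real test would query the actual indices and might compare recall rather than exact id lists.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
SearchFn = Callable[[Vector, int], List[int]]


def brute_force_search(vectors: List[Vector]) -> SearchFn:
    """Return a search function over the given vectors (stand-in for an index)."""
    def search(query: Vector, k: int) -> List[int]:
        order = sorted(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))
        return order[:k]
    return search


def mismatched_queries(reference: SearchFn, candidate: SearchFn,
                       queries: List[Vector], k: int) -> List[int]:
    """Indices of queries whose top-k ids differ between the two indices."""
    return [i for i, q in enumerate(queries) if reference(q, k) != candidate(q, k)]


def build_with_fallback(reconstruct: Callable[[], object],
                        fetch_full: Callable[[], object]) -> Tuple[object, str]:
    """Try graph-based reconstruction; fall back to full index transfer on failure."""
    try:
        return reconstruct(), "reconstructed"
    except (ValueError, OSError):
        return fetch_full(), "full"


vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
queries = [[0.1, 0.0], [1.9, 2.1]]

full = brute_force_search(vectors)
reconstructed = brute_force_search(list(vectors))        # same order: should match
misordered = brute_force_search(list(reversed(vectors)))  # wrong order: should not

print(mismatched_queries(full, reconstructed, queries, k=2))
print(mismatched_queries(full, misordered, queries, k=2))


def failing_reconstruct() -> object:
    raise ValueError("graph node count does not match local vector count")


chosen, mode = build_with_fallback(failing_reconstruct, lambda: "full-index")
```

The misordered case shows the kind of failure the merge tests need to catch: a graph attached to vectors in a different order still returns results, just the wrong ones.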
PR for RVIB
Graph Extractor POC -
- In the current flow, the GPU service creates a GpuIndexCagra, which is then converted as follows: GpuIndexCagra -> IndexHNSWCagra (graph and vectors) -> IndexHNSWFlat (graph and vectors)
- The flow that sends only the graph structure would become: GpuIndexCagra -> IndexHNSWCagra (graph only) -> IndexHNSW (graph only)
- This IndexHNSW would then be serialized and uploaded to S3 the same way as in the current implementation