
[FEATURE] Remove Redundant Vector Transfer in Remote Index Building for k-NN #94

@DNimmala5

Problem

During remote index building with GPU acceleration, OpenSearch’s k-NN plugin sends vector data to the Remote Vector Index Builder (RVIB), which builds a FAISS index (IndexHNSWFlat) and sends it back. However, this index redundantly contains the same vectors that OpenSearch already holds locally, resulting in 50–70% unnecessary network transfer overhead.
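As a rough back-of-the-envelope check on where that overhead comes from, the sketch below estimates what fraction of the per-vector payload is the raw vector rather than the graph links. The dimension, link count, and 4-byte sizes are illustrative assumptions, not measurements from this issue. It also ignores upper HNSW levels and file headers, so the real share will vary with the index configuration.

```python
def vector_share(dim: int, links_per_node: int,
                 bytes_per_float: int = 4, bytes_per_link: int = 4) -> float:
    """Fraction of per-vector index bytes taken by the raw vector itself."""
    vector_bytes = dim * bytes_per_float        # float32 components
    graph_bytes = links_per_node * bytes_per_link  # neighbor ids on the base level
    return vector_bytes / (vector_bytes + graph_bytes)


# Hypothetical configuration: 128-dimensional vectors, 64 links per node.
share = vector_share(dim=128, links_per_node=64)
print(f"vectors account for about {share:.0%} of the per-vector payload")
```

Higher dimensions push the share up and denser graphs push it down, which is why the savings are expected to differ between workloads.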

Solution

Optimize the remote index build flow by transmitting only the HNSW graph structure back from the RVIB. OpenSearch will use its own local vector data to build a flat index (IndexFlat) and then reconstruct the full IndexHNSWFlat by combining that flat index with the returned graph.

[Design Document]
TBD

The flow of using remote index building would then change from this:

Image

To this:

Image

The 3 components of this solution are:

1. Flat Index Builder - in k-NN, vector data will be converted into a flat index that will store vector data. This flat index can be set into the full searchable index later.
2. Graph Extractor - in RVIB, the GPU service creates a GpuIndexCagra. This will be converted into a graph-only IndexHNSWCagra, and then into a graph-only IndexHNSW.
3. Index Reconstructor - in k-NN, the flat index and the graph-only IndexHNSW will be combined into an IndexHNSWFlat, which can then be searched.
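To make the division of labour concrete, here is a toy model of the three components in pure Python. It is not FAISS code: the function names and the list-of-neighbor-lists graph format are hypothetical, and the search is a simple greedy walk rather than real HNSW search. The point is only that a graph built elsewhere becomes searchable once it is paired with the vectors held locally.

```python
import math


def build_flat(vectors):
    """Flat Index Builder: keep the local vectors in insertion order."""
    return [tuple(v) for v in vectors]


def reconstruct(flat, graph):
    """Index Reconstructor: pair local vectors with a remotely built graph."""
    if len(flat) != len(graph):
        raise ValueError("graph and flat index disagree on vector count")
    return {"vectors": flat, "neighbors": graph}


def greedy_search(index, query, entry=0):
    """Walk the graph toward the query, reading distances from local vectors."""
    current = entry
    best = math.dist(index["vectors"][current], query)
    improved = True
    while improved:
        improved = False
        for n in index["neighbors"][current]:
            d = math.dist(index["vectors"][n], query)
            if d < best:
                current, best, improved = n, d, True
    return current


# Vectors stay local; only the neighbor lists stand in for the RVIB response.
flat = build_flat([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)])
graph_from_rvib = [[1], [0, 2], [1, 3], [2]]
index = reconstruct(flat, graph_from_rvib)
nearest = greedy_search(index, (2.9, 0.0))
```

Node ids in the graph index directly into the local vector list, so the two sides must agree on vector order.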

Implementation Plan/Timeline

Task 1: Flat Index Builder PoC (k-NN)

Task: Build proof-of-concept for local vector storage (IndexFlat) creation via JNI

Technical Requirements:

  1. Retrieve vector data from OpenSearch using getKnnVectorValuesSupplier()
  2. Send full vector dataset from JVM to native side using a JNI function
  3. On native side, construct a FAISS IndexFlat with the passed vectors
  4. Return a native pointer to the constructed index back to the JVM
  5. This pointer can be passed later to the Index Reconstructor so it can access the flat index

Estimate: ✅ Completed
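The PoC code is not included in this issue, so the following is only a sketch of the data layout that requirements 2 and 3 imply: the vectors are flattened into one contiguous row-major float32 buffer, which is the shape a flat index consumes. The helper names are hypothetical, and Python's array module stands in for the buffer that would cross the JNI boundary.

```python
from array import array


def pack_vectors(vectors, dim):
    """Flatten vectors into one contiguous float32 buffer, row-major."""
    buf = array("f")
    for v in vectors:
        if len(v) != dim:
            raise ValueError(f"expected {dim} dimensions, got {len(v)}")
        buf.extend(v)
    return buf


def vector_at(buf, dim, i):
    """Read vector i back out of the flat buffer."""
    return list(buf[i * dim:(i + 1) * dim])


vectors = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
buf = pack_vectors(vectors, dim=2)
second = vector_at(buf, dim=2, i=1)
```

Vector i lives at offset i * dim, so the position in the buffer is the only id the native side needs.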

Task 2: Graph Extractor PoC (RVIB)

Task: Implement proof-of-concept for extracting graph-only FAISS index in RVIB

Technical Requirements:

  1. Modify the FAISS fork consumed by RVIB so that GpuIndexCagra is converted into a graph-only IndexHNSW
  2. Serialize graph-only IndexHNSW using faiss.write_index
  3. Upload to S3 using existing write_blob method
  4. Validate resulting graph-only index loads successfully

Estimate: June 27 (Fri)
Status: In Progress
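One possible shape for the validation in requirement 4, beyond confirming that the file deserializes, is a structural sanity check on the neighbor lists. The sketch below is an assumption about what such a check could look like, not part of the PoC: it works on a plain list of neighbor lists (a hypothetical in-memory view of the graph) and treats -1 as an empty neighbor slot.

```python
def check_graph(neighbors, expected_count):
    """Return a list of problems found in a graph-only neighbor table."""
    problems = []
    if len(neighbors) != expected_count:
        problems.append(
            f"graph has {len(neighbors)} nodes, expected {expected_count}"
        )
    for node, links in enumerate(neighbors):
        for target in links:
            if target == -1:  # assumed padding value for an unused slot
                continue
            if not 0 <= target < len(neighbors):
                problems.append(f"node {node} links to out-of-range id {target}")
            elif target == node:
                problems.append(f"node {node} links to itself")
    return problems


ok = check_graph([[1, -1], [0, 2], [1, -1]], expected_count=3)
bad = check_graph([[5]], expected_count=1)
```

An empty result means every link points at a vector that the k-NN side will be able to resolve locally.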

Task 3: Index Reconstructor PoC (k-NN)

Task: Rebuild full IndexHNSWFlat locally using graph-only file and local IndexFlat

Technical Requirements:

  1. Download graph-only FAISS index from S3
  2. Send the graph-only file and the flat index pointer through JNI to the native side
  3. Load the graph-only index and the flat index into an IndexHNSWFlat
  4. Use FAISS controls (prepare_level_tab, init_level0=false, etc.) to preserve graph
  5. Serialize final index and validate correctness against full index
  6. Evaluate Dooyong’s proposal to avoid post-merge step (separated approach)

Dependencies:

  • Flat Index Builder PoC (Task 1)
  • Graph Extractor PoC (Task 2)
  • Meeting planned to decide between this version of implementation or a version without reconstruction

Estimate: July 4 (Fri)
Status: Blocked on design decision

Task 4: Streamed Flat Index Builder Upgrade (k-NN)

Task: Refactor Flat Index Builder to support streaming vector ingestion from JVM

Technical Requirements:

  1. Replace current all-at-once vector JNI call with batch-based streaming
  2. Implement JNI methods to accept incremental vector additions
  3. Ensure sequential order is preserved for compatibility with FAISS graph merge
  4. Prevent memory overflow for large datasets
  5. Benchmark streaming vs. full-pass behavior

Dependencies:

  • Task 1 implementation (all-at-once version) must be stable
  • JNI must support safe buffer reuse and pointer tracking

Estimate: ~1.5 weeks after Task 3
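A minimal sketch of the batching idea, assuming a fixed batch size and a hypothetical sink object that stands in for the native flat index. Only one batch is held in memory at a time, and rows are appended in arrival order so that positions still line up with the node ids in the graph.

```python
from typing import Iterable, Iterator, List, Sequence


def batched(vectors: Iterable[Sequence[float]],
            batch_size: int) -> Iterator[List[Sequence[float]]]:
    """Yield consecutive batches without materializing the whole dataset."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batch: List[Sequence[float]] = []
    for v in vectors:
        batch.append(v)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch


class FlatIndexSink:
    """Stand-in for the native flat index; appends batches in arrival order."""

    def __init__(self) -> None:
        self.rows: List[Sequence[float]] = []

    def add_batch(self, batch: List[Sequence[float]]) -> int:
        self.rows.extend(batch)
        return len(self.rows)  # running total, useful for progress checks


sink = FlatIndexSink()
source = ([float(i), float(i) * 2] for i in range(10))  # lazy vector source
for batch in batched(source, batch_size=4):
    sink.add_batch(batch)
```

The batch size is the tuning knob: larger batches mean fewer boundary crossings, smaller ones cap peak memory.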

Task 5: Optimizations and Validation

Task: Finalize correctness, performance, and fallback mechanisms

Technical Requirements:

  1. Add unit and integration tests for graph merge correctness
  2. Validate functional equivalence between reconstructed and full indices
  3. Benchmark network savings and query performance
  4. Add error handling and fallbacks (e.g., fallback to full index transfer)
  5. Tune streaming buffer sizes and clean up JNI resources

Estimate: Ongoing through dev cycle
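For the fallback in requirement 4, one simple arrangement is to attempt the graph-only path first and revert to the existing full index transfer if reconstruction fails. The sketch below assumes both paths are available as callables. The function and parameter names are hypothetical and not taken from the k-NN codebase.

```python
def build_with_fallback(reconstruct_from_graph, download_full_index, log=print):
    """Try graph-only reconstruction; fall back to full index transfer on error."""
    try:
        return reconstruct_from_graph(), "graph-only"
    except Exception as exc:
        log(f"graph-only reconstruction failed ({exc}); "
            "falling back to full index transfer")
        return download_full_index(), "full-transfer"
```

Returning the path taken alongside the index makes it easy to count fallbacks when benchmarking network savings.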

PR for RVIB

Graph Extractor POC -
- Current flow: the GPU service creates a GpuIndexCagra, which is then converted as follows: GpuIndexCagra -> IndexHNSWCagra (graph and vectors) -> IndexHNSWFlat (graph and vectors)
- Graph-only flow: GpuIndexCagra -> IndexHNSWCagra (graph only) -> IndexHNSW (graph only)
- This IndexHNSW would then be serialized and uploaded to S3 the same way as in the current implementation

Linked k-NN repo issue
