Introduce index serialization mode option to support serializing graph in memory #124
@rchitale7 please add benchmarking results for the change
Serializing the faiss index in memory instead of writing to disk impacts the performance of the overall remote build flow in two places: the 'write index' step and the 'upload to s3' step.
All other components of the remote build flow should remain the same. Serializing in memory also impacts CPU memory usage: we must now maintain an additional copy of the serialized graph in memory, instead of keeping it on disk. GPU memory usage should be unaffected.

To benchmark this serialization change, I used the ms-marco-384, cohere-10M-768-IP, and open-ai-1536 datasets that can be found in benchmarks.yml. I compared the performance of serializing the index in memory vs. on disk for the 'write index' and 'upload to s3' steps, and compared the peak CPU memory usage for memory vs. disk. I also validated that serializing in memory does not impact recall, and that the performance of all other aspects of the remote index build flow remained the same. I've added the benchmarking scripts I used to this PR, and this README explains how I (and anyone else) can use these scripts.

Here is a table comparing the performance and memory impacts of serializing in memory vs. on disk:
As expected, the peak CPU memory is roughly [dataset size] MB larger when the index is serialized in memory vs. stored on disk, because the additional copy of the faiss graph contains all of the vectors. There does not appear to be a performance benefit to serializing in memory for the marco and open-ai datasets. However, for the much larger cohere dataset, we get a small boost when serializing in memory: the 'write index' step is 57 seconds faster, while uploading to s3 is 23 seconds slower, for a net improvement of roughly 30 seconds. More investigation needs to be done to improve the s3 upload times when serializing in memory. One reason may be that we need to tune the multipart upload settings for
I realized we can further optimize memory usage by freeing the vectors from memory after the GPU->CPU conversion. Here is the updated memory graph:

I've updated the code to free the vectors after conversion, but before serialization. This gives us a memory impact similar to writing to disk. Here is the updated table comparing the performance and memory impacts of serializing in memory vs. on disk:
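The ordering described above can be sketched as follows. `convert_gpu_to_cpu` and `serialize` are hypothetical stand-ins (the real flow uses faiss's GPU-to-CPU conversion and index writer); the point is only that the raw vectors are dropped after conversion and before serialization, so fewer full copies of the vector data are alive at peak.

```python
import gc


def convert_gpu_to_cpu(vectors):
    # Stand-in for the GPU->CPU index conversion: the CPU index ends up
    # holding its own copy of the vector data.
    return {"index_vectors": bytes(vectors)}


def serialize(cpu_index):
    # Stand-in for streaming the CPU index into an in-memory buffer.
    return cpu_index["index_vectors"]


def build_and_serialize(vectors):
    cpu_index = convert_gpu_to_cpu(vectors)
    # Free the raw vectors *before* serialization, so the peak holds the CPU
    # index plus the serialized blob, rather than the raw vectors as well.
    del vectors
    gc.collect()
    return serialize(cpu_index)


blob = build_and_serialize(bytearray(b"example-vectors"))
```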
I've also updated the comment on the GH issue with this new graph: #97 (comment)

Description
This change introduces the option of serializing the `faiss` graph in memory, instead of writing it to disk. For background on why this can be helpful, see the issue description here: #97. For the reasoning behind the solution I took, please see the solution here: #97 (comment). TL;DR: I did not end up using `faiss.serialize_index`, due to its inefficient memory usage. Instead, I used `faiss.PyCallbackIOWriter` to stream directly to a bytes buffer.

Note that writing the graph to disk is still the default option. The consumer of this library needs to specify the
`index_storage_mode` via `IndexBuildParameters` as `IndexStorageMode.MEMORY` to serialize in memory.

Testing
I tested this change manually and verified it works. Will need to do some more benchmarking as a follow up.
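The in-memory path described above relies on faiss's callback-based writer, where serialized bytes are pushed straight into a buffer via something like `faiss.write_index(index, faiss.PyCallbackIOWriter(buffer.write))`. The sketch below models that pattern with a hypothetical stand-in serializer (`serialize_index_to` and its chunk values are invented for illustration), so it runs without faiss installed.

```python
import io


def serialize_index_to(write):
    # Hypothetical stand-in for faiss.write_index(index, writer): the
    # serializer emits the index in chunks through the write callback
    # instead of writing to a file on disk.
    for chunk in (b"IxHN", b"\x00" * 16, b"vector-data"):
        write(chunk)


def serialize_in_memory() -> bytes:
    buffer = io.BytesIO()
    # With faiss this step would be:
    #   writer = faiss.PyCallbackIOWriter(buffer.write)
    #   faiss.write_index(index, writer)
    # Each chunk lands directly in the buffer, avoiding the extra full copy
    # that faiss.serialize_index would make.
    serialize_index_to(buffer.write)
    return buffer.getvalue()


blob = serialize_in_memory()
```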
Issues Resolved
Resolves #97.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.