@sobychacko
Contributor

…e inserts

Resolves #1199

  • Implement configurable maxDocumentBatchSize to prevent insert timeouts when adding large numbers of documents
  • Update PgVectorStore to process document inserts in controlled batches
  • Add maxDocumentBatchSize property to PgVectorStoreProperties
  • Update PgVectorStoreAutoConfiguration to use the new batching property
  • Add tests to verify batching behavior and performance

This change fixes PgVectorStore inserts timing out when a large number of documents is added in a single call. With configurable batching, users can cap the number of documents written per insert via maxDocumentBatchSize, which avoids timeouts while maintaining performance and reducing memory overhead for large-scale document additions.
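For readers unfamiliar with the approach, the batching idea can be sketched as follows. This is a minimal illustration and not the code merged in this PR: `BatchingSketch`, `partition`, and `insertInBatches` are hypothetical names, and the `Consumer` callback stands in for whatever performs one batched insert against the database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration only; not the PgVectorStore implementation.
final class BatchingSketch {

    // Splits items into consecutive sublists of at most maxBatchSize elements.
    static <T> List<List<T>> partition(List<T> items, int maxBatchSize) {
        if (maxBatchSize <= 0) {
            throw new IllegalArgumentException("maxBatchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += maxBatchSize) {
            int end = Math.min(start + maxBatchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    // Hands each batch to insertBatch (a stand-in for one batched INSERT)
    // and returns the number of batches processed.
    static <T> int insertInBatches(List<T> items, int maxBatchSize, Consumer<List<T>> insertBatch) {
        List<List<T>> batches = partition(items, maxBatchSize);
        batches.forEach(insertBatch);
        return batches.size();
    }

    public static void main(String[] args) {
        List<String> docs = List.of("doc-1", "doc-2", "doc-3", "doc-4", "doc-5");
        int count = insertInBatches(docs, 2, batch -> System.out.println("inserting " + batch));
        System.out.println(count + " batches");
    }
}
```

Keeping each batch bounded means no single statement has to carry the whole document set, which is the property the description above relies on to avoid timeouts.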

@markpollack
Member

merged in 202148d

@markpollack added this to the 1.0.0-M3 milestone Sep 24, 2024

Successfully merging this pull request may close these issues.

When too much data is imported, timeouts may easily occur when executing the embedding model.
