Restodecoca
Contributor

Description

This PR introduces ParadeDB vector store integration, extending the PostgreSQL-based store to support BM25 + vector search using ParadeDB.

This implementation is based on llama-index-vector-stores-postgres version 0.5.5, adapted for ParadeDB’s BM25.

This PR:

  • Adds full ParadeDB compatibility (schema, extensions, and BM25 index creation).
  • Supports hybrid dense + sparse (BM25) retrieval through ParadeDB’s pg_search extension.
  • Includes a new README.md describing setup, Docker usage, and configuration examples.
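
For readers unfamiliar with hybrid retrieval, one common way to merge a dense (vector) ranking with a sparse (BM25) ranking is reciprocal rank fusion. The sketch below is an illustration only and is not necessarily how this package combines results; the function name, the constant `k = 60`, and the document IDs are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into a single ranking.

    Illustrative helper (hypothetical, not part of this PR): each document
    earns 1 / (k + rank) from every list it appears in.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense_hits = ["doc_a", "doc_b", "doc_c"]  # hypothetical vector-search order
bm25_hits = ["doc_b", "doc_d", "doc_a"]   # hypothetical BM25 order
fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
print(fused)
```

Documents that rank well in both lists (here `doc_a` and `doc_b`) rise to the top, which is the behaviour hybrid search is meant to provide.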

Fixes

Fixes #
(or leave blank if this is a new feature without a linked issue)


New Package?

  • Yes: llama-index-vector-stores-paradedb
  • No

A detailed README.md was added with usage examples, setup instructions, and integration notes.
pyproject.toml also declares llama-index-vector-stores-postgres as a dependency under tool.poetry.dependencies.


Version Bump

  • Yes — bumped version to 0.1.0
  • No

Type of Change

  • New feature (non-breaking change that adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

How Has This Been Tested?

  • I added new unit tests to cover this change
  • I believe this change is already covered by existing unit tests

Suggested Checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation (README.md)
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • I have added Google Colab support for the newly added notebooks.
  • I have added tests that prove my feature works
  • New and existing unit tests pass locally with my changes
  • I ran uv run ruff check --fix . and uv run ruff format . to appease the lint gods

@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Oct 4, 2025
Collaborator

nit: should mention install command

Collaborator

@logan-markewich logan-markewich left a comment

Did this really need to copy paste 1K lines from the postgres integration vs. just subclassing it?

@Restodecoca
Contributor Author

> Did this really need to copy paste 1K lines from the postgres integration vs. just subclassing it?

Yeah, I think you're right. I'm gonna redo it by subclassing pgvector to simplify. Thanks for the feedback.
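
For context, the subclassing approach discussed in this thread could look roughly like the sketch below. It is a minimal illustration under stated assumptions: `BaseSQLVectorStore` is a hypothetical stand-in for the existing Postgres store (not the real class), and the method names, table name, and column names are invented for the example. The point is that only the sparse-search path needs overriding, while the dense path is inherited unchanged.

```python
class BaseSQLVectorStore:
    """Hypothetical stand-in for the existing Postgres store (not the real class)."""

    def __init__(self, table_name: str) -> None:
        self.table_name = table_name

    def build_dense_query(self, top_k: int) -> str:
        # Shared dense-search logic; `<=>` is pgvector's cosine-distance operator.
        return (
            f"SELECT id FROM {self.table_name} "
            f"ORDER BY embedding <=> :query_embedding LIMIT {top_k}"
        )

    def build_sparse_query(self, top_k: int) -> str:
        # Default sparse search via PostgreSQL full-text search.
        return (
            f"SELECT id FROM {self.table_name} "
            f"WHERE text_search_tsv @@ plainto_tsquery(:query_str) LIMIT {top_k}"
        )


class ParadeDBStore(BaseSQLVectorStore):
    """Overrides only the sparse path; everything else is inherited."""

    def build_sparse_query(self, top_k: int) -> str:
        # ParadeDB's pg_search extension exposes BM25 matching through `@@@`.
        return (
            f"SELECT id FROM {self.table_name} "
            f"WHERE text @@@ :query_str LIMIT {top_k}"
        )


store = ParadeDBStore("docs")  # hypothetical table name
print(store.build_dense_query(5))   # inherited from the base class
print(store.build_sparse_query(5))  # ParadeDB-specific override
```

Compared with copying the base implementation, this keeps the ParadeDB-specific code limited to the methods that actually differ, so upstream fixes to the Postgres store are picked up automatically.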
