Commit 29698f0

docs: add README with ParadeDB setup and usage instructions
1 parent b853500 commit 29698f0

1 file changed, 90 additions and 0 deletions
  • llama-index-integrations/vector_stores/llama-index-vector-store-paradedb

@@ -0,0 +1,90 @@
# LlamaIndex Vector Stores Integration: ParadeDB

This module integrates ParadeDB with LlamaIndex, enabling hybrid search that combines BM25 and vector similarity (HNSW) in PostgreSQL.

---

## Quick Setup

---

### 1. **Setup example**

Run ParadeDB locally:

```bash
docker run --name paradedb \
  -e POSTGRES_USER=postgres \
  -e POSTGRES_PASSWORD=mark90 \
  -e POSTGRES_DB=postgres \
  -p 5432:5432 \
  -d paradedb/paradedb:latest
```
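Before connecting, it can help to confirm that the container is accepting connections. The following is a minimal sketch using only the Python standard library. The helper name `wait_for_port` and its default host and timeout are illustrative assumptions, not part of this integration. It only checks that the TCP port published by the Docker command (5432) is reachable, not that PostgreSQL has finished initializing. Calling `wait_for_port()` with no arguments polls `localhost:5432` for up to 30 seconds.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 5432, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Not reachable yet (refused or timed out); wait briefly and retry.
            time.sleep(0.5)
    return False
```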

---

### 2. **Usage example**

```python
import os

from dotenv import load_dotenv
from sqlalchemy import make_url
from llama_index.vector_stores.paradedb import ParadeDBVectorStore


def get_vector_store(table_name: str = "pgvector") -> ParadeDBVectorStore:
    """
    Creates and returns a new ParadeDBVectorStore instance using environment variables.
    """
    # Load variables from a local .env file, if one exists.
    load_dotenv()

    # Defaults match the Docker command above; DB_PASSWORD and EMBEDDING_DIM
    # have no default and must be set explicitly.
    host = os.getenv("DB_HOST", "localhost")
    port = os.getenv("DB_PORT", "5432")
    user = os.getenv("DB_USER", "postgres")
    password = os.environ["DB_PASSWORD"]
    database = os.getenv("DB_DATABASE", "postgres")

    connection_string = f"postgresql://{user}:{password}@{host}:{port}"
    url = make_url(connection_string)

    return ParadeDBVectorStore.from_params(
        database=database,
        host=url.host,
        password=url.password,
        port=url.port,
        user=url.username,
        table_name=table_name,
        text_search_config="english",
        hybrid_search=True,  # needed to use bm25
        use_bm25=True,
        # Must match the output dimension of your embedding model.
        embed_dim=int(os.environ["EMBEDDING_DIM"]),
        hnsw_kwargs={
            "hnsw_m": 16,
            "hnsw_ef_construction": 64,
            "hnsw_ef_search": 40,
            "hnsw_dist_method": "vector_cosine_ops",
        },
    )
```
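The usage example reads its connection settings from environment variables. Below is a minimal sketch of validating them before building the store, so that a missing value is reported by name instead of surfacing later as a connection or type error. The helper name `read_db_settings` and the returned dictionary layout are hypothetical. The variable names are the ones the usage example reads.

```python
import os
from typing import Dict, Mapping, Optional

# Environment variables read by the usage example.
REQUIRED_VARS = (
    "DB_HOST",
    "DB_PORT",
    "DB_USER",
    "DB_PASSWORD",
    "DB_DATABASE",
    "EMBEDDING_DIM",
)


def read_db_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, object]:
    """Check that all required variables are set and parse the numeric ones."""
    env = os.environ if env is None else env
    # Treat unset and empty values the same way.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "host": env["DB_HOST"],
        "port": int(env["DB_PORT"]),
        "user": env["DB_USER"],
        "password": env["DB_PASSWORD"],
        "database": env["DB_DATABASE"],
        "embed_dim": int(env["EMBEDDING_DIM"]),
    }
```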

---

### Notes

* Set `hybrid_search=True` and `use_bm25=True` to enable **hybrid BM25 + vector** retrieval.
* You **must** use the `paradedb/paradedb:latest` image, not `pgvector/pgvector`.
* The schema name defaults to `paradedb`, which enables BM25.
* Compatible with **llama-index-core** and other vector store interfaces.

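For intuition about hybrid retrieval: the BM25 search and the vector search each produce a ranked list, and those lists have to be combined into one. One common technique for this is reciprocal rank fusion (RRF). The sketch below illustrates RRF in isolation. It is not taken from this integration, which may combine results differently, and the function name, the constant `k = 60`, and the document IDs are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge ranked lists of document IDs; earlier ranks contribute larger scores."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical document IDs, purely for illustration.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
# fused == ["doc_b", "doc_a", "doc_d", "doc_c"]
```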
---

### Disclaimer

This integration is based on the Postgres Vector Store implementation, **version = "0.5.5"**.

However, **`customize_query_fn`** and other Postgres-specific query customization features are **not supported** in this ParadeDB version, as the focus here is on BM25 and hybrid retrieval.

Feel free to contribute and extend this module further.
