# pgEdge Vectorizer

pgEdge Vectorizer is a PostgreSQL extension that automatically chunks text
content and generates vector embeddings using background workers. Vectorizer
provides seamless integration between your PostgreSQL database and embedding
providers like OpenAI, making it easy to build AI-powered search and retrieval
applications.

pgEdge Vectorizer:

- intelligently splits text into optimal-sized chunks.
- handles embedding generation asynchronously using background workers
  without blocking.
- enables easy switching between OpenAI, Voyage AI, and Ollama.
- processes embeddings efficiently in batches for better API usage.
- automatically retries failed operations with exponential backoff (see the
  sketch after this list).
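
As an illustration of the backoff idea, the following plain-SQL sketch shows
one way a failed queue item could be rescheduled with a doubling delay. The
table and column names (`vectorizer_queue`, `attempts`, `next_attempt_at`) are
hypothetical and are not taken from the extension's actual schema.

```sql
-- Hypothetical queue table; pgEdge Vectorizer's real schema may differ.
-- Reschedule a failed item: double the delay on each attempt, capped at
-- 1024 seconds (roughly 17 minutes).
UPDATE vectorizer_queue
SET attempts        = attempts + 1,
    next_attempt_at = now()
                      + least(power(2, attempts), 1024) * interval '1 second'
WHERE id = 42;  -- the failed queue item
```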

## pgEdge Vectorizer Architecture

pgEdge Vectorizer uses a trigger-based architecture with background workers to
process text asynchronously. The following steps describe the processing flow
from data insertion to embedding storage:

1. A trigger detects INSERT or UPDATE operations on the configured table.
2. The chunking module splits the text into chunks using the configured
   strategy.
3. The system inserts chunk records and queue items into the processing
   queue.
4. Background workers pick up queue items using SKIP LOCKED for concurrent
   processing (see the sketch after this list).
5. The configured provider generates embeddings via its API.
6. The storage layer updates the chunk table with the generated embeddings.
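
To make step 4 concrete, here is a minimal sketch of how a worker can claim
queue items with PostgreSQL's `FOR UPDATE SKIP LOCKED`. The table and column
names (`vectorizer_queue`, `status`, `chunk_id`, `created_at`) are assumptions
for illustration and do not necessarily match the extension's actual schema.

```sql
-- Hypothetical queue table; the extension's real table and columns may differ.
BEGIN;

-- Claim a batch of pending items. SKIP LOCKED makes concurrent workers skip
-- rows already locked by another worker instead of blocking on them.
SELECT id, chunk_id
FROM vectorizer_queue
WHERE status = 'pending'
ORDER BY created_at
LIMIT 10
FOR UPDATE SKIP LOCKED;

-- The worker would then call the embedding provider for the claimed chunks,
-- write the vectors to the chunk table, and mark the items as processed.

COMMIT;
```

Because each worker locks only the rows it claims, multiple workers can drain
the queue in parallel without contending for the same items.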