Brief overview
Extend the restheart-ai plugin (Phase 1) with pluggable embedding and reranking providers backed by LangChain4j, enabling vector search workflows on any MongoDB instance — without relying on MongoDB's autoEmbed or the Atlas Reranking API.
Rationale
Phase 1 of restheart-ai leverages MongoDB's native autoEmbed for embedding generation and the Atlas Reranking API for reranking. This works well but has limitations:
- autoEmbed uses only Voyage AI models — no choice of embedding model or provider
- The Atlas Reranking API is available on Atlas only
- Some deployments need local/on-premise embedding (e.g., for data privacy or air-gapped environments)
- Some use cases require specific embedding models (e.g., domain-specific fine-tuned models)
Phase 2 adds pluggable providers that let RESTHeart generate embeddings and perform reranking using any LangChain4j-supported backend, while remaining fully backward-compatible with Phase 1.
Detailed documentation
Phase 2: Pluggable Providers
What Phase 2 adds on top of Phase 1
| Feature | Phase 1 | Phase 2 |
|---|---|---|
| Embedding generation | MongoDB autoEmbed | Pluggable: OpenAI, Voyage AI, AWS Bedrock, ONNX, Ollama, or custom |
| Query vectorization | MongoDB queryString | $vectorize operator (text → vector inline in pipelines) |
| Auto-embedding on write | MongoDB autoEmbed | AutoEmbeddingInterceptor (RESTHeart generates embeddings on POST/PUT/PATCH) |
| Document chunking | Text extraction + storage (MongoDB embeds via autoEmbed) | Text extraction + embedding generation + storage (RESTHeart embeds each chunk) |
| Reranking | Atlas Reranking API only | Pluggable: Cohere, Voyage AI (Atlas), ONNX, or custom |
Embedding Providers
Five out-of-the-box embedding providers, all backed by LangChain4j:
| Provider | Backend | Default Model | Type |
|---|---|---|---|
| openAIEmbeddingProvider | OpenAI API | text-embedding-3-small (1536d) | Cloud |
| voyageEmbeddingProvider | Voyage AI API | voyage-3.5 | Cloud |
| bedrockEmbeddingProvider | AWS Bedrock | amazon.titan-embed-text-v2:0 (1024d) | Cloud (AWS) |
| onnxEmbeddingProvider | ONNX AllMiniLmL6V2 | AllMiniLmL6V2 (384d) | Local, in-process |
| ollamaEmbeddingProvider | Ollama server | nomic-embed-text | Local server |
Each provider is a Provider<EmbeddingProvider> registered with @RegisterPlugin. Custom providers can be added by implementing the same interface.
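As a rough, self-contained sketch of what a custom provider looks like — note the interface below is a simplified stand-in for the plugin's actual EmbeddingProvider, and ToyEmbeddingProvider is purely hypothetical (a real custom provider would delegate to a LangChain4j EmbeddingModel and be registered with @RegisterPlugin):

```java
// Simplified stand-in for restheart-ai's EmbeddingProvider interface.
interface EmbeddingProvider {
    float[] embed(String text); // text -> embedding vector
}

// Hypothetical custom provider: hashes whitespace-separated tokens
// into a tiny fixed-size vector. A real implementation would call a
// LangChain4j EmbeddingModel instead.
final class ToyEmbeddingProvider implements EmbeddingProvider {
    private static final int DIMS = 8;

    @Override
    public float[] embed(String text) {
        float[] v = new float[DIMS];
        for (String token : text.toLowerCase().split("\\s+")) {
            v[Math.floorMod(token.hashCode(), DIMS)] += 1.0f;
        }
        return v;
    }
}
```

The key point is that the embedding contract is a single text-to-vector function; any backend that can satisfy it can be plugged in.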
Reranking Providers
Three out-of-the-box reranking providers, backed by LangChain4j's ScoringModel interface:
| Provider | Backend | Default Model | Type |
|---|---|---|---|
| cohereRerankProvider | Cohere API | rerank-v3.5 | Cloud |
| voyageRerankProvider | Atlas Voyage AI API | rerank-2.5 | Cloud (Atlas) |
| onnxRerankProvider | ONNX cross-encoder | MiniLM-L6-v2-cross-encoder | Local, in-process |
LangChain4j also supports additional reranking backends (Jina, Google Vertex AI, Xinference, watsonx.ai) that can be added as custom providers.
Configuration
```yaml
plugins-args:
  aiActivator:
    embedding-provider: openAIEmbeddingProvider # plugin name of the chosen embedding provider
    rerank-provider: cohereRerankProvider       # optional, plugin name of the chosen reranking provider
    atlas-api-key: "..."                        # optional, kept from Phase 1 for Atlas reranking

  # --- Embedding providers (choose one) ---
  openAIEmbeddingProvider:
    api-key: "sk-proj-..."
    model: text-embedding-3-small # optional, this is the default
  # -- OR --
  voyageEmbeddingProvider:
    api-key: "pa-..."
    model: voyage-3.5 # optional, this is the default
  # -- OR --
  bedrockEmbeddingProvider:
    region: us-east-1 # optional, defaults to us-east-1
    model: amazon.titan-embed-text-v2:0 # optional, this is the default
    # uses the default AWS credential chain (env vars, ~/.aws/credentials, IAM role, etc.)
  # -- OR --
  onnxEmbeddingProvider: {} # no configuration needed
  # -- OR --
  ollamaEmbeddingProvider:
    base-url: "http://localhost:11434" # optional, this is the default
    model: nomic-embed-text # optional, this is the default

  # --- Reranking providers (choose one, optional) ---
  cohereRerankProvider:
    api-key: "..."
    model: rerank-v3.5 # optional, this is the default
  # -- OR --
  voyageRerankProvider:
    api-key: "..." # Atlas Model API key
    model: rerank-2.5 # optional, this is the default
  # -- OR --
  onnxRerankProvider: {} # no configuration needed
```

Feature 1: Custom Operator Registry in VarsInterpolator
A general-purpose extension mechanism in RESTHeart core: VarsInterpolator gains a static registry of custom operators (registerOperator(String name, Function<BsonValue, BsonValue>)). During the existing recursive BSON walk, after resolving $var, the interpolator checks for registered custom operators and delegates resolution to the registered function.
This is not AI-specific — it is a general extension point that any plugin can use to add custom BSON operators resolved during aggregation pipeline interpolation.
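The registry idea can be sketched in a few lines. This is a self-contained illustration, not the real VarsInterpolator: the actual implementation walks BsonValue trees, while plain Maps and Lists stand in for BSON here, and all names besides registerOperator are hypothetical:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch of a static custom-operator registry with a recursive walk.
final class CustomOperators {
    private static final Map<String, Function<Object, Object>> REGISTRY =
            new ConcurrentHashMap<>();

    static void registerOperator(String name, Function<Object, Object> resolver) {
        REGISTRY.put(name, resolver);
    }

    // Recursively replace any { "$op": arg } node whose key is a
    // registered operator with resolver.apply(resolved arg).
    static Object interpolate(Object node) {
        if (node instanceof Map<?, ?> m) {
            if (m.size() == 1) {
                var e = m.entrySet().iterator().next();
                var resolver = REGISTRY.get(String.valueOf(e.getKey()));
                if (resolver != null) {
                    // resolve the argument first (e.g. a nested $var)
                    return resolver.apply(interpolate(e.getValue()));
                }
            }
            Map<Object, Object> out = new LinkedHashMap<>();
            m.forEach((k, v) -> out.put(k, interpolate(v)));
            return out;
        }
        if (node instanceof List<?> l) {
            List<Object> out = new ArrayList<>();
            l.forEach(v -> out.add(interpolate(v)));
            return out;
        }
        return node; // scalars pass through unchanged
    }
}
```

A plugin would call registerOperator once at startup; the interpolator then resolves the operator wherever it appears in a pipeline.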
Feature 2: $vectorize Operator
The $vectorize operator converts text to an embedding vector inline in aggregation pipelines. The restheart-ai plugin registers $vectorize at startup via the custom operator registry — the core has no knowledge of embeddings or LangChain4j.
Example:
```http
# Create a standard vector search index (not autoEmbed)
PUT /mydb/articles/_indexes/article_vectors
Content-Type: application/json

{ "type": "vectorSearch",
  "definition": { "fields": [
    { "type": "vector", "path": "embedding", "numDimensions": 1536, "similarity": "cosine" }
]}}

# Define aggregation using $vectorize
PUT /mydb/articles
Content-Type: application/json

{ "aggrs": [{ "uri": "search", "type": "pipeline", "stages": [
  { "$vectorSearch": {
      "index": "article_vectors",
      "path": "embedding",
      "queryVector": { "$vectorize": { "$var": "query" } },
      "numCandidates": 100,
      "limit": 10
  }},
  { "$project": { "embedding": 0 } }
]}]}

# Execute — client passes text, RESTHeart generates the vector
GET /mydb/articles/_aggrs/search?avars={"query":"machine learning for NLP"}
```

RESTHeart resolves $var first (producing the text string), then calls the registered $vectorize function (which invokes the configured embedding provider), and replaces { "$vectorize": ... } with a float array before sending the pipeline to MongoDB.
Feature 3: Auto-Embedding on Write
When a collection has vectorSearch metadata with textField and embeddingField, the AutoEmbeddingInterceptor automatically generates embeddings for documents on POST, PUT, and PATCH requests using the configured embedding provider.
Enable auto-embedding:
```http
PATCH /mydb/articles
Content-Type: application/json

{ "vectorSearch": { "textField": "description", "embeddingField": "embedding" } }
```

Write a document — the embedding is generated automatically:

```http
POST /mydb/articles
Content-Type: application/json

{ "title": "My Article", "description": "An introduction to vector search in MongoDB" }
```

The stored document will include an embedding field containing the float array generated by the configured provider.
Feature 4: Document Chunking with Embedding
Phase 1's document chunking stores text segments without embeddings (relying on autoEmbed indexes). Phase 2 extends the DocumentChunkingInterceptor to optionally generate embeddings for each chunk using the configured embedding provider.
When an embedding-provider is configured in aiActivator, the interceptor generates and stores embeddings alongside each text segment. When no embedding provider is configured, the behavior remains the same as Phase 1 (text-only segments, embeddings delegated to MongoDB autoEmbed).
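The chunk-embedding step can be sketched as follows. This is an illustrative simplification: the real logic lives in DocumentChunkingInterceptor, real chunking strategies are more sophisticated than fixed word windows, and every name below is hypothetical:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Sketch: split extracted text into segments and embed each one.
final class ChunkingSketch {
    interface Embedder { float[] embed(String text); }

    static List<Map<String, Object>> chunkAndEmbed(String text,
                                                   int wordsPerChunk,
                                                   Embedder embedder) {
        String[] words = text.split("\\s+");
        List<Map<String, Object>> segments = new ArrayList<>();
        for (int start = 0; start < words.length; start += wordsPerChunk) {
            int end = Math.min(start + wordsPerChunk, words.length);
            String chunk = String.join(" ", Arrays.copyOfRange(words, start, end));
            float[] vector = embedder.embed(chunk);
            List<Float> asList = new ArrayList<>(vector.length);
            for (float f : vector) asList.add(f);
            segments.add(Map.of(
                    "text", chunk,
                    "vector", asList,
                    "metadata", Map.of("index", start / wordsPerChunk)));
        }
        return segments;
    }
}
```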
Each text segment is stored as:

```json
{
  "_id": { "$oid": "..." },
  "text": "chunk text content...",
  "vector": [0.123, -0.456, 0.789, ...],
  "metadata": {
    "index": 0,
    "fileId": { "$oid": "..." },
    "filename": "article.pdf",
    "tags": ["public", "tech"]
  }
}
```

Feature 5: Pluggable Reranking
Phase 1's reranking calls the Atlas Reranking API directly. Phase 2 makes reranking pluggable: when a rerank-provider is configured in aiActivator, the reranking interceptor uses the configured LangChain4j-backed provider instead of the Atlas API.
This enables reranking on any MongoDB deployment — not just Atlas — and allows choosing between different reranking backends (Cohere, ONNX local, etc.).
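Whatever the backend, reranking reduces to scoring (query, document) pairs and reordering candidates by score — which is what LangChain4j's ScoringModel abstracts. A minimal self-contained sketch with hypothetical names (a real provider would delegate scoring to Cohere, Voyage AI, or an ONNX cross-encoder):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch of the pluggable reranking step.
final class RerankSketch {
    // Stand-in for a LangChain4j-backed scoring provider.
    interface RerankProvider { double score(String query, String text); }

    static List<String> rerank(RerankProvider provider,
                               String query,
                               List<String> candidates) {
        List<String> out = new ArrayList<>(candidates);
        // highest relevance score first
        out.sort(Comparator.comparingDouble(
                (String d) -> provider.score(query, d)).reversed());
        return out;
    }
}
```

Because the provider only needs to produce a score per pair, swapping Cohere for a local ONNX cross-encoder changes configuration, not the reranking flow.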
Backward Compatibility with Phase 1
Phase 2 is fully backward-compatible:
- Without embedding-provider configured: $vectorize is not registered, auto-embedding is disabled, and chunking stores text-only segments → behaves exactly like Phase 1
- Without rerank-provider configured: reranking uses the Atlas API via atlas-api-key → behaves exactly like Phase 1
- autoEmbed indexes and queryString continue to work as before