restheart-ai — pluggable embedding and reranking providers via LangChain4j #591

@ujibang

Description

Brief overview

Extend the restheart-ai plugin (Phase 1) with pluggable embedding and reranking providers backed by LangChain4j, enabling vector search workflows on any MongoDB instance — without relying on MongoDB's autoEmbed or the Atlas Reranking API.

Rationale

Phase 1 of restheart-ai leverages MongoDB's native autoEmbed for embedding generation and the Atlas Reranking API for reranking. This works well but has limitations:

  • autoEmbed uses only Voyage AI models — no choice of embedding model or provider
  • The Atlas Reranking API is available on Atlas only
  • Some deployments need local/on-premise embedding (e.g., for data privacy or air-gapped environments)
  • Some use cases require specific embedding models (e.g., domain-specific fine-tuned models)

Phase 2 adds pluggable providers that let RESTHeart generate embeddings and perform reranking using any LangChain4j-supported backend, while remaining fully backward-compatible with Phase 1.

Detailed documentation

Phase 2: Pluggable Providers

What Phase 2 adds on top of Phase 1

| Feature | Phase 1 | Phase 2 |
| --- | --- | --- |
| Embedding generation | MongoDB autoEmbed | Pluggable: OpenAI, Voyage AI, AWS Bedrock, ONNX, Ollama, or custom |
| Query vectorization | MongoDB queryString | `$vectorize` operator (text → vector inline in pipelines) |
| Auto-embedding on write | MongoDB autoEmbed | AutoEmbeddingInterceptor (RESTHeart generates embeddings on POST/PUT/PATCH) |
| Document chunking | Text extraction + storage (MongoDB embeds via autoEmbed) | Text extraction + embedding generation + storage (RESTHeart embeds each chunk) |
| Reranking | Atlas Reranking API only | Pluggable: Cohere, Voyage AI (Atlas), ONNX, or custom |

Embedding Providers

Five out-of-the-box embedding providers, all backed by LangChain4j:

| Provider | Backend | Default Model | Type |
| --- | --- | --- | --- |
| openAIEmbeddingProvider | OpenAI API | text-embedding-3-small (1536d) | Cloud |
| voyageEmbeddingProvider | Voyage AI API | voyage-3.5 | Cloud |
| bedrockEmbeddingProvider | AWS Bedrock | amazon.titan-embed-text-v2:0 (1024d) | Cloud (AWS) |
| onnxEmbeddingProvider | ONNX | AllMiniLmL6V2 (384d) | Local, in-process |
| ollamaEmbeddingProvider | Ollama server | nomic-embed-text | Local server |

Each provider is a Provider<EmbeddingProvider> registered with @RegisterPlugin. Custom providers can be added by implementing the same interface.
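As a sketch of what implementing the extension point involves, the following is a minimal, self-contained illustration. The `EmbeddingProvider` interface shape shown here is an assumption for illustration only (the actual restheart-ai contract, the `@RegisterPlugin` wiring, and the LangChain4j-backed implementation are not reproduced):

```java
// Illustrative only: an assumed minimal shape of the embedding SPI.
// A real custom provider would delegate to a LangChain4j EmbeddingModel
// and be registered with @RegisterPlugin; this toy version is deterministic
// so it can run anywhere.
interface EmbeddingProvider {
    float[] embed(String text);   // text -> embedding vector
    int dimensions();             // size of the vectors this provider emits
}

class ToyEmbeddingProvider implements EmbeddingProvider {
    private static final int DIMS = 8;

    @Override
    public float[] embed(String text) {
        // Fold character codes into a fixed-size vector (stand-in for a model call)
        var v = new float[DIMS];
        for (int i = 0; i < text.length(); i++) {
            v[i % DIMS] += text.charAt(i);
        }
        return v;
    }

    @Override
    public int dimensions() {
        return DIMS;
    }
}
```

The key property a custom provider must guarantee is a fixed output dimensionality that matches the `numDimensions` of the vector search index it feeds.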

Reranking Providers

Three out-of-the-box reranking providers, backed by LangChain4j's ScoringModel interface:

| Provider | Backend | Default Model | Type |
| --- | --- | --- | --- |
| cohereRerankProvider | Cohere API | rerank-v3.5 | Cloud |
| voyageRerankProvider | Atlas Voyage AI API | rerank-2.5 | Cloud (Atlas) |
| onnxRerankProvider | ONNX cross-encoder | MiniLM-L6-v2-cross-encoder | Local, in-process |

LangChain4j also supports additional reranking backends (Jina, Google Vertex AI, Xinference, watsonx.ai) that can be added as custom providers.

Configuration

```yml
plugins-args:
  aiActivator:
    embedding-provider: openAIEmbeddingProvider     # plugin name of the chosen embedding provider
    rerank-provider: cohereRerankProvider           # optional, plugin name of the chosen reranking provider
    atlas-api-key: "..."                            # optional, kept from Phase 1 for Atlas reranking

  # --- Embedding providers (choose one) ---

  openAIEmbeddingProvider:
    api-key: "sk-proj-..."
    model: text-embedding-3-small                   # optional, this is the default

  # -- OR --
  voyageEmbeddingProvider:
    api-key: "pa-..."
    model: voyage-3.5                               # optional, this is the default

  # -- OR --
  bedrockEmbeddingProvider:
    region: us-east-1                               # optional, defaults to us-east-1
    model: amazon.titan-embed-text-v2:0             # optional, this is the default
    # uses the default AWS credential chain (env vars, ~/.aws/credentials, IAM role, etc.)

  # -- OR --
  onnxEmbeddingProvider: {}                         # no configuration needed

  # -- OR --
  ollamaEmbeddingProvider:
    base-url: "http://localhost:11434"              # optional, this is the default
    model: nomic-embed-text                         # optional, this is the default

  # --- Reranking providers (choose one, optional) ---

  cohereRerankProvider:
    api-key: "..."
    model: rerank-v3.5                              # optional, this is the default

  # -- OR --
  voyageRerankProvider:
    api-key: "..."                                  # Atlas Model API key
    model: rerank-2.5                               # optional, this is the default

  # -- OR --
  onnxRerankProvider: {}                            # no configuration needed
```

Feature 1: Custom Operator Registry in VarsInterpolator

A general-purpose extension mechanism in RESTHeart core: VarsInterpolator gains a static registry of custom operators (registerOperator(String name, Function<BsonValue, BsonValue>)). During the existing recursive BSON walk, after resolving $var, the interpolator checks for registered custom operators and delegates resolution to the registered function.

This is not AI-specific — it is a general extension point that any plugin can use to add custom BSON operators resolved during aggregation pipeline interpolation.
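A minimal sketch of such a registry follows. It uses `Object` in place of `BsonValue` so the example is self-contained; the class and method names mirror the description above but are illustrative, not the actual VarsInterpolator code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Simplified stand-in for the custom operator registry described above.
// The real VarsInterpolator operates on BsonValue; Object is used here
// so the sketch has no driver dependency.
class OperatorRegistry {
    private static final Map<String, Function<Object, Object>> OPERATORS =
        new ConcurrentHashMap<>();

    // Plugins call this once at startup to contribute an operator.
    static void registerOperator(String name, Function<Object, Object> resolver) {
        OPERATORS.put(name, resolver);
    }

    // Called during the recursive BSON walk when a key matches a registered
    // operator: the operator's argument is replaced by the resolver's result.
    // Returns null when no operator with that name is registered.
    static Object resolve(String name, Object argument) {
        var resolver = OPERATORS.get(name);
        return resolver == null ? null : resolver.apply(argument);
    }
}
```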

Feature 2: $vectorize Operator

The $vectorize operator converts text to an embedding vector inline in aggregation pipelines. The restheart-ai plugin registers $vectorize at startup via the custom operator registry — the core has no knowledge of embeddings or LangChain4j.

Example:

```http
# Create a standard vector search index (not autoEmbed)
PUT /mydb/articles/_indexes/article_vectors
Content-Type: application/json

{ "type": "vectorSearch",
  "definition": { "fields": [
    { "type": "vector", "path": "embedding", "numDimensions": 1536, "similarity": "cosine" }
  ]}}

# Define aggregation using $vectorize
PUT /mydb/articles
Content-Type: application/json

{ "aggrs": [{ "uri": "search", "type": "pipeline", "stages": [
    { "$vectorSearch": {
        "index": "article_vectors",
        "path": "embedding",
        "queryVector": { "$vectorize": { "$var": "query" } },
        "numCandidates": 100,
        "limit": 10
    }},
    { "$project": { "embedding": 0 } }
]}]}

# Execute — client passes text, RESTHeart generates the vector
GET /mydb/articles/_aggrs/search?avars={"query":"machine learning for NLP"}
```

RESTHeart resolves $var first (producing the text string), then calls the registered $vectorize function (which invokes the configured embedding provider), and replaces { "$vectorize": ... } with a float array before sending the pipeline to MongoDB.

Feature 3: Auto-Embedding on Write

When a collection has vectorSearch metadata with textField and embeddingField, the AutoEmbeddingInterceptor automatically generates embeddings for documents on POST, PUT, and PATCH requests using the configured embedding provider.

Enable auto-embedding:

```http
PATCH /mydb/articles
Content-Type: application/json

{ "vectorSearch": { "textField": "description", "embeddingField": "embedding" } }
```

Write a document — the embedding is generated automatically:

```http
POST /mydb/articles
Content-Type: application/json

{ "title": "My Article", "description": "An introduction to vector search in MongoDB" }
```

The stored document will include an embedding field containing the float array generated by the configured provider.
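The interceptor's core step can be sketched as follows. This is a simplified model over plain maps: the real AutoEmbeddingInterceptor works on BSON requests and the configured provider, and the class and parameter names here are illustrative:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Simplified model of the auto-embedding write step: when the collection's
// vectorSearch metadata names a textField and an embeddingField, embed the
// text value and store the resulting vector under embeddingField.
class AutoEmbedder {
    static Map<String, Object> process(Map<String, Object> doc,
                                       String textField,
                                       String embeddingField,
                                       Function<String, float[]> embedder) {
        var out = new HashMap<>(doc);
        var text = doc.get(textField);
        if (text instanceof String s) {          // skip docs without the text field
            out.put(embeddingField, embedder.apply(s));
        }
        return out;
    }
}
```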

Feature 4: Document Chunking with Embedding

Phase 1's document chunking stores text segments without embeddings (relying on autoEmbed indexes). Phase 2 extends the DocumentChunkingInterceptor to optionally generate embeddings for each chunk using the configured embedding provider.

When an embedding-provider is configured in aiActivator, the interceptor generates and stores embeddings alongside each text segment. When no embedding provider is configured, the behavior remains the same as Phase 1 (text-only segments, embeddings delegated to MongoDB autoEmbed).

Each text segment is stored as:

```json
{
  "_id": { "$oid": "..." },
  "text": "chunk text content...",
  "vector": [0.123, -0.456, 0.789],
  "metadata": {
    "index": 0,
    "fileId": { "$oid": "..." },
    "filename": "article.pdf",
    "tags": ["public", "tech"]
  }
}
```
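The Phase 2 behavior amounts to a loop over extracted chunks that attaches a vector only when a provider is configured. A minimal sketch, assuming plain maps in place of BSON documents (names are illustrative, not the actual DocumentChunkingInterceptor code):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative: build one segment document per text chunk, adding the
// provider-generated vector only when an embedding function is configured.
// A null embedder models the Phase 1 path (text-only segments).
class ChunkEmbedder {
    static List<Map<String, Object>> toSegments(List<String> chunks,
                                                Function<String, float[]> embedder) {
        var segments = new ArrayList<Map<String, Object>>();
        for (int i = 0; i < chunks.size(); i++) {
            var seg = new HashMap<String, Object>();
            seg.put("text", chunks.get(i));
            seg.put("metadata", Map.of("index", i));
            if (embedder != null) {
                seg.put("vector", embedder.apply(chunks.get(i)));
            }
            segments.add(seg);
        }
        return segments;
    }
}
```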

Feature 5: Pluggable Reranking

Phase 1's reranking calls the Atlas Reranking API directly. Phase 2 makes reranking pluggable: when a rerank-provider is configured in aiActivator, the reranking interceptor uses the configured LangChain4j-backed provider instead of the Atlas API.

This enables reranking on any MongoDB deployment — not just Atlas — and allows choosing between different reranking backends (Cohere, ONNX local, etc.).
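Conceptually, a reranking provider scores each candidate against the query and re-orders by descending score. A minimal sketch, with a plain scoring function standing in for a LangChain4j ScoringModel-backed provider (all names here are illustrative):

```java
import java.util.Comparator;
import java.util.List;
import java.util.function.BiFunction;
import java.util.stream.Collectors;

// Illustrative: re-order candidate documents by a query-dependent relevance
// score, the operation a pluggable rerank provider performs.
class Reranker {
    static List<String> rerank(String query,
                               List<String> candidates,
                               BiFunction<String, String, Double> score) {
        return candidates.stream()
            .sorted(Comparator.comparingDouble(
                (String doc) -> score.apply(query, doc)).reversed())
            .collect(Collectors.toList());
    }
}
```

Swapping backends (Cohere, Voyage AI, local ONNX cross-encoder) changes only how the score is computed, not this re-ordering step.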

Backward Compatibility with Phase 1

Phase 2 is fully backward-compatible:

  • Without embedding-provider configured: $vectorize is not registered, auto-embedding is disabled, chunking stores text-only segments → behaves exactly like Phase 1
  • Without rerank-provider configured: reranking uses the Atlas API via atlas-api-key → behaves exactly like Phase 1
  • autoEmbed indexes and queryString continue to work as before
