docs: add built-in UDF provider pages for OpenAI, Gemini, Sentence Transformers

jmhsieh · claude · jmhsieh · commit 61d73761c5b5 · 2026-03-02T10:37:17.000-08:00
Add dedicated documentation pages for each built-in UDF provider with
usage examples and GPU acceleration guidance. Clarify that fractional
num_gpus is a Ray scheduling mechanism, not GPU memory partitioning.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
diff --git a/docs/docs.json b/docs/docs.json
@@ -156,6 +156,15 @@
                 "group": "User Defined Functions (UDFs)",
                 "pages": [
                   "geneva/udfs/index",
+                  {
+                    "group": "Built-in Providers",
+                    "pages": [
+                      "geneva/udfs/providers/index",
+                      "geneva/udfs/providers/openai",
+                      "geneva/udfs/providers/gemini",
+                      "geneva/udfs/providers/sentence-transformers"
+                    ]
+                  },
                   "geneva/udfs/blobs",
                   "geneva/udfs/error_handling",
                   "geneva/udfs/advanced-configuration"
@@ -383,6 +392,10 @@
     {
       "source": "/tutorials/vector-search/:slug*",
       "destination": "tutorials/search/:slug*"
+    },
+    {
+      "source": "/geneva/udfs/built-in",
+      "destination": "/geneva/udfs/providers"
     }
   ]
 }
diff --git a/docs/geneva/udfs/index.mdx b/docs/geneva/udfs/index.mdx
@@ -105,6 +105,10 @@ class OpenAIEmbedding(Callable):
         return pa.array(resp.data[0].embeddings)
 ```
 
+<Tip>
+For common providers like OpenAI and Gemini, Geneva ships [built-in UDFs](/geneva/udfs/providers) that handle API keys, retries, and batching for you — no custom class needed.
+</Tip>
+
 > **Note**:  The state is will be independently managed on each distributed Worker.
 
 ## UDF options
diff --git a/docs/geneva/udfs/providers/gemini.mdx b/docs/geneva/udfs/providers/gemini.mdx
@@ -0,0 +1,86 @@
+---
+title: Gemini
+sidebarTitle: Gemini
+icon: google
+---
+
+Embed text and generate completions using Google's Gemini models.
+See the API reference for [Gemini UDFs](https://lancedb.github.io/geneva/api/gemini/) and
+[Embedding UDFs](https://lancedb.github.io/geneva/api/embeddings/) for all parameters.
+
+```bash
+pip install 'geneva[udf-text-gemini]'
+```
+
+<Warning>
+Gemini UDFs make API calls that incur **per-token costs**. Each row processed results in one
+or more API requests billed to your account. Review
+[Gemini pricing](https://ai.google.dev/gemini-api/docs/pricing) before running on large tables.
+</Warning>
+
+<Note>
+Set the `GEMINI_API_KEY` environment variable before calling any factory function below.
+The key is read **at UDF creation time** and serialized with the UDF — no cluster-level
+`env_vars` configuration is needed.
+</Note>
+
+## Embeddings
+
+Embed text with optional task-type hints for retrieval, classification, and clustering scenarios.
+See the [API reference](https://lancedb.github.io/geneva/api/embeddings/#geneva.udfs.text.embeddings.gemini_embedding_udf) for all parameters.
+
+**Multiple embeddings tuned for different retrieval tasks:**
+
+```python
+from geneva.udfs import gemini_embedding_udf
+
+table.add_columns({
+    # Full-dimension embedding for document retrieval
+    "embedding_doc": gemini_embedding_udf(
+        column="body",
+        model="gemini-embedding-001",
+        task_type="RETRIEVAL_DOCUMENT",
+    ),
+    # Compact embedding for semantic similarity
+    "embedding_sim_256": gemini_embedding_udf(
+        column="body",
+        model="gemini-embedding-001",
+        task_type="SEMANTIC_SIMILARITY",
+        output_dimensionality=256,
+    ),
+})
+```
+
+## Generation
+
+Generate text from Gemini models. Supports text, image, audio, video, and document inputs.
+See the [API reference](https://lancedb.github.io/geneva/api/gemini/#geneva.udfs.text.gemini.gemini_udf) for all parameters.
+
+**Enrich a table with sentiment, captions, and transcriptions at once:**
+
+```python
+from geneva.udfs import gemini_udf
+
+table.add_columns({
+    # Classify review sentiment with a fast model
+    "sentiment": gemini_udf(
+        column="review",
+        prompt="Classify the sentiment as positive, negative, or neutral. Return only the label.",
+        model="gemini-2.5-flash",
+    ),
+    # Caption product images with a more capable model
+    "caption": gemini_udf(
+        column="image",
+        prompt="Describe the main subject of this image in one sentence",
+        model="gemini-2.5-pro",
+        mime_type="image/jpeg",
+    ),
+    # Transcribe audio clips
+    "transcript": gemini_udf(
+        column="audio",
+        prompt="Transcribe this audio clip",
+        model="gemini-2.5-flash",
+        mime_type="audio/mp3",
+    ),
+})
+```
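The Note in `gemini.mdx` says the API key is read at UDF creation time and serialized with the UDF. A minimal sketch of that pattern follows, using only the standard library. The `make_udf_spec` helper, the dict layout, and the `DEMO_API_KEY` variable are hypothetical illustrations, not Geneva's implementation.

```python
import os
import pickle


def make_udf_spec(column: str, env_var: str = "DEMO_API_KEY") -> dict:
    """Hypothetical factory: capture the key eagerly, where the UDF is built."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} must be set before creating the UDF")
    # The key becomes part of the object that is shipped to workers.
    return {"column": column, "api_key": key}


# Client side: the variable is set, so creation succeeds.
os.environ["DEMO_API_KEY"] = "example-key"
payload = pickle.dumps(make_udf_spec("body"))

# Worker side (simulated): the variable is absent, yet the key is still available.
del os.environ["DEMO_API_KEY"]
restored = pickle.loads(payload)
print(restored["column"], "key present:", bool(restored["api_key"]))
```

One consequence of this design is that the serialized UDF carries the secret, so anything that can read the serialized form can read the key.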
diff --git a/docs/geneva/udfs/providers/index.mdx b/docs/geneva/udfs/providers/index.mdx
@@ -0,0 +1,66 @@
+---
+title: Built-in LLM and Embedding UDFs
+sidebarTitle: Overview
+icon: sparkles
+---
+
+Geneva ships pre-built UDFs for common LLM providers so you don't have to write custom classes
+for everyday embedding and generation tasks.
+
+| Provider | Embeddings | Generation | Runs locally | Install extra |
+|----------|:----------:|:----------:|:------------:|---------------|
+| [OpenAI](/geneva/udfs/providers/openai) | ✓ | ✓ | — | `geneva[udf-text-openai]` |
+| [Gemini](/geneva/udfs/providers/gemini) | ✓ | ✓ | — | `geneva[udf-text-gemini]` |
+| [Sentence Transformers](/geneva/udfs/providers/sentence-transformers) | ✓ | — | ✓ | `geneva[udf-text-sentence-transformers]` |
+
+OpenAI and Gemini UDFs make remote API calls that incur per-token costs.
+Sentence Transformers run locally on your workers with no API costs — see
+[GPU acceleration](/geneva/udfs/providers/sentence-transformers#gpu-acceleration)
+for performance tips.
+
+## Comparing models and prompts
+
+Because `add_columns` accepts a dictionary, you can evaluate multiple models, parameter
+settings, or prompts in a single pass over your data. Each entry produces its own column,
+so results sit side by side in the same table for easy comparison.
+
+```python
+from geneva.udfs import openai_udf, gemini_udf, openai_embedding_udf
+
+table.add_columns({
+    # Compare two embedding models
+    "emb_small": openai_embedding_udf(column="body", model="text-embedding-3-small"),
+    "emb_large": openai_embedding_udf(column="body", model="text-embedding-3-large"),
+
+    # Compare the same task across providers
+    "summary_openai": openai_udf(
+        column="body",
+        prompt="Summarize in one sentence",
+        model="gpt-5-mini",
+    ),
+    "summary_gemini": gemini_udf(
+        column="body",
+        prompt="Summarize in one sentence",
+        model="gemini-2.5-flash",
+    ),
+})
+```
+
+This works for any combination — different models from the same provider, different providers,
+different prompts with the same model, or different dimensionality settings. All columns are
+computed in parallel during the same backfill job.
+
+## What's included
+
+All built-in UDFs share these capabilities:
+
+- **API key handling** — Keys are captured from your local environment at UDF creation time and serialized with the UDF. No cluster-level environment configuration required.
+- **Retry with backoff** — Transient API errors (rate limits, timeouts, server errors) are automatically retried with exponential backoff.
+- **Batch processing** — Embedding UDFs batch multiple rows per API call for better throughput.
+- **L2 normalization** — Embedding UDFs support optional L2 normalization via the `normalize` parameter (disabled by default since both API providers, OpenAI and Gemini, return pre-normalized vectors).
+
+## See also
+
+- [Working with UDFs](/geneva/udfs/index) — Write custom scalar, batched, and stateful UDFs
+- [Error handling](/geneva/udfs/error_handling) — Fine-grained retry and skip policies
+- [Working with blobs](/geneva/udfs/blobs) — Process binary data (images, audio, video)
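The "Retry with backoff" bullet in `index.mdx` describes retrying transient API errors with exponential backoff. A rough sketch of that general technique is shown here. `TransientError`, `call_with_backoff`, and all default values are assumptions chosen for illustration; the built-in UDFs' actual retry policy is not shown in this diff.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for retryable failures (rate limit, timeout, 5xx)."""


def call_with_backoff(fn, max_attempts=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(), retrying TransientError with capped exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # The upper bound doubles each attempt; jitter spreads out retries.
            upper = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, upper))


# Usage: a call that fails twice with a transient error, then succeeds.
attempts = {"n": 0}


def flaky_request():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TransientError("rate limited")
    return "ok"


delays = []
result = call_with_backoff(flaky_request, sleep=delays.append)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; here the recorded delays are appended to a list instead of being slept.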
diff --git a/docs/geneva/udfs/providers/openai.mdx b/docs/geneva/udfs/providers/openai.mdx
@@ -0,0 +1,79 @@
+---
+title: OpenAI
+sidebarTitle: OpenAI
+icon: brain
+---
+
+Embed text and generate completions using OpenAI models.
+See the [API reference](https://lancedb.github.io/geneva/api/openai/) for all parameters.
+
+```bash
+pip install 'geneva[udf-text-openai]'
+```
+
+<Warning>
+OpenAI UDFs make API calls that incur **per-token costs**. Each row processed results in one
+or more API requests billed to your account. Review
+[OpenAI pricing](https://openai.com/api/pricing/) before running on large tables.
+</Warning>
+
+<Note>
+Set the `OPENAI_API_KEY` environment variable before calling any factory function below.
+The key is read **at UDF creation time** and serialized with the UDF — no cluster-level
+`env_vars` configuration is needed.
+</Note>
+
+## Embeddings
+
+**Compare models by adding multiple embedding columns at once:**
+
+```python
+from geneva.udfs import openai_embedding_udf
+
+table.add_columns({
+    # Default model — fast, 1536 dimensions
+    "embedding_small": openai_embedding_udf(
+        column="body",
+        model="text-embedding-3-small",
+    ),
+    # Higher-quality model — 3072 dimensions
+    "embedding_large": openai_embedding_udf(
+        column="body",
+        model="text-embedding-3-large",
+    ),
+    # Same large model, truncated to 256 dimensions for storage efficiency
+    "embedding_large_256": openai_embedding_udf(
+        column="body",
+        model="text-embedding-3-large",
+        output_dimensionality=256,
+    ),
+})
+```
+
+## Generation
+
+Generate text from OpenAI chat completion models. Supports both text and binary (image)
+input columns.
+See the [API reference](https://lancedb.github.io/geneva/api/openai/#geneva.udfs.openai.openai_udf) for all parameters.
+
+**Add a summary and an image caption in one call, using different models:**
+
+```python
+from geneva.udfs import openai_udf
+
+table.add_columns({
+    # Fast model for bulk text summarization
+    "summary": openai_udf(
+        column="body",
+        prompt="Summarize this document in 3 bullet points",
+        model="gpt-5-mini",
+    ),
+    # More capable model for nuanced image captions
+    "caption": openai_udf(
+        column="image",
+        prompt="Provide a 1 sentence description of the scene",
+        model="gpt-5",
+        mime_type="image/jpeg",
+    ),
+})
+```
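The `embedding_large_256` example in `openai.mdx` requests a 256-dimension vector from `text-embedding-3-large`. Conceptually, a shortened embedding of this kind is a prefix of the full vector rescaled to unit length, which is the manual equivalent OpenAI's embeddings guide describes. The sketch below shows that operation in plain Python; `truncate_and_normalize` is a hypothetical helper, and whether Geneva forwards the parameter to the API or shortens vectors client-side is not stated in this diff.

```python
import math


def truncate_and_normalize(vec, dims):
    """Keep the first `dims` components and rescale them to unit L2 norm."""
    if not 0 < dims <= len(vec):
        raise ValueError(f"dims must be in 1..{len(vec)}, got {dims}")
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # an all-zero prefix cannot be normalized
    return [x / norm for x in head]


# Toy 3-dimensional "embedding" shortened to 2 dimensions.
full = [3.0, 4.0, 12.0]
short = truncate_and_normalize(full, 2)
short_norm = math.sqrt(sum(x * x for x in short))
```

Rescaling matters because a truncated prefix generally no longer has unit length, which would skew dot-product similarity scores.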
diff --git a/docs/geneva/udfs/providers/sentence-transformers.mdx b/docs/geneva/udfs/providers/sentence-transformers.mdx
@@ -0,0 +1,60 @@
+---
+title: Sentence Transformers
+sidebarTitle: Sentence Transformers
+icon: microchip
+---
+
+Embed text using any HuggingFace Sentence Transformer model locally — no API key needed.
+See the [API reference](https://lancedb.github.io/geneva/api/embeddings/#geneva.udfs.text.embeddings.sentence_transformer_udf) for all parameters.
+
+```bash
+pip install 'geneva[udf-text-sentence-transformers]'
+```
+
+<Tip>
+Sentence Transformer models run **locally** on your workers — there are no API calls and no
+per-token costs. This makes them a good fit for large-scale embedding jobs where cost is a
+concern.
+</Tip>
+
+## Embeddings
+
+**Compare a lightweight and a high-quality model side by side:**
+
+```python
+from geneva.udfs import sentence_transformer_udf
+
+table.add_columns({
+    # Lightweight default model — fast, CPU-friendly
+    "embedding_mini": sentence_transformer_udf(
+        column="body",
+        model="sentence-transformers/all-MiniLM-L6-v2",
+    ),
+    # Larger model with GPU acceleration
+    "embedding_bge": sentence_transformer_udf(
+        column="body",
+        model="BAAI/bge-large-en-v1.5",
+        num_gpus=1.0,
+    ),
+})
+```
+
+## GPU acceleration
+
+Sentence Transformer models can run on CPU or GPU. Smaller models like `all-MiniLM-L6-v2`
+work well on CPU, but larger models like `bge-large-en-v1.5` benefit significantly from GPU
+acceleration. Use the `num_gpus` parameter to request GPU resources for a worker:
+
+```python
+# CPU-only (default) — suitable for lightweight models
+sentence_transformer_udf(column="body", model="sentence-transformers/all-MiniLM-L6-v2")
+
+# GPU-accelerated — recommended for larger models
+sentence_transformer_udf(column="body", model="BAAI/bge-large-en-v1.5", num_gpus=1.0)
+```
+
+Setting `num_gpus` to a fractional value (e.g., `0.5`) tells the
+[Ray scheduler](https://docs.ray.io/en/latest/ray-core/scheduling/accelerators.html)
+that multiple workers may share the same physical GPU. For example, two UDFs with
+`num_gpus=0.5` can be placed on a single GPU. Ray treats this value as a logical scheduling resource and does not enforce GPU memory
+limits, so it is your responsibility to ensure the combined models fit in GPU memory.
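The closing paragraph of `sentence-transformers.mdx` distinguishes logical GPU shares from GPU memory. The toy model below illustrates that gap: placement looks only at the fractional `num_gpus` value, and memory is checked separately by the user. The first-fit policy, the function names, and the memory figures are all assumptions made up for illustration; this is not Ray's actual placement algorithm.

```python
def pack_first_fit(requests, gpu_count):
    """Place (name, num_gpus, mem_gb) requests using logical GPU shares only."""
    free = [1.0] * gpu_count
    placement = [[] for _ in range(gpu_count)]
    for name, num_gpus, mem_gb in requests:
        for gpu in range(gpu_count):
            if free[gpu] >= num_gpus - 1e-9:  # tolerance for float rounding
                free[gpu] -= num_gpus
                placement[gpu].append((name, mem_gb))
                break
        else:
            raise RuntimeError(f"no GPU has {num_gpus} free for {name}")
    return placement


def over_memory(placement, gpu_mem_gb):
    """Return indices of GPUs whose workers need more memory than exists."""
    return [
        gpu
        for gpu, workers in enumerate(placement)
        if sum(mem for _, mem in workers) > gpu_mem_gb
    ]


# Two workers each request half a GPU; memory needs are invented numbers.
requests = [("udf_a", 0.5, 10.0), ("udf_b", 0.5, 9.0)]
placement = pack_first_fit(requests, gpu_count=2)
overloaded = over_memory(placement, gpu_mem_gb=16.0)
print(placement, overloaded)
```

In this toy run both workers land on GPU 0 because their shares sum to 1.0, even though their combined 19 GB exceeds the assumed 16 GB device. That is the situation the paragraph warns about.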