| 1 | +--- |
| 2 | +title: Gemini |
| 3 | +sidebarTitle: Gemini |
| 4 | +icon: google |
| 5 | +--- |
| 6 | + |
| 7 | +Embed text and generate completions using Google's Gemini models. |
| 8 | +See the API reference for [Gemini UDFs](https://lancedb.github.io/geneva/api/gemini/) and |
| 9 | +[Embedding UDFs](https://lancedb.github.io/geneva/api/embeddings/) for all parameters. |
| 10 | + |
| 11 | +```bash |
| 12 | +pip install 'geneva[udf-text-gemini]' |
| 13 | +``` |
| 14 | + |
| 15 | +<Warning> |
| 16 | +Gemini UDFs make API calls that incur **per-token costs**. Each row processed results in one |
| 17 | +or more API requests billed to your account. Review |
| 18 | +[Gemini pricing](https://ai.google.dev/gemini-api/docs/pricing) before running on large tables. |
| 19 | +</Warning> |
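
A rough back-of-the-envelope estimate can help size a run before it starts. The sketch below is not part of Geneva or the Gemini SDK: the helper `estimate_cost_usd`, the token averages, and the prices are hypothetical placeholders. It assumes exactly one request per row, so retries or multi-request rows would raise the real figure. Substitute current values from the pricing page.

```python
def estimate_cost_usd(
    num_rows,
    avg_input_tokens,
    avg_output_tokens,
    usd_per_million_input,
    usd_per_million_output,
):
    """Estimate the total cost of one pass over a table.

    Assumes one API request per row; all prices are supplied by the caller.
    """
    input_cost = num_rows * avg_input_tokens * usd_per_million_input / 1_000_000
    output_cost = num_rows * avg_output_tokens * usd_per_million_output / 1_000_000
    return input_cost + output_cost


# Placeholder numbers for illustration only, not real Gemini prices.
estimate = estimate_cost_usd(
    num_rows=1_000_000,
    avg_input_tokens=200,
    avg_output_tokens=10,
    usd_per_million_input=0.5,
    usd_per_million_output=2.0,
)
print(f"Estimated cost: ${estimate:.2f}")
```

Running the estimate on a small sample of rows first gives more realistic token averages than guessing.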
| 20 | + |
| 21 | +<Note> |
| 22 | +Set the `GEMINI_API_KEY` environment variable before calling any factory function below. |
| 23 | +The key is read **at UDF creation time** and serialized with the UDF, so no cluster-level |
| 24 | +`env_vars` configuration is needed. |
| 25 | +</Note> |
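
Because the key is read when the UDF is created, a fail-fast check in the process that builds the UDFs can surface a missing key before any job is submitted. This is a minimal sketch using only the standard library; `require_gemini_api_key` is a hypothetical helper, not a Geneva function.

```python
import os


def require_gemini_api_key(env=None):
    """Return the Gemini API key, or raise if it is missing or blank.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; export it before creating Gemini UDFs"
        )
    return key


# Example with an explicit mapping; call with no arguments to check os.environ.
key = require_gemini_api_key({"GEMINI_API_KEY": "example-key"})
```

Calling the helper once, immediately before the factory functions below, keeps the failure close to its cause.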
| 26 | + |
| 27 | +## Embeddings |
| 28 | + |
| 29 | +Embed text with optional task-type hints for retrieval, classification, and clustering scenarios. |
| 30 | +See the [API reference](https://lancedb.github.io/geneva/api/embeddings/#geneva.udfs.text.embeddings.gemini_embedding_udf) for all parameters. |
| 31 | + |
| 32 | +**Multiple embeddings tuned for different retrieval tasks:** |
| 33 | + |
| 34 | +```python |
| 35 | +from geneva.udfs import gemini_embedding_udf |
| 36 | + |
| 37 | +table.add_columns({ |
| 38 | + # Full-dimension embedding for document retrieval |
| 39 | + "embedding_doc": gemini_embedding_udf( |
| 40 | + column="body", |
| 41 | + model="gemini-embedding-001", |
| 42 | + task_type="RETRIEVAL_DOCUMENT", |
| 43 | + ), |
| 44 | + # Compact embedding for semantic similarity |
| 45 | + "embedding_sim_256": gemini_embedding_udf( |
| 46 | + column="body", |
| 47 | + model="gemini-embedding-001", |
| 48 | + task_type="SEMANTIC_SIMILARITY", |
| 49 | + output_dimensionality=256, |
| 50 | + ), |
| 51 | +}) |
| 52 | +``` |
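
Embeddings produced for semantic similarity are commonly compared with cosine similarity. The sketch below shows the computation in plain Python on toy vectors. It is independent of Geneva, and `cosine_similarity` is an illustrative helper rather than a library call. Cosine similarity ignores vector length, so it gives the same result whether or not the stored vectors are unit-normalized.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for rows of "embedding_sim_256".
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

Vectors from columns with different `output_dimensionality` values cannot be compared directly, which is why the helper rejects mismatched lengths.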
| 53 | + |
| 54 | +## Generation |
| 55 | + |
| 56 | +Generate text with Gemini models. Inputs can be text, images, audio, video, or documents. |
| 57 | +See the [API reference](https://lancedb.github.io/geneva/api/gemini/#geneva.udfs.text.gemini.gemini_udf) for all parameters. |
| 58 | + |
| 59 | +**Enrich a table with sentiment, captions, and transcriptions at once:** |
| 60 | + |
| 61 | +```python |
| 62 | +from geneva.udfs import gemini_udf |
| 63 | + |
| 64 | +table.add_columns({ |
| 65 | + # Classify review sentiment with a fast model |
| 66 | + "sentiment": gemini_udf( |
| 67 | + column="review", |
| 68 | + prompt="Classify the sentiment as positive, negative, or neutral. Return only the label.", |
| 69 | + model="gemini-2.5-flash", |
| 70 | + ), |
| 71 | + # Caption product images with a more capable model |
| 72 | + "caption": gemini_udf( |
| 73 | + column="image", |
| 74 | + prompt="Describe the main subject of this image in one sentence", |
| 75 | + model="gemini-2.5-pro", |
| 76 | + mime_type="image/jpeg", |
| 77 | + ), |
| 78 | + # Transcribe audio clips |
| 79 | + "transcript": gemini_udf( |
| 80 | + column="audio", |
| 81 | + prompt="Transcribe this audio clip", |
| 82 | + model="gemini-2.5-flash", |
| 83 | + mime_type="audio/mp3", |
| 84 | + ), |
| 85 | +}) |
| 86 | +``` |
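
Even with a prompt that asks for only the label, generated text may differ in case, whitespace, or trailing punctuation. A small post-processing step can map the raw `sentiment` values onto a fixed label set. This is a sketch under that assumption; `normalize_sentiment` is a hypothetical helper and not part of Geneva.

```python
ALLOWED_LABELS = {"positive", "negative", "neutral"}


def normalize_sentiment(raw):
    """Map raw model output to one of the allowed labels, or None if it does not match."""
    if raw is None:
        return None
    # Drop surrounding whitespace and a trailing period, then lowercase.
    label = raw.strip().strip(".").lower()
    return label if label in ALLOWED_LABELS else None


cleaned = [normalize_sentiment(s) for s in [" Positive.\n", "NEUTRAL", "mixed", None]]
```

Rows that come back as `None` can be inspected or reprocessed instead of silently entering downstream aggregates.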