| title | summary | aliases |
|---|---|---|
| OpenAI Embeddings | Learn how to use OpenAI embedding models in TiDB Cloud. | |
This document describes how to use OpenAI embedding models with Auto Embedding in TiDB Cloud to perform semantic searches from text queries.
Note:
Auto Embedding is only available on {{{ .starter }}} clusters hosted on AWS.
All OpenAI models are available for use with the openai/ prefix if you bring your own OpenAI API key (BYOK). For example:
text-embedding-3-small
- Name: openai/text-embedding-3-small
- Dimensions: 512-1536 (default: 1536)
- Distance metric: Cosine, L2
- Price: Charged by OpenAI
- Hosted by TiDB Cloud: ❌
- Bring Your Own Key: ✅
text-embedding-3-large
- Name: openai/text-embedding-3-large
- Dimensions: 256-3072 (default: 3072)
- Distance metric: Cosine, L2
- Price: Charged by OpenAI
- Hosted by TiDB Cloud: ❌
- Bring Your Own Key: ✅
For a full list of available models, see OpenAI Documentation.
To use OpenAI models, you must specify an OpenAI API key as follows:
Note:
Replace 'your-openai-api-key-here' with your actual OpenAI API key.
SET @@GLOBAL.TIDB_EXP_EMBED_OPENAI_API_KEY = 'your-openai-api-key-here';
CREATE TABLE sample (
    `id` INT,
    `content` TEXT,
    `embedding` VECTOR(1536) GENERATED ALWAYS AS (EMBED_TEXT(
        "openai/text-embedding-3-small",
        `content`
    )) STORED
);
INSERT INTO sample
    (`id`, `content`)
VALUES
    (1, "Java: Object-oriented language for cross-platform development."),
    (2, "Java coffee: Bold Indonesian beans with low acidity."),
    (3, "Java island: Densely populated, home to Jakarta."),
    (4, "Java's syntax is used in Android apps."),
    (5, "Dark roast Java beans enhance espresso blends.");
SELECT `id`, `content` FROM sample
ORDER BY
    VEC_EMBED_COSINE_DISTANCE(
        embedding,
        "How to start learning Java programming?"
    )
LIMIT 2;
Result:
+------+----------------------------------------------------------------+
| id   | content                                                        |
+------+----------------------------------------------------------------+
|    1 | Java: Object-oriented language for cross-platform development. |
|    4 | Java's syntax is used in Android apps.                         |
+------+----------------------------------------------------------------+
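To make the ranking step concrete, the following is a minimal Python sketch of what ordering by cosine distance and taking the two nearest rows amounts to. It does not call TiDB Cloud or OpenAI: the three-dimensional vectors are made-up stand-ins for real 1536-dimensional embeddings, and the `cosine_distance` and `top_k` helpers are hypothetical names used only for illustration.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Made-up toy vectors keyed by row id (assumption: real embeddings come from
# the OpenAI model and have far more dimensions).
rows = {
    1: [0.9, 0.1, 0.0],  # programming language
    2: [0.1, 0.9, 0.1],  # coffee
    3: [0.0, 0.2, 0.9],  # island
    4: [0.8, 0.2, 0.1],  # programming language
    5: [0.2, 0.8, 0.0],  # coffee
}

# Hypothetical embedding of the query text.
query = [1.0, 0.0, 0.0]


def top_k(rows, query, k):
    """Return the ids of the k rows closest to the query by cosine distance."""
    ranked = sorted(rows, key=lambda row_id: cosine_distance(rows[row_id], query))
    return ranked[:k]


print(top_k(rows, query, 2))
```

With these toy vectors, the two rows about the programming language are the closest to the query, which mirrors the result of the SQL query above.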
All OpenAI embedding options are supported via the additional_json_options parameter of the EMBED_TEXT() function.
Example: Use an alternative dimension for text-embedding-3-large
CREATE TABLE sample (
    `id` INT,
    `content` TEXT,
    `embedding` VECTOR(1024) GENERATED ALWAYS AS (EMBED_TEXT(
        "openai/text-embedding-3-large",
        `content`,
        '{"dimensions": 1024}'
    )) STORED
);
For all available options, see OpenAI Documentation.
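If you generate such statements from application code, it can help to build the options string programmatically and check the requested dimension against the ranges listed earlier in this document. The following Python sketch shows one possible way to do that. The `build_embed_options` helper and the `DIMENSION_RANGES` table are hypothetical and are not part of TiDB Cloud or any OpenAI library.

```python
import json

# Dimension ranges copied from the model list in this document.
DIMENSION_RANGES = {
    "openai/text-embedding-3-small": (512, 1536),
    "openai/text-embedding-3-large": (256, 3072),
}


def build_embed_options(model, dimensions):
    """Return a JSON options string for EMBED_TEXT() after a range check.

    Hypothetical helper: it only validates against the ranges above and
    serializes the "dimensions" option shown in the SQL example.
    """
    low, high = DIMENSION_RANGES[model]
    if not low <= dimensions <= high:
        raise ValueError(
            f"{model} supports {low}-{high} dimensions, got {dimensions}"
        )
    return json.dumps({"dimensions": dimensions})


print(build_embed_options("openai/text-embedding-3-large", 1024))
```

The returned string matches the third argument passed to EMBED_TEXT() in the example above. In that example, the VECTOR(1024) column is declared with the same dimension as the one requested in the options.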
For Python usage examples, see PyTiDB Documentation.