Commit 9b0f283

update cohere embed model doc
1 parent 8141369 commit 9b0f283

2 files changed, +54 -29 lines changed

articles/ai-studio/how-to/deploy-models-cohere-embed.md

Lines changed: 29 additions & 16 deletions
@@ -5,7 +5,7 @@ description: Learn how to use Cohere Embed V3 models with Azure AI Studio.
 ms.service: azure-ai-studio
 manager: scottpolly
 ms.topic: how-to
-ms.date: 08/08/2024
+ms.date: 10/23/2024
 ms.reviewer: shubhiraj
 reviewer: shubhirajMsft
 ms.author: mopeakande
@@ -16,12 +16,12 @@ zone_pivot_groups: azure-ai-model-catalog-samples-embeddings
 
 # How to use Cohere Embed V3 models with Azure AI Studio
 
-[!INCLUDE [feature-preview](../includes/feature-preview.md)]
+[!INCLUDE [Feature preview](~/reusable-content/ce-skilling/azure/includes/ai-studio/includes/feature-preview.md)]
 
 In this article, you learn about Cohere Embed V3 models and how to use them with Azure AI Studio.
 The Cohere family of models includes various models optimized for different use cases, including chat completions, embeddings, and rerank. Cohere models are optimized for various use cases that include reasoning, summarization, and question answering.
 
-[!INCLUDE [models-preview](../includes/models-preview.md)]
+
 
 ::: zone pivot="programming-language-python"
 
@@ -31,19 +31,24 @@ The Cohere family of models for embeddings includes the following models:
 
 # [Cohere Embed v3 - English](#tab/cohere-embed-v3-english)
 
-Cohere Embed English is a text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed English performs well on the HuggingFace (massive text embed) MTEB benchmark and on use-cases for various industries, such as Finance, Legal, and General-Purpose Corpora. Embed English also has the following attributes:
+Cohere Embed English is a multimodal (text and image) representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed English performs well on the HuggingFace (massive text embed) MTEB benchmark and on use-cases for various industries, such as Finance, Legal, and General-Purpose Corpora. Embed English also has the following attributes:
 
-* Embed English has 1,024 dimensions.
+* Embed English has 1,024 dimensions
 * Context window of the model is 512 tokens
+* Embed English accepts images as a base64 encoded data url
 
+Image embeddings consume a fixed number of tokens per image—1,000 tokens per image—which translates to a price of $0.0001 per image embedded. The size or resolution of the image doesn't affect the number of tokens consumed, provided the image is within the accepted dimensions, file size, and formats.
+
 
 # [Cohere Embed v3 - Multilingual](#tab/cohere-embed-v3-multilingual)
 
-Cohere Embed Multilingual is a text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed Multilingual supports more than 100 languages and can be used to search within a language (for example, to search with a French query on French documents) and across languages (for example, to search with an English query on Chinese documents). Embed multilingual performs well on multilingual benchmarks such as Miracl. Embed Multilingual also has the following attributes:
+Cohere Embed Multilingual is a multimodal (text and image) representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed Multilingual supports more than 100 languages and can be used to search within a language (for example, to search with a French query on French documents) and across languages (for example, to search with an English query on Chinese documents). Embed multilingual performs well on multilingual benchmarks such as Miracl. Embed Multilingual also has the following attributes:
 
-* Embed Multilingual has 1,024 dimensions.
+* Embed Multilingual has 1,024 dimensions
 * Context window of the model is 512 tokens
+* Embed Multilingual accepts images as a base64 encoded data url
 
+Image embeddings consume a fixed number of tokens per image—1,000 tokens per image—which translates to a price of $0.0001 per image embedded. The size or resolution of the image doesn't affect the number of tokens consumed, provided the image is within the accepted dimensions, file size, and formats.
 
 ---
 
@@ -220,19 +225,23 @@ The Cohere family of models for embeddings includes the following models:
 
 # [Cohere Embed v3 - English](#tab/cohere-embed-v3-english)
 
-Cohere Embed English is a text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed English performs well on the HuggingFace (massive text embed) MTEB benchmark and on use-cases for various industries, such as Finance, Legal, and General-Purpose Corpora. Embed English also has the following attributes:
+Cohere Embed English is a multimodal (text and image) representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed English performs well on the HuggingFace (massive text embed) MTEB benchmark and on use-cases for various industries, such as Finance, Legal, and General-Purpose Corpora. Embed English also has the following attributes:
 
-* Embed English has 1,024 dimensions.
+* Embed English has 1,024 dimensions
 * Context window of the model is 512 tokens
+* Embed English accepts images as a base64 encoded data url
 
+Image embeddings consume a fixed number of tokens per image—1,000 tokens per image—which translates to a price of $0.0001 per image embedded. The size or resolution of the image doesn't affect the number of tokens consumed, provided the image is within the accepted dimensions, file size, and formats.
 
 # [Cohere Embed v3 - Multilingual](#tab/cohere-embed-v3-multilingual)
 
-Cohere Embed Multilingual is a text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed Multilingual supports more than 100 languages and can be used to search within a language (for example, to search with a French query on French documents) and across languages (for example, to search with an English query on Chinese documents). Embed multilingual performs well on multilingual benchmarks such as Miracl. Embed Multilingual also has the following attributes:
+Cohere Embed Multilingual is a multimodal (text and image) representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed Multilingual supports more than 100 languages and can be used to search within a language (for example, to search with a French query on French documents) and across languages (for example, to search with an English query on Chinese documents). Embed multilingual performs well on multilingual benchmarks such as Miracl. Embed Multilingual also has the following attributes:
 
-* Embed Multilingual has 1,024 dimensions.
+* Embed Multilingual has 1,024 dimensions
 * Context window of the model is 512 tokens
+* Embed Multilingual accepts images as a base64 encoded data url
 
+Image embeddings consume a fixed number of tokens per image—1,000 tokens per image—which translates to a price of $0.0001 per image embedded. The size or resolution of the image doesn't affect the number of tokens consumed, provided the image is within the accepted dimensions, file size, and formats.
 
 ---
 
@@ -411,19 +420,23 @@ The Cohere family of models for embeddings includes the following models:
 
 # [Cohere Embed v3 - English](#tab/cohere-embed-v3-english)
 
-Cohere Embed English is a text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed English performs well on the HuggingFace (massive text embed) MTEB benchmark and on use-cases for various industries, such as Finance, Legal, and General-Purpose Corpora. Embed English also has the following attributes:
+Cohere Embed English is a multimodal (text and image) representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed English performs well on the HuggingFace (massive text embed) MTEB benchmark and on use-cases for various industries, such as Finance, Legal, and General-Purpose Corpora. Embed English also has the following attributes:
 
-* Embed English has 1,024 dimensions.
+* Embed English has 1,024 dimensions
 * Context window of the model is 512 tokens
+* Embed English accepts images as a base64 encoded data url
 
+Image embeddings consume a fixed number of tokens per image—1,000 tokens per image—which translates to a price of $0.0001 per image embedded. The size or resolution of the image doesn't affect the number of tokens consumed, provided the image is within the accepted dimensions, file size, and formats.
 
 # [Cohere Embed v3 - Multilingual](#tab/cohere-embed-v3-multilingual)
 
-Cohere Embed Multilingual is a text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed Multilingual supports more than 100 languages and can be used to search within a language (for example, to search with a French query on French documents) and across languages (for example, to search with an English query on Chinese documents). Embed multilingual performs well on multilingual benchmarks such as Miracl. Embed Multilingual also has the following attributes:
+Cohere Embed Multilingual is a multimodal (text and image) representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed Multilingual supports more than 100 languages and can be used to search within a language (for example, to search with a French query on French documents) and across languages (for example, to search with an English query on Chinese documents). Embed multilingual performs well on multilingual benchmarks such as Miracl. Embed Multilingual also has the following attributes:
 
-* Embed Multilingual has 1,024 dimensions.
+* Embed Multilingual has 1,024 dimensions
 * Context window of the model is 512 tokens
+* Embed Multilingual accepts images as a base64 encoded data url
 
+Image embeddings consume a fixed number of tokens per image—1,000 tokens per image—which translates to a price of $0.0001 per image embedded. The size or resolution of the image doesn't affect the number of tokens consumed, provided the image is within the accepted dimensions, file size, and formats.
 
 ---
 
@@ -653,4 +666,4 @@ Quota is managed per deployment. Each deployment has a rate limit of 200,000 tok
 * [Deploy models as serverless APIs](deploy-models-serverless.md)
 * [Consume serverless API endpoints from a different Azure AI Studio project or hub](deploy-models-serverless-connect.md)
 * [Region availability for models in serverless API endpoints](deploy-models-serverless-availability.md)
-* [Plan and manage costs (marketplace)](costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace)
+* [Plan and manage costs (marketplace)](costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace)
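
Reviewer note: the added lines describe image input as a base64-encoded data URL and a flat 1,000 tokens ($0.0001) per image. The Python sketch below illustrates both points. It is not taken from the article or from any Cohere or Azure SDK. The helper names `to_data_url` and `estimate_image_cost` are hypothetical, the default `image/png` MIME type is an assumption, and the constants simply restate the figures in the diff.

```python
import base64

# Figures restated from the added doc text; not read from any API.
TOKENS_PER_IMAGE = 1_000
PRICE_PER_IMAGE_USD = 0.0001


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URL (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def estimate_image_cost(image_count: int) -> tuple[int, float]:
    """Return (tokens consumed, price in USD) for a batch of images.

    Assumes every image is within the accepted dimensions, file size,
    and formats, so the fixed per-image figures apply.
    """
    if image_count < 0:
        raise ValueError("image_count must be non-negative")
    return image_count * TOKENS_PER_IMAGE, image_count * PRICE_PER_IMAGE_USD


if __name__ == "__main__":
    # Placeholder bytes stand in for a real image file's contents.
    print(to_data_url(b"abc"))
    tokens, usd = estimate_image_cost(250)
    print(f"{tokens} tokens, about ${usd:.4f}")
```

Because the per-image token count is fixed, the estimate depends only on the number of images, not on their size or resolution.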
