Commit 24cb53e

Revert "Cohere Embed - model updates"
1 parent dcba59d commit 24cb53e

2 files changed: 27 additions and 52 deletions

articles/ai-studio/how-to/deploy-models-cohere-embed.md

Lines changed: 14 additions & 27 deletions
@@ -5,7 +5,7 @@ description: Learn how to use Cohere Embed V3 models with Azure AI Studio.
55
ms.service: azure-ai-studio
66
manager: scottpolly
77
ms.topic: how-to
8-
ms.date: 10/22/2024
8+
ms.date: 08/08/2024
99
ms.reviewer: shubhiraj
1010
reviewer: shubhirajMsft
1111
ms.author: mopeakande
@@ -31,24 +31,19 @@ The Cohere family of models for embeddings includes the following models:
3131

3232
# [Cohere Embed v3 - English](#tab/cohere-embed-v3-english)
3333

34-
Cohere Embed English is a multimodal (text and image) representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed English performs well on the HuggingFace (massive text embed) MTEB benchmark and on use-cases for various industries, such as Finance, Legal, and General-Purpose Corpora. Embed English also has the following attributes:
34+
Cohere Embed English is a text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed English performs well on the HuggingFace (massive text embed) MTEB benchmark and on use-cases for various industries, such as Finance, Legal, and General-Purpose Corpora. Embed English also has the following attributes:
3535

36-
* Embed English has 1,024 dimensions
36+
* Embed English has 1,024 dimensions.
3737
* Context window of the model is 512 tokens
38-
* Embed English accepts images as a base64 encoded data url
3938

40-
Image embeddings consume a fixed number of tokens per image—1,000 tokens per image—which translates to a price of $0.0001 per image embedded. The size or resolution of the image doesn't affect the number of tokens consumed, provided the image is within the accepted dimensions, file size, and formats.
41-
4239

4340
# [Cohere Embed v3 - Multilingual](#tab/cohere-embed-v3-multilingual)
4441

45-
Cohere Embed Multilingual is a multimodal (text and image) representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed Multilingual supports more than 100 languages and can be used to search within a language (for example, to search with a French query on French documents) and across languages (for example, to search with an English query on Chinese documents). Embed multilingual performs well on multilingual benchmarks such as Miracl. Embed Multilingual also has the following attributes:
42+
Cohere Embed Multilingual is a text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed Multilingual supports more than 100 languages and can be used to search within a language (for example, to search with a French query on French documents) and across languages (for example, to search with an English query on Chinese documents). Embed multilingual performs well on multilingual benchmarks such as Miracl. Embed Multilingual also has the following attributes:
4643

47-
* Embed Multilingual has 1,024 dimensions
44+
* Embed Multilingual has 1,024 dimensions.
4845
* Context window of the model is 512 tokens
49-
* Embed Multilingual accepts images as a base64 encoded data url
5046

51-
Image embeddings consume a fixed number of tokens per image—1,000 tokens per image—which translates to a price of $0.0001 per image embedded. The size or resolution of the image doesn't affect the number of tokens consumed, provided the image is within the accepted dimensions, file size, and formats.
5247

5348
---
5449

@@ -225,23 +220,19 @@ The Cohere family of models for embeddings includes the following models:
225220

226221
# [Cohere Embed v3 - English](#tab/cohere-embed-v3-english)
227222

228-
Cohere Embed English is a multimodal (text and image) representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed English performs well on the HuggingFace (massive text embed) MTEB benchmark and on use-cases for various industries, such as Finance, Legal, and General-Purpose Corpora. Embed English also has the following attributes:
223+
Cohere Embed English is a text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed English performs well on the HuggingFace (massive text embed) MTEB benchmark and on use-cases for various industries, such as Finance, Legal, and General-Purpose Corpora. Embed English also has the following attributes:
229224

230-
* Embed English has 1,024 dimensions
225+
* Embed English has 1,024 dimensions.
231226
* Context window of the model is 512 tokens
232-
* Embed English accepts images as a base64 encoded data url
233227

234-
Image embeddings consume a fixed number of tokens per image—1,000 tokens per image—which translates to a price of $0.0001 per image embedded. The size or resolution of the image doesn't affect the number of tokens consumed, provided the image is within the accepted dimensions, file size, and formats.
235228

236229
# [Cohere Embed v3 - Multilingual](#tab/cohere-embed-v3-multilingual)
237230

238-
Cohere Embed Multilingual is a multimodal (text and image) representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed Multilingual supports more than 100 languages and can be used to search within a language (for example, to search with a French query on French documents) and across languages (for example, to search with an English query on Chinese documents). Embed multilingual performs well on multilingual benchmarks such as Miracl. Embed Multilingual also has the following attributes:
231+
Cohere Embed Multilingual is a text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed Multilingual supports more than 100 languages and can be used to search within a language (for example, to search with a French query on French documents) and across languages (for example, to search with an English query on Chinese documents). Embed multilingual performs well on multilingual benchmarks such as Miracl. Embed Multilingual also has the following attributes:
239232

240-
* Embed Multilingual has 1,024 dimensions
233+
* Embed Multilingual has 1,024 dimensions.
241234
* Context window of the model is 512 tokens
242-
* Embed Multilingual accepts images as a base64 encoded data url
243235

244-
Image embeddings consume a fixed number of tokens per image—1,000 tokens per image—which translates to a price of $0.0001 per image embedded. The size or resolution of the image doesn't affect the number of tokens consumed, provided the image is within the accepted dimensions, file size, and formats.
245236

246237
---
247238

@@ -420,23 +411,19 @@ The Cohere family of models for embeddings includes the following models:
420411

421412
# [Cohere Embed v3 - English](#tab/cohere-embed-v3-english)
422413

423-
Cohere Embed English is a multimodal (text and image) representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed English performs well on the HuggingFace (massive text embed) MTEB benchmark and on use-cases for various industries, such as Finance, Legal, and General-Purpose Corpora. Embed English also has the following attributes:
414+
Cohere Embed English is a text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed English performs well on the HuggingFace (massive text embed) MTEB benchmark and on use-cases for various industries, such as Finance, Legal, and General-Purpose Corpora. Embed English also has the following attributes:
424415

425-
* Embed English has 1,024 dimensions
416+
* Embed English has 1,024 dimensions.
426417
* Context window of the model is 512 tokens
427-
* Embed English accepts images as a base64 encoded data url
428418

429-
Image embeddings consume a fixed number of tokens per image—1,000 tokens per image—which translates to a price of $0.0001 per image embedded. The size or resolution of the image doesn't affect the number of tokens consumed, provided the image is within the accepted dimensions, file size, and formats.
430419

431420
# [Cohere Embed v3 - Multilingual](#tab/cohere-embed-v3-multilingual)
432421

433-
Cohere Embed Multilingual is a multimodal (text and image) representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed Multilingual supports more than 100 languages and can be used to search within a language (for example, to search with a French query on French documents) and across languages (for example, to search with an English query on Chinese documents). Embed multilingual performs well on multilingual benchmarks such as Miracl. Embed Multilingual also has the following attributes:
422+
Cohere Embed Multilingual is a text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed Multilingual supports more than 100 languages and can be used to search within a language (for example, to search with a French query on French documents) and across languages (for example, to search with an English query on Chinese documents). Embed multilingual performs well on multilingual benchmarks such as Miracl. Embed Multilingual also has the following attributes:
434423

435-
* Embed Multilingual has 1,024 dimensions
424+
* Embed Multilingual has 1,024 dimensions.
436425
* Context window of the model is 512 tokens
437-
* Embed Multilingual accepts images as a base64 encoded data url
438426

439-
Image embeddings consume a fixed number of tokens per image—1,000 tokens per image—which translates to a price of $0.0001 per image embedded. The size or resolution of the image doesn't affect the number of tokens consumed, provided the image is within the accepted dimensions, file size, and formats.
440427

441428
---
442429

@@ -666,4 +653,4 @@ Quota is managed per deployment. Each deployment has a rate limit of 200,000 tok
666653
* [Deploy models as serverless APIs](deploy-models-serverless.md)
667654
* [Consume serverless API endpoints from a different Azure AI Studio project or hub](deploy-models-serverless-connect.md)
668655
* [Region availability for models in serverless API endpoints](deploy-models-serverless-availability.md)
669-
* [Plan and manage costs (marketplace)](costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace)
656+
* [Plan and manage costs (marketplace)](costs-plan-manage.md#monitor-costs-for-models-offered-through-the-azure-marketplace)

articles/machine-learning/how-to-deploy-models-cohere-embed.md

Lines changed: 13 additions & 25 deletions
@@ -6,7 +6,7 @@ manager: scottpolly
66
ms.service: azure-machine-learning
77
ms.subservice: inferencing
88
ms.topic: how-to
9-
ms.date: 10/22/2024
9+
ms.date: 09/24/2024
1010
ms.reviewer: shubhiraj
1111
reviewer: shubhirajMsft
1212
ms.author: mopeakande
@@ -31,23 +31,19 @@ The Cohere family of models for embeddings includes the following models:
3131

3232
# [Cohere Embed v3 - English](#tab/cohere-embed-v3-english)
3333

34-
Cohere Embed English is a multimodal (text and image) representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed English performs well on the HuggingFace (massive text embed) MTEB benchmark and on use-cases for various industries, such as Finance, Legal, and General-Purpose Corpora. Embed English also has the following attributes:
34+
Cohere Embed English is a text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed English performs well on the HuggingFace (massive text embed) MTEB benchmark and on use-cases for various industries, such as Finance, Legal, and General-Purpose Corpora. Embed English also has the following attributes:
3535

36-
* Embed English has 1,024 dimensions
36+
* Embed English has 1,024 dimensions.
3737
* Context window of the model is 512 tokens
38-
* Embed English accepts images as a base64 encoded data url
3938

40-
Image embeddings consume a fixed number of tokens per image—1,000 tokens per image—which translates to a price of $0.0001 per image embedded. The size or resolution of the image doesn't affect the number of tokens consumed, provided the image is within the accepted dimensions, file size, and formats.
4139

4240
# [Cohere Embed v3 - Multilingual](#tab/cohere-embed-v3-multilingual)
4341

44-
Cohere Embed Multilingual is a multimodal (text and image) representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed Multilingual supports more than 100 languages and can be used to search within a language (for example, to search with a French query on French documents) and across languages (for example, to search with an English query on Chinese documents). Embed multilingual performs well on multilingual benchmarks such as Miracl. Embed Multilingual also has the following attributes:
42+
Cohere Embed Multilingual is a text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed Multilingual supports more than 100 languages and can be used to search within a language (for example, to search with a French query on French documents) and across languages (for example, to search with an English query on Chinese documents). Embed multilingual performs well on multilingual benchmarks such as Miracl. Embed Multilingual also has the following attributes:
4543

46-
* Embed Multilingual has 1,024 dimensions
44+
* Embed Multilingual has 1,024 dimensions.
4745
* Context window of the model is 512 tokens
48-
* Embed Multilingual accepts images as a base64 encoded data url
4946

50-
Image embeddings consume a fixed number of tokens per image—1,000 tokens per image—which translates to a price of $0.0001 per image embedded. The size or resolution of the image doesn't affect the number of tokens consumed, provided the image is within the accepted dimensions, file size, and formats.
5147

5248
---
5349

@@ -224,23 +220,19 @@ The Cohere family of models for embeddings includes the following models:
224220

225221
# [Cohere Embed v3 - English](#tab/cohere-embed-v3-english)
226222

227-
Cohere Embed English is a multimodal (text and image) representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed English performs well on the HuggingFace (massive text embed) MTEB benchmark and on use-cases for various industries, such as Finance, Legal, and General-Purpose Corpora. Embed English also has the following attributes:
223+
Cohere Embed English is a text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed English performs well on the HuggingFace (massive text embed) MTEB benchmark and on use-cases for various industries, such as Finance, Legal, and General-Purpose Corpora. Embed English also has the following attributes:
228224

229-
* Embed English has 1,024 dimensions
225+
* Embed English has 1,024 dimensions.
230226
* Context window of the model is 512 tokens
231-
* Embed English accepts images as a base64 encoded data url
232227

233-
Image embeddings consume a fixed number of tokens per image—1,000 tokens per image—which translates to a price of $0.0001 per image embedded. The size or resolution of the image doesn't affect the number of tokens consumed, provided the image is within the accepted dimensions, file size, and formats.
234228

235229
# [Cohere Embed v3 - Multilingual](#tab/cohere-embed-v3-multilingual)
236230

237-
Cohere Embed Multilingual is a multimodal (text and image) representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed Multilingual supports more than 100 languages and can be used to search within a language (for example, to search with a French query on French documents) and across languages (for example, to search with an English query on Chinese documents). Embed multilingual performs well on multilingual benchmarks such as Miracl. Embed Multilingual also has the following attributes:
231+
Cohere Embed Multilingual is a text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed Multilingual supports more than 100 languages and can be used to search within a language (for example, to search with a French query on French documents) and across languages (for example, to search with an English query on Chinese documents). Embed multilingual performs well on multilingual benchmarks such as Miracl. Embed Multilingual also has the following attributes:
238232

239-
* Embed Multilingual has 1,024 dimensions
233+
* Embed Multilingual has 1,024 dimensions.
240234
* Context window of the model is 512 tokens
241-
* Embed Multilingual accepts images as a base64 encoded data url
242235

243-
Image embeddings consume a fixed number of tokens per image—1,000 tokens per image—which translates to a price of $0.0001 per image embedded. The size or resolution of the image doesn't affect the number of tokens consumed, provided the image is within the accepted dimensions, file size, and formats.
244236

245237
---
246238

@@ -419,23 +411,19 @@ The Cohere family of models for embeddings includes the following models:
419411

420412
# [Cohere Embed v3 - English](#tab/cohere-embed-v3-english)
421413

422-
Cohere Embed English is a multimodal (text and image) representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed English performs well on the HuggingFace (massive text embed) MTEB benchmark and on use-cases for various industries, such as Finance, Legal, and General-Purpose Corpora. Embed English also has the following attributes:
414+
Cohere Embed English is a text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed English performs well on the HuggingFace (massive text embed) MTEB benchmark and on use-cases for various industries, such as Finance, Legal, and General-Purpose Corpora. Embed English also has the following attributes:
423415

424-
* Embed English has 1,024 dimensions
416+
* Embed English has 1,024 dimensions.
425417
* Context window of the model is 512 tokens
426-
* Embed English accepts images as a base64 encoded data url
427418

428-
Image embeddings consume a fixed number of tokens per image—1,000 tokens per image—which translates to a price of $0.0001 per image embedded. The size or resolution of the image doesn't affect the number of tokens consumed, provided the image is within the accepted dimensions, file size, and formats.
429419

430420
# [Cohere Embed v3 - Multilingual](#tab/cohere-embed-v3-multilingual)
431421

432-
Cohere Embed Multilingual is a multimodal (text and image) representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed Multilingual supports more than 100 languages and can be used to search within a language (for example, to search with a French query on French documents) and across languages (for example, to search with an English query on Chinese documents). Embed multilingual performs well on multilingual benchmarks such as Miracl. Embed Multilingual also has the following attributes:
422+
Cohere Embed Multilingual is a text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed Multilingual supports more than 100 languages and can be used to search within a language (for example, to search with a French query on French documents) and across languages (for example, to search with an English query on Chinese documents). Embed multilingual performs well on multilingual benchmarks such as Miracl. Embed Multilingual also has the following attributes:
433423

434-
* Embed Multilingual has 1,024 dimensions
424+
* Embed Multilingual has 1,024 dimensions.
435425
* Context window of the model is 512 tokens
436-
* Embed Multilingual accepts images as a base64 encoded data url
437426

438-
Image embeddings consume a fixed number of tokens per image—1,000 tokens per image—which translates to a price of $0.0001 per image embedded. The size or resolution of the image doesn't affect the number of tokens consumed, provided the image is within the accepted dimensions, file size, and formats.
439427

440428
---
441429
