Commit d7f9f54

split up pricing info
1 parent fe644a2 commit d7f9f54

1 file changed: +11 -4 lines changed

articles/ai-studio/how-to/deploy-models-cohere-rerank.md

Lines changed: 11 additions & 4 deletions
@@ -30,10 +30,6 @@ Cohere offers rerank models in [Azure AI Foundry](https://ai.azure.com). These m
 
 You can browse the Cohere family of models in the [Model Catalog](model-catalog.md) by filtering on the Cohere collection.
 
-#### Pricing for Cohere Rerank models
-
-*Queries*, not to be confused with a user's query, is a pricing meter that refers to the cost associated with the tokens used as input for inference of a Cohere Rerank model. Cohere counts a single search unit as a query with up to 100 documents to be ranked. Documents longer than 4096 tokens when including the length of the search query are split up into multiple chunks, where each chunk counts as a single document.
-
 # [Cohere Rerank v3.5](#tab/cohere-rerank-3-5)
 
 Cohere Rerank 3.5 provides a significant boost to the relevancy of search results. This AI model, also known as a cross-encoder, precisely sorts lists of documents according to their semantic similarity to a provided query. This action allows information retrieval systems to go beyond keyword search, and also outperform traditional embedding models, surfacing the most contextually relevant data within end-user applications.
@@ -43,6 +39,10 @@ Businesses use Cohere Rerank 3.5 to improve their enterprise search and retrieva
 - Context window of the model is 4,096 tokens
 - Max query length is 4,096 tokens
 
+#### Pricing for Cohere Rerank v3.5
+
+*Queries*, not to be confused with a user's query, is a pricing meter that refers to the cost associated with the tokens used as input for inference of a Cohere Rerank model. Cohere counts a single search unit as a query with up to 100 documents to be ranked. Documents longer than 500 tokens when including the length of the search query are split up into multiple chunks, where each chunk counts as a single document.
+
 # [Cohere Rerank v3 - English](#tab/cohere-rerank-3-en)
 
 Cohere Rerank English is a reranking model used for semantic search and retrieval-augmented generation (RAG). Rerank enables you to significantly improve search quality by augmenting traditional keyword-based search systems with a semantic-based reranking system that can contextualize the meaning of a user's query beyond keyword relevance. Cohere's Rerank delivers higher quality results than embedding-based search, lexical search, and even hybrid search, and it requires only adding a single line of code into your application.
@@ -56,6 +56,10 @@ Rerank supports JSON objects as documents where users can specify, at query time
 
 Rerank English works well for code retrieval, semi-structured data retrieval, and long context.
 
+#### Pricing for Cohere Rerank v3 English
+
+*Queries*, not to be confused with a user's query, is a pricing meter that refers to the cost associated with the tokens used as input for inference of a Cohere Rerank model. Cohere counts a single search unit as a query with up to 100 documents to be ranked. Documents longer than 4096 tokens when including the length of the search query are split up into multiple chunks, where each chunk counts as a single document.
+
 # [Cohere Rerank v3 - Multilingual](#tab/cohere-rerank-3-multi)
 
 Cohere Rerank Multilingual is a reranking model used for semantic search and retrieval-augmented generation (RAG). Rerank Multilingual supports more than 100 languages and can be used to search within a language (for example, to search with a French query on French documents) and across languages (for example, to search with an English query on Chinese documents). Rerank enables you to significantly improve search quality by augmenting traditional keyword-based search systems with a semantic-based reranking system that can contextualize the meaning of a user's query beyond keyword relevance. Cohere's Rerank delivers higher quality results than embedding-based search, lexical search, and even hybrid search, and it requires only adding a single line of code into your application.
@@ -69,6 +73,9 @@ Rerank supports JSON objects as documents where users can specify, at query time
 
 Rerank multilingual performs well on multilingual benchmarks such as Miracl.
 
+#### Pricing for Cohere Rerank v3 Multilingual
+
+*Queries*, not to be confused with a user's query, is a pricing meter that refers to the cost associated with the tokens used as input for inference of a Cohere Rerank model. Cohere counts a single search unit as a query with up to 100 documents to be ranked. Documents longer than 4096 tokens when including the length of the search query are split up into multiple chunks, where each chunk counts as a single document.
 ---
 
 ## Deploy Cohere Rerank models as serverless APIs
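
Note on the pricing paragraphs in this commit: each one describes the same arithmetic (documents are split into chunks once they exceed a token limit, and every 100 chunks count as one search unit), differing only in the limit (500 tokens for v3.5, 4096 for the v3 models). The Python sketch below is one possible reading of that text, not Cohere's or Azure's billing implementation. The function name `estimate_search_units`, its parameters, and in particular the choice to subtract the query length from the per-chunk limit (an interpretation of "when including the length of the search query") are all assumptions made for illustration.

```python
import math


def estimate_search_units(doc_token_counts, query_tokens, chunk_limit, docs_per_unit=100):
    """Roughly estimate search units for one rerank request (illustrative only).

    chunk_limit is the per-document token limit quoted in the pricing text
    (500 for v3.5, 4096 for v3). Assumption: the query's tokens count against
    that limit for every chunk.
    """
    room = chunk_limit - query_tokens  # assumed tokens left for document text
    if room <= 0:
        raise ValueError("query length leaves no room for document tokens")

    # Each document contributes at least one chunk; longer ones are split.
    chunks = sum(max(1, math.ceil(n / room)) for n in doc_token_counts)

    # One search unit covers a query with up to `docs_per_unit` chunks.
    return max(1, math.ceil(chunks / docs_per_unit))


if __name__ == "__main__":
    docs = [1000] * 100  # 100 documents of 1000 tokens each
    print(estimate_search_units(docs, query_tokens=50, chunk_limit=500))   # 3
    print(estimate_search_units(docs, query_tokens=50, chunk_limit=4096))  # 1
```

Under these assumptions, the same 100 documents cost three search units at the 500-token limit and one at the 4096-token limit, which is why the per-model limit in each pricing paragraph matters.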
