@@ -70,8 +70,8 @@ The following table lists the Cohere models that you can inference via the Azur
 | ------ | ---- | --- |
 | [Cohere-command-r-plus-08-2024](https://ai.azure.com/explore/models/Cohere-command-r-plus-08-2024/version/1/registry/azureml-cohere) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
 | [Cohere-command-r-08-2024](https://ai.azure.com/explore/models/Cohere-command-r-08-2024/version/1/registry/azureml-cohere) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
-| [Cohere-command-r-plus](https://ai.azure.com/explore/models/Cohere-command-r-plus/version/1/registry/azureml-cohere) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
-| [Cohere-command-r](https://ai.azure.com/explore/models/Cohere-command-r/version/1/registry/azureml-cohere) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
+| [Cohere-command-r-plus](https://ai.azure.com/explore/models/Cohere-command-r-plus/version/1/registry/azureml-cohere) <br> (deprecated) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
+| [Cohere-command-r](https://ai.azure.com/explore/models/Cohere-command-r/version/1/registry/azureml-cohere) <br> (deprecated) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
 | [Cohere-embed-v3-english](https://ai.azure.com/explore/models/Cohere-embed-v3-english/version/1/registry/azureml-cohere) | [embeddings](../model-inference/how-to/use-embeddings.md?context=/azure/ai-foundry/context/context) <br /> [image-embeddings](../model-inference/how-to/use-image-embeddings.md?context=/azure/ai-foundry/context/context) | - **Input:** text (512 tokens) <br /> - **Output:** Vector (1,024 dim.) |
 | [Cohere-embed-v3-multilingual](https://ai.azure.com/explore/models/Cohere-embed-v3-multilingual/version/1/registry/azureml-cohere) | [embeddings](../model-inference/how-to/use-embeddings.md?context=/azure/ai-foundry/context/context) <br /> [image-embeddings](../model-inference/how-to/use-image-embeddings.md?context=/azure/ai-foundry/context/context) | - **Input:** text (512 tokens) <br /> - **Output:** Vector (1,024 dim.) |
 
@@ -108,8 +108,8 @@ The following table lists the Cohere rerank models. To perform inferencing with
 | Model | Type | Inference API |
 | ------ | ---- | --- |
 | [Cohere-rerank-v3.5](https://ai.azure.com/explore/models/Cohere-rerank-v3.5/version/1/registry/azureml-cohere) | rerank <br> text classification | [Cohere's v2/rerank API](https://docs.cohere.com/v2/reference/rerank) |
-| [Cohere-rerank-v3-english](https://ai.azure.com/explore/models/Cohere-rerank-v3-english/version/1/registry/azureml-cohere) | rerank <br> text classification | [Cohere's v2/rerank API](https://docs.cohere.com/v2/reference/rerank) <br> [Cohere's v1/rerank API](https://docs.cohere.com/v1/reference/rerank) |
-| [Cohere-rerank-v3-multilingual](https://ai.azure.com/explore/models/Cohere-rerank-v3-multilingual/version/1/registry/azureml-cohere) | rerank <br> text classification | [Cohere's v2/rerank API](https://docs.cohere.com/v2/reference/rerank) <br> [Cohere's v1/rerank API](https://docs.cohere.com/v1/reference/rerank) |
+| [Cohere-rerank-v3-english](https://ai.azure.com/explore/models/Cohere-rerank-v3-english/version/1/registry/azureml-cohere) <br> (deprecated) | rerank <br> text classification | [Cohere's v2/rerank API](https://docs.cohere.com/v2/reference/rerank) <br> [Cohere's v1/rerank API](https://docs.cohere.com/v1/reference/rerank) |
+| [Cohere-rerank-v3-multilingual](https://ai.azure.com/explore/models/Cohere-rerank-v3-multilingual/version/1/registry/azureml-cohere) <br> (deprecated) | rerank <br> text classification | [Cohere's v2/rerank API](https://docs.cohere.com/v2/reference/rerank) <br> [Cohere's v1/rerank API](https://docs.cohere.com/v1/reference/rerank) |
 
 
 #### Pricing for Cohere rerank models
@@ -178,10 +178,10 @@ Meta Llama models and tools are a collection of pretrained and fine-tuned genera
 | [Llama-3.2-90B-Vision-Instruct](https://ai.azure.com/explore/models/Llama-3.2-90B-Vision-Instruct/version/1/registry/azureml-meta) | [chat-completion (with images)](../model-inference/how-to/use-chat-multi-modal.md?context=/azure/ai-foundry/context/context) | - **Input:** text and image (128,000 tokens) <br /> - **Output:** text (8,192 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
 | [Llama-3.2-11B-Vision-Instruct](https://ai.azure.com/explore/models/Llama-3.2-11B-Vision-Instruct/version/1/registry/azureml-meta) | [chat-completion (with images)](../model-inference/how-to/use-chat-multi-modal.md?context=/azure/ai-foundry/context/context) | - **Input:** text and image (128,000 tokens) <br /> - **Output:** text (8,192 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
 | [Meta-Llama-3.1-8B-Instruct](https://ai.azure.com/explore/models/Meta-Llama-3.1-8B-Instruct/version/4/registry/azureml-meta) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (8,192 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
-| [Meta-Llama-3.1-70B-Instruct](https://ai.azure.com/explore/models/Meta-Llama-3.1-70B-Instruct/version/4/registry/azureml-meta) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (8,192 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
 | [Meta-Llama-3.1-405B-Instruct](https://ai.azure.com/explore/models/Meta-Llama-3.1-405B-Instruct/version/1/registry/azureml-meta) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (8,192 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
-| [Meta-Llama-3-8B-Instruct](https://ai.azure.com/explore/models/Meta-Llama-3-8B-Instruct/version/9/registry/azureml-meta) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (8,192 tokens) <br /> - **Output:** text (8,192 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
-| [Meta-Llama-3-70B-Instruct](https://ai.azure.com/explore/models/Meta-Llama-3-70B-Instruct/version/9/registry/azureml-meta) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (8,192 tokens) <br /> - **Output:** text (8,192 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
+| [Meta-Llama-3.1-70B-Instruct](https://ai.azure.com/explore/models/Meta-Llama-3.1-70B-Instruct/version/4/registry/azureml-meta) (deprecated)| [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (8,192 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
+| [Meta-Llama-3-8B-Instruct](https://ai.azure.com/explore/models/Meta-Llama-3-8B-Instruct/version/9/registry/azureml-meta) (deprecated)| [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (8,192 tokens) <br /> - **Output:** text (8,192 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
+| [Meta-Llama-3-70B-Instruct](https://ai.azure.com/explore/models/Meta-Llama-3-70B-Instruct/version/9/registry/azureml-meta) (deprecated)| [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (8,192 tokens) <br /> - **Output:** text (8,192 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
 
 
 See [this model collection in Azure AI Foundry portal](https://ai.azure.com/explore/models?&selectedCollection=meta).
@@ -214,7 +214,7 @@ Phi is a family of lightweight, state-of-the-art open models. These models were
 | [Phi-3.5-MoE-instruct](https://ai.azure.com/explore/models/Phi-3.5-MoE-instruct/version/5/registry/azureml) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
 | [Phi-3.5-vision-instruct](https://ai.azure.com/explore/models/Phi-3.5-vision-instruct/version/2/registry/azureml) | [chat-completion (with images)](../model-inference/how-to/use-chat-multi-modal.md?context=/azure/ai-foundry/context/context) | - **Input:** text and image (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
 | [Phi-3-mini-128k-instruct](https://ai.azure.com/explore/models/Phi-3-mini-128k-instruct/version/12/registry/azureml) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
-| [Phi-3-mini-4k-instruct](https://ai.azure.com/explore/models/Phi-3-mini-4k-instruct/version/14/registry/azureml) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (4,096 tokens) <br /> - **Output:** tetxt (4,096 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
+| [Phi-3-mini-4k-instruct](https://ai.azure.com/explore/models/Phi-3-mini-4k-instruct/version/14/registry/azureml) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (4,096 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
 | [Phi-3-small-128k-instruct](https://ai.azure.com/explore/models/Phi-3-small-128k-instruct/version/4/registry/azureml) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
 | [Phi-3-small-8k-instruct](https://ai.azure.com/explore/models/Phi-3-small-8k-instruct/version/5/registry/azureml) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
 | [Phi-3-medium-128k-instruct](https://ai.azure.com/explore/models/Phi-3-medium-128k-instruct/version/6/registry/azureml) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |