articles/ai-foundry/concepts/models-inference-examples.md (3 additions, 3 deletions)
```diff
@@ -7,7 +7,7 @@ ms.author: mopeakande
 manager: scottpolly
 reviewer: santiagxf
 ms.reviewer: fasantia
-ms.date: 07/10/2025
+ms.date: 07/11/2025
 ms.service: azure-ai-foundry
 ms.topic: concept-article
 ms.custom:
```
```diff
@@ -57,7 +57,7 @@ The following table provides links to examples of how to use Cohere models.
 
 ### Cohere rerank
 
-To perform inferencing with Cohere rerank models, you're required to use Cohere's custom rerank APIs. For more information, see the table for [Other Foundry Models available for serverless API deployment](../foundry-models/concepts/models.md#other-foundry-models-available-for-serverless-api-deployment).
+To perform inferencing with Cohere rerank models, you're required to use Cohere's custom rerank APIs. For more information on the Cohere rerank model and its capabilities, see [Cohere rerank](../foundry-models/concepts/models.md#cohere-rerank).
 
 
 #### Pricing for Cohere rerank models
```
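The changed sentence points readers at Cohere's custom rerank APIs without showing a call. A minimal sketch follows, assuming a serverless deployment that exposes Cohere's rerank route; the endpoint URL, `/v1/rerank` path, API key, and model name are placeholders, not values from the article. The request body follows Cohere's rerank shape: a query, a list of documents, and an optional `top_n`.

```python
import requests

# Placeholder values: assumptions, not taken from the article.
ENDPOINT = "https://<your-deployment>.<region>.models.ai.azure.com/v1/rerank"
API_KEY = "<your-api-key>"

payload = {
    "model": "<your-rerank-deployment-name>",  # hypothetical deployment name
    "query": "What is the capital of France?",
    "documents": [
        "Paris is the capital and largest city of France.",
        "Mount Everest is the highest mountain on Earth.",
    ],
    "top_n": 1,  # return only the best-matching document
}

response = requests.post(
    ENDPOINT,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
)
response.raise_for_status()

# Each result carries the document's original index and a relevance score.
for result in response.json()["results"]:
    print(result["index"], result["relevance_score"])
```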
```diff
@@ -163,7 +163,7 @@ The following table provides links to examples of how to use Mistral models.
 
 Nixtla's TimeGEN-1 is a generative pre-trained forecasting and anomaly detection model for time series data. TimeGEN-1 can produce accurate forecasts for new time series without training, using only historical values and exogenous covariates as inputs.
 
-To perform inferencing, TimeGEN-1 requires you to use Nixtla's custom inference API.
+To perform inferencing, TimeGEN-1 requires you to use Nixtla's custom inference API. For more information on the TimeGEN-1 model and its capabilities, see [Nixtla](../foundry-models/concepts/models.md#nixtla).
```
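Both files in this PR touch the TimeGEN-1 guidance, and the second file links Nixtla's forecast client reference. A minimal sketch of inferencing through that client, assuming a deployed endpoint: the `base_url` and `api_key` values are placeholders, and the `ds`/`y` column names are the `nixtla` package defaults.

```python
import pandas as pd
from nixtla import NixtlaClient

# Placeholder endpoint and key: assumptions, not values from the article.
client = NixtlaClient(
    base_url="https://<your-timegen-1-endpoint>.<region>.inference.ml.azure.com",
    api_key="<your-api-key>",
)

# A tiny univariate series; the client defaults to columns "ds" (timestamp) and "y" (value).
df = pd.DataFrame({
    "ds": pd.date_range("2025-01-01", periods=24, freq="h"),
    "y": [float(i % 12) for i in range(24)],
})

# Forecast the next 6 steps from historical values alone; no training is needed.
forecast = client.forecast(df=df, h=6)
print(forecast.head())
```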
articles/ai-foundry/foundry-models/concepts/models.md

```diff
@@ … @@
+For more details on pricing for Cohere rerank models, see [Pricing for Cohere rerank models](../../concepts/models-inference-examples.md#pricing-for-cohere-rerank-models).
 
 See [the Cohere model collection in Azure AI Foundry portal](https://ai.azure.com/explore/models?&selectedCollection=cohere).
```
```diff
@@ -161,7 +162,7 @@ Meta Llama models and tools are a collection of pretrained and fine-tuned genera
-|[Llama-4-Scout-17B-16E-Instruct](https://aka.ms/aifoundry/landing/llama-4-scout-17b-16e-instruct)|[chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context)| - **Input:** text and image (128,000 tokens) <br /> - **Output:** text (8,192 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text | Foundry, Hub-based |
+|[Llama-4-Scout-17B-16E-Instruct](https://aka.ms/aifoundry/landing/llama-4-scout-17b-16e-instruct)| chat-completion | - **Input:** text and image (128,000 tokens) <br /> - **Output:** text (8,192 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text | Foundry, Hub-based |
 
 See [this model collection in Azure AI Foundry portal](https://ai.azure.com/explore/models?&selectedCollection=meta). There are also several Meta models available as [models sold directly by Azure](#meta-models-sold-directly-by-azure).
```
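The table's chat-completion entry implies the standard chat completions call. A minimal sketch with the `azure-ai-inference` package, assuming a Foundry endpoint and key (placeholders below) and assuming the deployment is named after the model:

```python
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

# Placeholder endpoint and key: assumptions, not values from the article.
client = ChatCompletionsClient(
    endpoint="https://<your-endpoint>.models.ai.azure.com",
    credential=AzureKeyCredential("<your-api-key>"),
)

response = client.complete(
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="Summarize what a rerank model does in one sentence."),
    ],
    model="Llama-4-Scout-17B-16E-Instruct",  # assumed deployment name
    max_tokens=256,
)
print(response.choices[0].message.content)
```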
```diff
@@ -174,8 +175,8 @@ Microsoft models include various model groups such as MAI models, Phi models, he
-|[Phi-4-reasoning](https://aka.ms/azureai/landing/Phi-4-reasoning)|[chat-completion with reasoning content](../model-inference/how-to/use-chat-reasoning.md?context=/azure/ai-foundry/context/context)| - **Input:** text (32,768 tokens) <br /> - **Output:** text (32,768 tokens) <br /> - **Languages:**`en` <br /> - **Tool calling:** No <br /> - **Response formats:** Text | Foundry, Hub-based |
-|[Phi-4-mini-reasoning](https://aka.ms/azureai/landing/Phi-4-mini-reasoning)|[chat-completion with reasoning content](../model-inference/how-to/use-chat-reasoning.md?context=/azure/ai-foundry/context/context)| - **Input:** text (128,000 tokens) <br /> - **Output:** text (128,000 tokens) <br /> - **Languages:**`en` <br /> - **Tool calling:** No <br /> - **Response formats:** Text | Foundry, Hub-based |
+|[Phi-4-reasoning](https://aka.ms/azureai/landing/Phi-4-reasoning)| chat-completion with reasoning content | - **Input:** text (32,768 tokens) <br /> - **Output:** text (32,768 tokens) <br /> - **Languages:**`en` <br /> - **Tool calling:** No <br /> - **Response formats:** Text | Foundry, Hub-based |
+|[Phi-4-mini-reasoning](https://aka.ms/azureai/landing/Phi-4-mini-reasoning)| chat-completion with reasoning content | - **Input:** text (128,000 tokens) <br /> - **Output:** text (128,000 tokens) <br /> - **Languages:**`en` <br /> - **Tool calling:** No <br /> - **Response formats:** Text | Foundry, Hub-based |
 
 See [the Microsoft model collection in Azure AI Foundry portal](https://ai.azure.com/explore/models?&selectedCollection=phi). There are also several Microsoft models available as [models sold directly by Azure](#microsoft-models-sold-directly-by-azure).
```
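For the reasoning-content variant in this table, a minimal sketch, assuming the model wraps its chain of thought in `<think>` tags ahead of the final answer, as the linked reasoning how-to describes; the endpoint, key, and deployment name are placeholders.

```python
import re

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential

# Placeholder endpoint and key: assumptions, not values from the article.
client = ChatCompletionsClient(
    endpoint="https://<your-endpoint>.models.ai.azure.com",
    credential=AzureKeyCredential("<your-api-key>"),
)

response = client.complete(
    messages=[UserMessage(content="How many prime numbers are less than 20?")],
    model="Phi-4-reasoning",  # assumed deployment name
)

content = response.choices[0].message.content
# Assumption: reasoning models return their chain of thought inside
# <think>...</think>, followed by the final answer.
match = re.match(r"<think>(.*?)</think>(.*)", content, re.DOTALL)
if match:
    print("Reasoning:", match.group(1).strip())
    print("Answer:", match.group(2).strip())
else:
    print(content)
```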
```diff
@@ -210,6 +211,9 @@ To perform inferencing, TimeGEN-1 requires you to use Nixtla's custom inference
 |[TimeGEN-1](https://ai.azure.com/explore/models/TimeGEN-1/version/1/registry/azureml-nixtla)| Forecasting | - **Input:** Time series data as JSON or dataframes (with support for multivariate input) <br /> - **Output:** Time series data as JSON <br /> - **Tool calling:** No <br /> - **Response formats:** JSON |[Forecast client to interact with Nixtla's API](https://nixtlaverse.nixtla.io/nixtla/docs/reference/nixtla_client.html#nixtlaclient-forecast)| Hub-based |
 
+For more details on pricing for Nixtla models, see [Nixtla](../../concepts/models-inference-examples.md#nixtla).
+
+
 ### NTT Data
 
 **tsuzumi** is an autoregressive language optimized transformer. The tuned versions use supervised fine-tuning (SFT). tsuzumi handles both Japanese and English language with high efficiency.
```