You have various options for deploying these models. For some models, you need to host them on your infrastructure, as in the case of deployment via managed compute. For other models, you can host them on Microsoft's servers, as in the case of deployment via serverless APIs. See [Available models for supported deployment options](../how-to/model-catalog-overview.md#available-models-for-supported-deployment-options) for a list of models in the catalog that are available for deployment via managed compute or serverless API.
When it comes to inferencing with these models, some, such as [Nixtla's TimeGEN-1](#nixtla) and [Cohere rerank](#cohere-rerank), require you to use custom APIs from the model providers, while others support inferencing through the [Azure AI model inference](../model-inference/overview.md).
You can find more details about individual models by reviewing their model cards in the [model catalog for Azure AI Foundry portal](https://ai.azure.com/explore/models).
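For models served through the Azure AI model inference, a chat-completions request can be sketched with only the standard library. The endpoint, key, and `api-version` value below are placeholder assumptions for illustration; substitute your own deployment's values (the `azure-ai-inference` SDK offers a higher-level client over the same API).

```python
import json
import urllib.request

# Placeholder endpoint, key, and api-version -- replace with your deployment's values.
endpoint = "https://<your-resource>.services.ai.azure.com/models"
api_key = "<your-key>"

# A chat-completions payload: a list of role/content messages plus the model name.
payload = {
    "model": "Phi-4",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What does a rerank model do?"},
    ],
}

request = urllib.request.Request(
    url=f"{endpoint}/chat/completions?api-version=2024-05-01-preview",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json", "api-key": api_key},
    method="POST",
)
# response = urllib.request.urlopen(request)  # uncomment once real credentials are set
```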
:::image type="content" source="../media/models-featured/models-catalog.gif" alt-text="An animation showing Azure AI studio model catalog section and the models available." lightbox="../media/models-featured/models-catalog.gif":::
## Cohere
The Cohere family of models includes various models optimized for different use cases, including rerank, chat completions, and embeddings.
### Cohere command and embed
The following table lists the Cohere models that support inferencing via the Azure AI model inference.
| Model | Type | Capabilities |
| ------ | ---- | --- |
### Cohere rerank
The following table lists the Cohere rerank models. To perform inferencing with these rerank models, you must use Cohere's custom rerank APIs, which are listed in the table.
| Model | Type | Inference API |
| ------ | ---- | --- |
|[Cohere-rerank-v3.5](https://ai.azure.com/explore/models/Cohere-rerank-v3.5/version/1/registry/azureml-cohere)| rerank <br> text classification |[Cohere's v2/rerank API](https://docs.cohere.com/v2/reference/rerank)|
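As a rough sketch, a request body for Cohere's v2/rerank API can be built as follows. The query and documents are illustrative, and the exact field set is defined in Cohere's rerank API reference linked in the table above.

```python
import json

# Sketch of a v2/rerank request body: a query, candidate documents, and
# how many of the top-ranked documents to return.
payload = {
    "model": "Cohere-rerank-v3.5",
    "query": "What is the capital of France?",
    "documents": [
        "Paris is the capital and largest city of France.",
        "Berlin is the capital of Germany.",
        "The Eiffel Tower is located in Paris.",
    ],
    "top_n": 2,  # return only the two most relevant documents
}
body = json.dumps(payload).encode("utf-8")
# POST `body` to your deployment's /v2/rerank route with your API key in the
# Authorization header; the response ranks the documents by relevance score.
```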
### DeepSeek
The DeepSeek family of models includes DeepSeek-R1, which uses a step-by-step training process and excels at reasoning tasks such as language, scientific reasoning, and coding, and DeepSeek-V3, a Mixture-of-Experts (MoE) language model.
| Model | Type | Capabilities |
| ------ | ---- | --- |
Phi is a family of lightweight, state-of-the-art open models.
|[Phi-3.5-mini-instruct](https://ai.azure.com/explore/models/Phi-3.5-mini-instruct/version/6/registry/azureml)|[chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context)| - **Input:** text (131,072 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
|[Phi-4](https://ai.azure.com/explore/models/Phi-4/version/2/registry/azureml)|[chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context)| - **Input:** text (16,384 tokens) <br /> - **Output:** (16,384 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
|[Phi-4-mini-instruct](https://ai.azure.com/explore/models/Phi-4-mini-instruct/version/1/registry/azureml)|[chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context)| - **Input:** text (131,072 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
|[Phi-4-multimodal-instruct](https://ai.azure.com/explore/models/Phi-4-multimodal-instruct/version/1/registry/azureml)|[chat-completion (with image and audio content)](../model-inference/how-to/use-chat-multi-modal.md?context=/azure/ai-foundry/context/context)| - **Input:** text, images, and audio (131,072 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
See [this model collection in Azure AI Foundry portal](https://ai.azure.com/explore/models?&selectedCollection=phi).
Nixtla's TimeGEN-1 is a generative pre-trained forecasting and anomaly detection model for time series data. TimeGEN-1 can produce accurate forecasts for new time series without training, using only historical values and exogenous covariates as inputs.
To perform inferencing, TimeGEN-1 requires you to use Nixtla's custom inference API.
| Model | Type | Capabilities | Inference API|
| ------ | ---- | --- | ------------ |
|[TimeGEN-1](https://ai.azure.com/explore/models/TimeGEN-1/version/1/registry/azureml-nixtla)| Forecasting | - **Input:** Time series data as JSON or dataframes (with support for multivariate input) <br /> - **Output:** Time series data as JSON <br /> - **Tool calling:** No <br /> - **Response formats:** JSON |[Forecast client to interact with Nixtla's API](https://nixtlaverse.nixtla.io/nixtla/docs/reference/nixtla_client.html#nixtlaclient-forecast)|
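A minimal sketch of preparing JSON time-series input for TimeGEN-1. The field names (`timestamp`, `value`) and the top-level shape here are illustrative assumptions; see Nixtla's forecast client reference linked in the table for the exact schema your deployment expects.

```python
import json

# Ten days of historical values for a single series -- illustrative data only.
history = {
    "timestamp": [f"2024-01-{day:02d}" for day in range(1, 11)],
    "value": [112.0, 118.0, 132.0, 129.0, 121.0, 135.0, 148.0, 148.0, 136.0, 119.0],
}
body = json.dumps({"series": history, "h": 5})  # ask for a 5-step-ahead forecast
# Send `body` to your TimeGEN-1 endpoint with your API key; the response
# returns the forecast as JSON time-series data.
```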