Commit c2997d3

new articles
1 parent 9229a3c commit c2997d3

1 file changed: +19 -19 lines changed
articles/ai-foundry/model-inference/concepts/models.md

Lines changed: 19 additions & 19 deletions
@@ -1,7 +1,7 @@
 ---
 title: Models available in Azure AI model inference
 titleSuffix: Azure AI Foundry
-description: Explore the models available in the Azure AI model inference and their capabilities
+description: Explore the models available via the Azure AI model inference and their capabilities.
 manager: scottpolly
 author: msakande
 reviewer: santiagxf
@@ -26,16 +26,16 @@ Learn more about specific deployment capabilities for Azure OpenAI at [Azure Ope
 > [!TIP]
 > The Azure AI model catalog offers a larger selection of models, from a bigger range of providers. However, those models might require you to host them on your infrastructure, including the creation of an AI hub and project. Azure AI model service provides a way to consume the models as APIs without hosting them on your infrastructure, with a pay-as-you-go billing. Learn more about the [Azure AI model catalog](../../../ai-studio/how-to/model-catalog-overview.md).
 
-You can see all the models available to you in the [model catalog for Azure AI Foundry](https://ai.azure.com/explore/models).
+You can see all the models available to you in the [model catalog for Azure AI Foundry portal](https://ai.azure.com/explore/models).
 
 ### AI21 Labs
 
 The Jamba family models are AI21's production-grade Mamba-based large language model (LLM) which uses AI21's hybrid Mamba-Transformer architecture. It's an instruction-tuned version of AI21's hybrid structured state space model (SSM) transformer Jamba model. The Jamba family models are built for reliable commercial use with respect to quality and performance.
 
-| Model | Type | SKU | Capabilities |
+| Model | Type | Tier | Capabilities |
 | ------ | ---- | --- | ------------ |
-| [AI21-Jamba-1.5-Mini](https://ai.azure.com/explore/models/AI21-Jamba-1.5-Mini/version/1/registry/azureml-ai21) | chat-completion | Global standard | - **Input:** text (262,144 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** en, fr, es, pt, de, ar, and he <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
-| [AI21-Jamba-1.5-Large](https://ai.azure.com/explore/models/AI21-Jamba-1.5-Large/version/1/registry/azureml-ai21) | chat-completion | Global standard | - **Input:** text (262,144 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** en, fr, es, pt, de, ar, and he <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
+| [AI21-Jamba-1.5-Mini](https://ai.azure.com/explore/models/AI21-Jamba-1.5-Mini/version/1/registry/azureml-ai21) | chat-completion | Global standard | - **Input:** text (262,144 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** `en`, `fr`, `es`, `pt`, `de`, `ar`, and `he` <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
+| [AI21-Jamba-1.5-Large](https://ai.azure.com/explore/models/AI21-Jamba-1.5-Large/version/1/registry/azureml-ai21) | chat-completion | Global standard | - **Input:** text (262,144 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** `en`, `fr`, `es`, `pt`, `de`, `ar`, and `he` <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
 
 
 See [this model collection in Azure AI Foundry portal](https://ai.azure.com/explore/models?&selectedCollection=ai21).
@@ -48,16 +48,16 @@ Azure OpenAI Service offers a diverse set of models with different capabilities
 - Models that can understand and generate natural language and code
 - Models that can transcribe and translate speech to text
 
-| Model | Type | SKU | Capabilities |
+| Model | Type | Tier | Capabilities |
 | ------ | ---- | --- | ------------ |
-| [o1](https://ai.azure.com/explore/models/o1/version/2024-12-17/registry/azure-openai) | chat-completion | Global standard | - **Input:** text and image (200,000 tokens) <br /> - **Output:** text (100,000 tokens) <br /> - **Languages:** en, it, af, es, de, fr, id, ru, pl, uk, el, lv, zh, ar, tr, ja, sw, cy, ko, is, bn, ur, ne, th, pa, mr, and te <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
-| [o1-preview](https://ai.azure.com/explore/models/o1-preview/version/1/registry/azure-openai) | chat-completion | Global standard <br />Standard<br /> | - **Input:** text (128,000 tokens) <br /> - **Output:** (32,768 tokens) <br /> - **Languages:** en, it, af, es, de, fr, id, ru, pl, uk, el, lv, zh, ar, tr, ja, sw, cy, ko, is, bn, ur, ne, th, pa, mr, and te <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
-| [o1-mini](https://ai.azure.com/explore/models/o1-mini/version/1/registry/azure-openai) | chat-completion | Global standard <br />Standard | - **Input:** text (128,000 tokens) <br /> - **Output:** (65,536 tokens) <br /> - **Languages:** en, it, af, es, de, fr, id, ru, pl, uk, el, lv, zh, ar, tr, ja, sw, cy, ko, is, bn, ur, ne, th, pa, mr, and te <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
+| [o1](https://ai.azure.com/explore/models/o1/version/2024-12-17/registry/azure-openai) | chat-completion | Global standard | - **Input:** text and image (200,000 tokens) <br /> - **Output:** text (100,000 tokens) <br /> - **Languages:** `en`, `it`, `af`, `es`, `de`, `fr`, `id`, `ru`, `pl`, `uk`, `el`, `lv`, `zh`, `ar`, `tr`, `ja`, `sw`, `cy`, `ko`, `is`, `bn`, `ur`, `ne`, `th`, `pa`, `mr`, and `te`. <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
+| [o1-preview](https://ai.azure.com/explore/models/o1-preview/version/1/registry/azure-openai) | chat-completion | Global standard <br />Standard<br /> | - **Input:** text (128,000 tokens) <br /> - **Output:** (32,768 tokens) <br /> - **Languages:** `en`, `it`, `af`, `es`, `de`, `fr`, `id`, `ru`, `pl`, `uk`, `el`, `lv`, `zh`, `ar`, `tr`, `ja`, `sw`, `cy`, `ko`, `is`, `bn`, `ur`, `ne`, `th`, `pa`, `mr`, and `te`. <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
+| [o1-mini](https://ai.azure.com/explore/models/o1-mini/version/1/registry/azure-openai) | chat-completion | Global standard <br />Standard | - **Input:** text (128,000 tokens) <br /> - **Output:** (65,536 tokens) <br /> - **Languages:** `en`, `it`, `af`, `es`, `de`, `fr`, `id`, `ru`, `pl`, `uk`, `el`, `lv`, `zh`, `ar`, `tr`, `ja`, `sw`, `cy`, `ko`, `is`, `bn`, `ur`, `ne`, `th`, `pa`, `mr`, and `te`. <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
 | [gpt-4o-realtime-preview](https://ai.azure.com/explore/models/gpt-4o-realtime-preview/version/2024-10-01/registry/azure-openai) | real-time | Global standard | - **Input:** control, text, and audio (131,072 tokens) <br /> - **Output:** text and audio (16,384 tokens) <br /> - **Languages:** en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
-| [gpt-4o](https://ai.azure.com/explore/models/gpt-4o/version/2024-11-20/registry/azure-openai) | chat-completion | Global standard <br />Standard<br />Batch<br />Provisioned<br />Global provisioned<br />Data Zone | - **Input:** text and image (131,072 tokens) <br /> - **Output:** text (16,384 tokens) <br /> - **Languages:** en, it, af, es, de, fr, id, ru, pl, uk, el, lv, zh, ar, tr, ja, sw, cy, ko, is, bn, ur, ne, th, pa, mr, and te <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
-| [gpt-4o-mini](https://ai.azure.com/explore/models/gpt-4o-mini/version/2024-07-18/registry/azure-openai) | chat-completion | Global standard <br />Standard<br />Batch<br />Provisioned<br />Global provisioned<br />Data Zone | - **Input:** text, image, and audio (131,072 tokens) <br /> - **Output:** (16,384 tokens) <br /> - **Languages:** en, it, af, es, de, fr, id, ru, pl, uk, el, lv, zh, ar, tr, ja, sw, cy, ko, is, bn, ur, ne, th, pa, mr, and te <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
-| [text-embedding-3-large](https://ai.azure.com/explore/models/text-embedding-3-large/version/1/registry/azure-openai) | embeddings | Global standard <br />Standard<br />Provisioned<br />Global provisioned | - **Input:** text (8,191 tokens) <br /> - **Output:** Vector (3,072 dim.) <br /> - **Languages:** en |
-| [text-embedding-3-small](https://ai.azure.com/explore/models/text-embedding-3-small/version/1/registry/azure-openai) | embeddings | Global standard <br />Standard<br />Provisioned<br />Global provisioned | - **Input:** text (8,191 tokens) <br /> - **Output:** Vector (1,536 dim.) <br /> - **Languages:** en |
+| [gpt-4o](https://ai.azure.com/explore/models/gpt-4o/version/2024-11-20/registry/azure-openai) | chat-completion | Global standard <br />Standard<br />Batch<br />Provisioned<br />Global provisioned<br />Data Zone | - **Input:** text and image (131,072 tokens) <br /> - **Output:** text (16,384 tokens) <br /> - **Languages:** `en`, `it`, `af`, `es`, `de`, `fr`, `id`, `ru`, `pl`, `uk`, `el`, `lv`, `zh`, `ar`, `tr`, `ja`, `sw`, `cy`, `ko`, `is`, `bn`, `ur`, `ne`, `th`, `pa`, `mr`, and `te`. <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
+| [gpt-4o-mini](https://ai.azure.com/explore/models/gpt-4o-mini/version/2024-07-18/registry/azure-openai) | chat-completion | Global standard <br />Standard<br />Batch<br />Provisioned<br />Global provisioned<br />Data Zone | - **Input:** text, image, and audio (131,072 tokens) <br /> - **Output:** (16,384 tokens) <br /> - **Languages:** `en`, `it`, `af`, `es`, `de`, `fr`, `id`, `ru`, `pl`, `uk`, `el`, `lv`, `zh`, `ar`, `tr`, `ja`, `sw`, `cy`, `ko`, `is`, `bn`, `ur`, `ne`, `th`, `pa`, `mr`, and `te`. <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON, structured outputs |
+| [text-embedding-3-large](https://ai.azure.com/explore/models/text-embedding-3-large/version/1/registry/azure-openai) | embeddings | Global standard <br />Standard<br />Provisioned<br />Global provisioned | - **Input:** text (8,191 tokens) <br /> - **Output:** Vector (3,072 dim.) <br /> - **Languages:** `en` |
+| [text-embedding-3-small](https://ai.azure.com/explore/models/text-embedding-3-small/version/1/registry/azure-openai) | embeddings | Global standard <br />Standard<br />Provisioned<br />Global provisioned | - **Input:** text (8,191 tokens) <br /> - **Output:** Vector (1,536 dim.) <br /> - **Languages:** `en` |
 
 
 See [this model collection in Azure AI Foundry portal](https://ai.azure.com/explore/models?&selectedCollection=aoai).
@@ -67,7 +67,7 @@ See [this model collection in Azure AI Foundry portal](https://ai.azure.com/expl
 
 The Cohere family of models includes various models optimized for different use cases, including chat completions and embeddings. Cohere models are optimized for various use cases that include reasoning, summarization, and question answering.
 
-| Model | Type | SKU | Capabilities |
+| Model | Type | Tier | Capabilities |
 | ------ | ---- | --- | ------------ |
 | [Cohere-embed-v3-english](https://ai.azure.com/explore/models/Cohere-embed-v3-english/version/1/registry/azureml-cohere) | embeddings | Global standard | - **Input:** text (512 tokens) <br /> - **Output:** Vector (1,024 dim.) <br /> - **Languages:** en |
 | [Cohere-embed-v3-multilingual](https://ai.azure.com/explore/models/Cohere-embed-v3-multilingual/version/1/registry/azureml-cohere) | embeddings | Global standard | - **Input:** text (512 tokens) <br /> - **Output:** Vector (1,024 dim.) <br /> - **Languages:** en, fr, es, it, de, pt-br, ja, ko, zh-cn, and ar |
@@ -83,7 +83,7 @@ See [this model collection in Azure AI Foundry portal](https://ai.azure.com/expl
 
 Core42 includes autoregressive bi-lingual LLMs for Arabic & English with state-of-the-art capabilities in Arabic.
 
-| Model | Type | SKU | Capabilities |
+| Model | Type | Tier | Capabilities |
 | ------ | ---- | --- | ------------ |
 | [jais-30b-chat](https://ai.azure.com/explore/models/jais-30b-chat/version/1/registry/azureml-core42) | chat-completion | Global standard | - **Input:** text (8,192 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** en and ar <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
 
@@ -98,7 +98,7 @@ Meta Llama models and tools are a collection of pretrained and fine-tuned genera
 - Mid-size large language models (LLMs) like 7B, 8B, and 70B Base and Instruct models
 - High-performant models like Meta Llama 3.1-405B Instruct for synthetic data generation and distillation use cases.
 
-| Model | Type | SKU | Capabilities |
+| Model | Type | Tier | Capabilities |
 | ------ | ---- | --- | ------------ |
 | [Llama-3.3-70B-Instruct](https://ai.azure.com/explore/models/Llama-3.3-70B-Instruct/version/4/registry/azureml-meta) | chat-completion | Global standard | - **Input:** text (128,000 tokens) <br /> - **Output:** text (8,192 tokens) <br /> - **Languages:** en, de, fr, it, pt, hi, es, and th <br /> - **Tool calling:** No* <br /> - **Response formats:** Text |
 | [Llama-3.2-11B-Vision-Instruct](https://ai.azure.com/explore/models/Llama-3.2-11B-Vision-Instruct/version/1/registry/azureml-meta) | chat-completion | Global standard | - **Input:** text and image (128,000 tokens) <br /> - **Output:** (8,192 tokens) <br /> - **Languages:** en <br /> - **Tool calling:** No* <br /> - **Response formats:** Text |
@@ -116,7 +116,7 @@ See [this model collection in Azure AI Foundry portal](https://ai.azure.com/expl
 
 Phi is a family of lightweight, state-of-the-art open models. These models were trained with Phi-3 datasets. The datasets include both synthetic data and the filtered, publicly available websites data, with a focus on high quality and reasoning-dense properties. The models underwent a rigorous enhancement process, incorporating both supervised fine-tuning, proximal policy optimization, and direct preference optimization to ensure precise instruction adherence and robust safety measures.
 
-| Model | Type | SKU | Capabilities |
+| Model | Type | Tier | Capabilities |
 | ------ | ---- | --- | ------------ |
 | [Phi-3-mini-128k-instruct](https://ai.azure.com/explore/models/Phi-3-mini-128k-instruct/version/12/registry/azureml) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** en <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
 | [Phi-3-mini-4k-instruct](https://ai.azure.com/explore/models/Phi-3-mini-4k-instruct/version/14/registry/azureml) | chat-completion | Global standard | - **Input:** text (4,096 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** en <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
@@ -137,7 +137,7 @@ See [this model collection in Azure AI Foundry portal](https://ai.azure.com/expl
 
 Mistral AI offers two categories of models: premium models including Mistral Large and Mistral Small and open models including Mistral Nemo.
 
-| Model | Type | SKU | Capabilities |
+| Model | Type | Tier | Capabilities |
 | ------ | ---- | --- | ------------ |
 | [Ministral-3B](https://ai.azure.com/explore/models/Ministral-3B/version/1/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** text (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
 | [Mistral-large](https://ai.azure.com/explore/models/Mistral-large/version/1/registry/azureml-mistral) | chat-completion | Global standard | - **Input:** text (32,768 tokens) <br /> - **Output:** (4,096 tokens) <br /> - **Languages:** fr, de, es, it, and en <br /> - **Tool calling:** Yes <br /> - **Response formats:** Text, JSON |
@@ -153,7 +153,7 @@ See [this model collection in Azure AI Foundry portal](https://ai.azure.com/expl
 
 **Tsuzumi** is an autoregressive language optimized transformer. The tuned versions use supervised fine-tuning (SFT). Tsuzumi is handles both Japanese and English language with high efficiency.
 
-| Model | Type | SKU | Capabilities |
+| Model | Type | Tier | Capabilities |
 | ------ | ---- | --- | ------------ |
 | [Tsuzumi-7b](https://ai.azure.com/explore/models/Tsuzumi-7b/version/1/registry/azureml-nttdata) | chat-completion | Global standard | - **Input:** text (8,192 tokens) <br /> - **Output:** text (8,192 tokens) <br /> - **Languages:** en and jp <br /> - **Tool calling:** No <br /> - **Response formats:** Text |
 
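Reviewer note: this commit wraps the language codes in backticks for some rows (AI21, Azure OpenAI) and leaves others bare (Cohere, Core42, Meta, Phi, Mistral, Tsuzumi), and a few of the new cells end the list with a period. The sketch below is one possible way a downstream script could read a Capabilities cell under either formatting. It is an illustration only: `parse_capabilities` and `language_codes` are hypothetical helper names, not part of any Azure tooling, and the cell layout (`- **Label:** value` items separated by `<br />`) is assumed from the rows in this diff.

```python
import re


def parse_capabilities(cell: str) -> dict[str, str]:
    """Split one Capabilities cell into {label: value} pairs.

    Assumes items look like '- **Label:** value' and are separated
    by '<br />' tags, as in the table rows of this diff.
    """
    fields: dict[str, str] = {}
    for part in re.split(r"<br\s*/?>", cell):
        match = re.match(r"\s*-\s*\*\*(.+?):\*\*\s*(.*)", part)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def language_codes(value: str) -> list[str]:
    """Return the language codes from a Languages value.

    Accepts both the old style ('en, fr, and he') and the new
    backticked style ('`en`, `fr`, and `he`.').
    """
    cleaned = value.replace("`", "").rstrip(". ")
    # Separators are a comma (optionally followed by 'and') or a bare 'and'.
    parts = re.split(r",\s*(?:and\s+)?|\s+and\s+", cleaned)
    return [code for code in parts if code]


if __name__ == "__main__":
    cell = (
        "- **Input:** text (262,144 tokens) <br /> "
        "- **Output:** (4,096 tokens) <br /> "
        "- **Languages:** `en`, `fr`, `es`, `pt`, `de`, `ar`, and `he` <br /> "
        "- **Tool calling:** Yes"
    )
    fields = parse_capabilities(cell)
    print(fields["Input"])
    print(language_codes(fields["Languages"]))
```

Stripping the backticks and the trailing period before splitting keeps the helper indifferent to which rows a given revision has already reformatted.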