Skip to content

Commit 67e1ce3

Browse files
authored
Update model-catalog-overview.md
1 parent 66d1766 commit 67e1ce3

File tree

1 file changed

+26
-19
lines changed

1 file changed

+26
-19
lines changed

articles/ai-studio/how-to/model-catalog-overview.md

Lines changed: 26 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -23,7 +23,8 @@ The model catalog in Azure AI Studio is the hub to discover and use a wide range
2323

2424
## Model collections
2525

26-
The model catalog organizes models into three collections:
26+
The model catalog organizes models into different collections:
27+
2728

2829
* **Curated by Azure AI**: The most popular non-Microsoft open-weight and proprietary models packaged and optimized to work seamlessly on the Azure AI platform. Use of these models is subject to the model providers' license terms. When you deploy these models in Azure AI Studio, their availability is subject to the applicable [Azure service-level agreement (SLA)](https://www.microsoft.com/licensing/docs/view/Service-Level-Agreements-SLA-for-Online-Services), and Microsoft provides support for deployment problems.
2930

@@ -37,46 +38,52 @@ You can submit a request to add a model to the model catalog by using [this form
3738

3839
## Overview of model catalog capabilities
3940

40-
You can deploy some models in the **Curated by Azure AI** and **Open models from the Hugging Face hub** collections with a managed compute option. Some models are available to be deployed through serverless APIs with pay-as-you-go billing.
41+
You can search and discover models that meet your need through `keyword search` and `filters`. Model catalog also offers the model performance benchmark metrics for select models. You can access the benchmark by clicking `Compare Models` or from the model card Benchmark tab.
42+
43+
On the model card, you'll find:
4144

42-
You can discover, compare, evaluate, fine-tune (when supported), and deploy these models at scale. You can then integrate the models into your generative AI applications with enterprise-grade security and data governance. The following list describes the activities in detail:
45+
* **Quick facts**: you will see key information about the model at a quick glance.
46+
* **Details**: this page contains the detailed information about the model, including description, version info, supported data type, etc.
47+
* **Benchmarks**: you will find performance benchmark metrics for select models.
48+
* **Existing deployments**: if you have already deployed the model, you can find it under Existing deployments tab.
49+
* **Code samples**: you will find the basic code samples to get started with AI application development.
50+
* **License**: you will find legal information related to model licensing.
51+
* **Artifacts**: this tab will be displayed for open models only. You can see the model assets and download them via user interface.
4352

44-
* **Discover**: Review model cards, try sample inference, and browse code samples to evaluate, fine-tune, or deploy the model.
45-
* **Compare**: Compare benchmarks across models and datasets available in the industry to assess which one meets your business scenario.
46-
* **Evaluate**: Evaluate if the model is suited for your specific workload by providing your own test data. Use evaluation metrics to visualize how well the selected model performs in your scenario.
47-
* **Fine-tune**: Customize fine-tunable models by using your own training data, and choose the best model by comparing metrics across all your fine-tuning jobs. Built-in optimizations speed up fine-tuning and reduce the required memory and compute.
48-
* **Deploy**: Deploy pretrained models or fine-tuned models seamlessly for inference. You can also download models that can be deployed to managed compute.
53+
## Model deployment: Azure OpenAI
4954

5055
For more information on Azure OpenAI models, see [What is Azure OpenAI Service?](../../ai-services/openai/overview.md).
5156

52-
## Model deployment: Managed compute and serverless API (pay-as-you-go)
57+
## Model deployment: Managed compute and serverless APIs
5358

54-
The model catalog offers two distinct ways to deploy models for your use: managed compute and serverless APIs.
59+
In addition to Azure OpenAI Service models, the model catalog offers two distinct ways to deploy models for your use: managed compute and serverless APIs.
5560

5661
The deployment options and features available for each model vary, as described in the following tables. [Learn more about data processing with the deployment options]( concept-data-privacy.md).
5762

5863
### Capabilities of model deployment options
5964
<!-- docutune:disable -->
6065

61-
Features | Managed compute | Serverless API (pay-as-you-go)
66+
Features | Managed compute | Serverless API (pay-per-token)
6267
--|--|--
6368
Deployment experience and billing | Model weights are deployed to dedicated virtual machines with managed compute. A managed compute, which can have one or more deployments, makes available a REST API for inference. You're billed for the virtual machine core hours that the deployments use. | Access to models is through a deployment that provisions an API to access the model. The API provides access to the model that Microsoft hosts and manages, for inference. You're billed for inputs and outputs to the APIs, typically in tokens. Pricing information is provided before you deploy.
64-
API authentication | Keys and Microsoft Entra authentication. | Keys only.
69+
API authentication | Keys and Microsoft Entra authentication. | Keys only.
6570
Content safety | Use Azure AI Content Safety service APIs. | Azure AI Content Safety filters are available integrated with inference APIs. Azure AI Content Safety filters are billed separately.
6671
Network isolation | [Configure managed networks for Azure AI Studio hubs](configure-managed-network.md). | Managed compute follow your hub's public network access (PNA) flag setting. For more information, see the [Network isolation for models deployed via Serverless APIs](#network-isolation-for-models-deployed-via-serverless-apis) section later in this article.
6772

6873
### Available models for supported deployment options
6974

70-
Model | Managed compute | Serverless API (pay-as-you-go)
75+
The following list contains Serverless API models. For Azure OpenAI models, see [Azure OpenAI Service Models](../../ai-services/openai/concepts/models.md).
76+
77+
Model | Managed compute | Serverless API (pay-per-token)
7178
--|--|--
72-
Llama family models | Llama-2-7b <br> Llama-2-7b-chat <br> Llama-2-13b <br> Llama-2-13b-chat <br> Llama-2-70b <br> Llama-2-70b-chat <br> Llama-3-8B-Instruct <br> Llama-3-70B-Instruct <br> Llama-3-8B <br> Llama-3-70B | Llama-3-70B-Instruct <br> Llama-3-8B-Instruct <br> Llama-2-7b <br> Llama-2-7b-chat <br> Llama-2-13b <br> Llama-2-13b-chat <br> Llama-2-70b <br> Llama-2-70b-chat
79+
Llama family models | Llama-3.2-3B-Instruct<BR> Llama-3.2-1B-Instruct<BR> Llama-3.2-1B<BR> Llama-3.2-90B-Vision-Instruct<BR> Llama-3.2-11B-Vision-Instruct<BR> Llama-3.1-8B-Instruct<BR> Llama-3.1-8B<BR> Llama-3.1-70B-Instruct<BR> Llama-3.1-70B<BR> Llama-3-8B-Instruct<BR> Llama-3-70B<BR> Llama-3-8B<BR> Llama-Guard-3-1B<BR> Llama-Guard-3-8B<BR> Llama-Guard-3-11B-Vision<BR> Llama-2-7b<BR> Llama-2-70b<BR> Llama-2-7b-chat<BR> Llama-2-13b-chat<BR> CodeLlama-7b-hf<BR> CodeLlama-7b-Instruct-hf<BR> CodeLlama-34b-hf<BR> CodeLlama-34b-Python-hf<BR> CodeLlama-34b-Instruct-hf<BR> CodeLlama-13b-Instruct-hf<BR> CodeLlama-13b-Python-hf<BR> Prompt-Guard-86M<BR> CodeLlama-70b-hf<BR> | Llama-3.2-90B-Vision-Instruct<br> Llama-3.2-11B-Vision-Instruct<br> Llama-3.1-8B-Instruct<br> Llama-3.1-70B-Instruct<br> Llama-3.1-405B-Instruct<br> Llama-3-8B-Instruct<br> Llama-3-70B-Instruct<br> Llama-2-7b<br> Llama-2-7b-chat<br> Llama-2-70b<br> Llama-2-70b-chat<br> Llama-2-13b<br> Llama-2-13b-chat<br>
7380
Mistral family models | mistralai-Mixtral-8x22B-v0-1 <br> mistralai-Mixtral-8x22B-Instruct-v0-1 <br> mistral-community-Mixtral-8x22B-v0-1 <br> mistralai-Mixtral-8x7B-v01 <br> mistralai-Mistral-7B-Instruct-v0-2 <br> mistralai-Mistral-7B-v01 <br> mistralai-Mixtral-8x7B-Instruct-v01 <br> mistralai-Mistral-7B-Instruct-v01 | Mistral-large (2402) <br> Mistral-large (2407) <br> Mistral-small <br> Ministral-3B <br> Mistral-NeMo
7481
Cohere family models | Not available | Cohere-command-r-plus-08-2024 <br> Cohere-command-r-08-2024 <br> Cohere-command-r-plus <br> Cohere-command-r <br> Cohere-embed-v3-english <br> Cohere-embed-v3-multilingual <br> Cohere-rerank-v3-english <br> Cohere-rerank-v3-multilingual
7582
JAIS | Not available | jais-30b-chat
76-
Healthcare AI Family Models | MedImageInsight <br> CxrReportGen <br> MedImageParse | Not Available
83+
AI21 family models | Not available | Jamba-1.5-Mini <br> Jamba-1.5-Large
84+
Healthcare AI family Models | MedImageParse<BR> MedImageInsight<BR> CxrReportGen<BR> Virchow<BR> Virchow2<BR> Prism<BR> BiomedCLIP-PubMedBERT<BR> microsoft-llava-med-v1.5<BR> m42-health-llama3-med4<BR> biomistral-biomistral-7b<BR> microsoft-biogpt-large-pub<BR> microsoft-biomednlp-pub<BR> stanford-crfm-biomedlm<BR> medicalai-clinicalbert<BR> microsoft-biogpt<BR> microsoft-biogpt-large<BR> microsoft-biomednlp-pub<BR> | Not Available
7785
Phi-3 family models | Phi-3-mini-4k-Instruct <br> Phi-3-mini-128k-Instruct <br> Phi-3-small-8k-Instruct <br> Phi-3-small-128k-Instruct <br> Phi-3-medium-4k-instruct <br> Phi-3-medium-128k-instruct <br> Phi-3-vision-128k-Instruct <br> Phi-3.5-mini-Instruct <br> Phi-3.5-vision-Instruct <br> Phi-3.5-MoE-Instruct | Phi-3-mini-4k-Instruct <br> Phi-3-mini-128k-Instruct <br> Phi-3-small-8k-Instruct <br> Phi-3-small-128k-Instruct <br> Phi-3-medium-4k-instruct <br> Phi-3-medium-128k-instruct <br> <br> Phi-3.5-mini-Instruct <br> Phi-3.5-vision-Instruct <br> Phi-3.5-MoE-Instruct
7886
Nixtla | Not available | TimeGEN-1
79-
Other models | Available | Not available
8087

8188
<!-- docutune:enable -->
8289

@@ -117,9 +124,9 @@ The [Azure AI Content Safety](../../ai-services/content-safety/overview.md) serv
117124

118125
You can refer to [this notebook](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/system/inference/text-generation/llama-safe-online-deployment.ipynb) for reference integration with Azure AI Content Safety for Llama 2. Or you can use the Content Safety (Text) tool in prompt flow to pass responses from the model to Azure AI Content Safety for screening. You're billed separately for such use, as described in [Azure AI Content Safety pricing](https://azure.microsoft.com/pricing/details/cognitive-services/content-safety/).
119126

120-
## Serverless APIs with pay-as-you-go billing
127+
## Serverless API (pay-per-token) billing
121128

122-
You can deploy certain models in the model catalog as serverless APIs with pay-as-you-go billing. This deployment method, sometimes called *model as a service* (MaaS), provides a way to consume the models as APIs without hosting them on your subscription. Models are hosted in a Microsoft-managed infrastructure, which enables API-based access to the model provider's model. API-based access can dramatically reduce the cost of accessing a model and simplify the provisioning experience.
129+
You can deploy certain models in the model catalog with pay-per-token billing. This deployment method, also called *Serverless API*, provides a way to consume the models as APIs without hosting them on your subscription. Models are hosted in a Microsoft-managed infrastructure, which enables API-based access to the model provider's model. API-based access can dramatically reduce the cost of accessing a model and simplify the provisioning experience.
123130

124131
Models that are available for deployment as serverless APIs with pay-as-you-go billing are offered by the model provider, but they're hosted in a Microsoft-managed Azure infrastructure and accessed via API. Model providers define the license terms and set the price for use of their models. The Azure Machine Learning service:
125132

@@ -149,7 +156,7 @@ In Azure AI Studio, you can use vector indexes and retrieval-augmented generatio
149156

150157
### Regional availability of offers and models
151158

152-
Pay-as-you-go billing is available only to users whose Azure subscription belongs to a billing account in a country where the model provider has made the offer available. If the offer is available in the relevant region, the user then must have a Hub/Project in the Azure region where the model is available for deployment or fine-tuning, as applicable. See [Region availability for models in serverless API endpoints | Azure AI Studio](deploy-models-serverless-availability.md) for detailed information.
159+
Pay-per-token billing is available only to users whose Azure subscription belongs to a billing account in a country where the model provider has made the offer available. If the offer is available in the relevant region, the user then must have a project resource in the Azure region where the model is available for deployment or fine-tuning, as applicable. See [Region availability for models in serverless API endpoints | Azure AI Studio](deploy-models-serverless-availability.md) for detailed information.
153160

154161
### Content safety for models deployed via serverless APIs
155162

0 commit comments

Comments
 (0)