
Commit 7d684aa

Merge pull request #281097 from santiagxf/santiagxf-patch-1
Update model-catalog-overview.md
2 parents d4174d3 + b8ca02b

File tree

1 file changed (+9 −15 lines)


articles/ai-studio/how-to/model-catalog-overview.md

Lines changed: 9 additions & 15 deletions
```diff
@@ -47,12 +47,12 @@ Some models in the **Curated by Azure AI** and **Open models from the Hugging Fa
 Model Catalog offers two distinct ways to deploy models from the catalog for your use: managed compute and serverless APIs. The deployment options available for each model vary; learn more about the features of the deployment options, and the options available for specific models, in the following tables. Learn more about [data processing]( concept-data-privacy.md) with the deployment options.
 <!-- docutune:disable -->
 
-Features | Managed compute | serverless API (pay-as-you-go)
+Features | Managed compute | Serverless API (pay-as-you-go)
 --|--|--
-Deployment experience and billing | Model weights are deployed to dedicated Virtual Machines with Managed Online Endpoints. The managed online endpoint, which can have one or more deployments, makes available a REST API for inference. You're billed for the Virtual Machine core hours used by the deployments. | Access to models is through a deployment that provisions an API to access the model. The API provides access to the model hosted and managed by Microsoft, for inference. This mode of access is referred to as "Models as a Service". You're billed for inputs and outputs to the APIs, typically in tokens; pricing information is provided before you deploy.
+Deployment experience and billing | Model weights are deployed to dedicated Virtual Machines with Managed Online Endpoints. The managed online endpoint, which can have one or more deployments, makes available a REST API for inference. You're billed for the Virtual Machine core hours used by the deployments. | Access to models is through a deployment that provisions an API to access the model. The API provides access to the model hosted and managed by Microsoft, for inference. You're billed for inputs and outputs to the APIs, typically in tokens; pricing information is provided before you deploy.
 | API authentication | Keys and Microsoft Entra ID authentication.| Keys only.
-Content safety | Use Azure Content Safety service APIs. | Azure AI Content Safety filters are available integrated with inference APIs. Azure AI Content Safety filters may be billed separately.
-Network isolation | [Configure managed networks for Azure AI Studio hubs.](configure-managed-network.md) | MaaS endpoint will follow your hub's public network access (PNA) flag setting. For more information, see the [Network isolation for models deployed via Serverless APIs](#network-isolation-for-models-deployed-via-serverless-apis) section.
+Content safety | Use Azure Content Safety service APIs. | Azure AI Content Safety filters are available integrated with inference APIs. Azure AI Content Safety filters is billed separately.
+Network isolation | [Configure managed networks for Azure AI Studio hubs.](configure-managed-network.md) | Endpoints will follow your hub's public network access (PNA) flag setting. For more information, see the [Network isolation for models deployed via Serverless APIs](#network-isolation-for-models-deployed-via-serverless-apis) section.
 
 Model | Managed compute | Serverless API (pay-as-you-go)
 --|--|--
```
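Per the table above, serverless API deployments authenticate with keys only and bill per input/output token. A minimal sketch of what a client call could look like, assuming a chat-completions-style REST endpoint with bearer-key authentication; the endpoint URL, key placeholder, and payload shape are illustrative assumptions, not values from this article — copy the real endpoint and key from your deployment's details page:

```python
# Sketch: calling a serverless API (pay-as-you-go) deployment.
# ENDPOINT and API_KEY are hypothetical placeholders.
import json
import urllib.request

ENDPOINT = "https://example-deployment.example-region.inference.example/v1/chat/completions"  # hypothetical
API_KEY = "<your-endpoint-key>"  # serverless APIs authenticate with keys only

def build_request(prompt: str) -> urllib.request.Request:
    """Build the HTTP request; usage is billed per input/output token."""
    body = json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )

req = build_request("Summarize the model catalog deployment options.")
print(req.get_method(), req.full_url)
```

Sending the request (`urllib.request.urlopen(req)`) only works against a real deployment; the sketch stops at request construction.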
```diff
@@ -100,25 +100,19 @@ Prompt flow offers a great experience for prototyping. You can use models deploy
 
 ## Serverless APIs with Pay-as-you-go billing
 
-Certain models in the Model Catalog can be deployed as serverless APIs with pay-as-you-go billing; this method of deployment is called Models-as-a Service (MaaS), providing a way to consume them as an API without hosting them on your subscription. Models available through MaaS are hosted in infrastructure managed by Microsoft, which enables API-based access to the model provider's model. API based access can dramatically reduce the cost of accessing a model and significantly simplify the provisioning experience. Most MaaS models come with token-based pricing.
+Certain models in the Model Catalog can be deployed as serverless APIs with pay-as-you-go billing, providing a way to consume them as an API without hosting them on your subscription. Models are hosted in infrastructure managed by Microsoft, which enables API-based access to the model provider's model. API based access can dramatically reduce the cost of accessing a model and significantly simplify the provisioning experience.
 
-### How are third-party models made available in MaaS?
+Models that are available for deployment as serverless APIs with pay-as-you-go billing are offered by the model provider but hosted in Microsoft-managed Azure infrastructure and accessed via API. Model providers define the license terms and set the price for use of their models, while Azure Machine Learning service manages the hosting infrastructure, makes the inference APIs available, and acts as the data processor for prompts submitted and content output by models deployed via MaaS. Learn more about data processing for MaaS at the [data privacy](concept-data-privacy.md) article.
 
 :::image type="content" source="../media/explore/model-publisher-cycle.png" alt-text="A diagram showing model publisher service cycle." lightbox="../media/explore/model-publisher-cycle.png":::
 
-Models that are available for deployment as serverless APIs with pay-as-you-go billing are offered by the model provider but hosted in Microsoft-managed Azure infrastructure and accessed via API. Model providers define the license terms and set the price for use of their models, while Azure Machine Learning service manages the hosting infrastructure, makes the inference APIs available, and acts as the data processor for prompts submitted and content output by models deployed via MaaS. Learn more about data processing for MaaS at the [data privacy](concept-data-privacy.md) article.
-
-### Pay for model usage in MaaS
+### Billing
 
 The discovery, subscription, and consumption experience for models deployed via MaaS is in the Azure AI Studio and Azure Machine Learning studio. Users accept license terms for use of the models, and pricing information for consumption is provided during deployment. Models from third party providers are billed through Azure Marketplace, in accordance with the [Commercial Marketplace Terms of Use](/legal/marketplace/marketplace-terms); models from Microsoft are billed using Azure meters as First Party Consumption Services. As described in the [Product Terms](https://www.microsoft.com/licensing/terms/welcome/welcomepage), First Party Consumption Services are purchased using Azure meters but aren't subject to Azure service terms; use of these models is subject to the license terms provided.
 
-### Deploy models for inference through MaaS
-
-Deploying a model through MaaS allows users to get access to ready to use inference APIs without the need to configure infrastructure or provision GPUs, saving engineering time and resources. These APIs can be integrated with several LLM tools and usage is billed as described in the previous section.
-
-### Fine-tune models through MaaS with Pay-as-you-go
+### Fine-tune models
 
-For models that are available through MaaS and support fine-tuning, users can take advantage of hosted fine-tuning with pay-as-you-go billing to tailor the models using data they provide. For more information, see the [fine-tuning overview](../concepts/fine-tuning-overview.md).
+Certain models support also serverless fine-tuning where users can take advantage of hosted fine-tuning with pay-as-you-go billing to tailor the models using data they provide. For more information, see the [fine-tuning overview](../concepts/fine-tuning-overview.md).
 
 ### RAG with models deployed as serverless APIs
 
```
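The billing section above notes that serverless deployments are metered on inputs and outputs, typically in tokens. A back-of-the-envelope estimator, assuming separate per-1K-token rates for input and output; the rates in the example are made-up placeholders, since real pricing is shown in the studio before you deploy:

```python
# Sketch: estimating token-based pay-as-you-go cost.
# The per-1K-token prices used below are hypothetical placeholders.

def estimate_cost(input_tokens: int, output_tokens: int,
                  price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Input and output tokens are typically metered at different rates."""
    return (input_tokens / 1000) * price_in_per_1k \
         + (output_tokens / 1000) * price_out_per_1k

# e.g. 120K input tokens and 40K output tokens at hypothetical rates
print(round(estimate_cost(120_000, 40_000, 0.001, 0.002), 4))
```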
0 commit comments
