articles/machine-learning/concept-model-catalog.md
8 additions & 8 deletions
@@ -33,19 +33,19 @@ Models are organized by Collections in the model catalog. There are three types
For information on Azure OpenAI models, refer to [Azure OpenAI Service](/azure/ai-services/openai/overview).
-
For models **Curated by Azure AI** and **Open models from the Hugging Face hub**, some of these can be deployed with a managed compute option, and some of these are available to be deployed using standard deployments. These models can be discovered, compared, evaluated, fine-tuned (when supported) and deployed at scale and integrated into your Generative AI applications with enterprise-grade security and data governance.
+
For models **Curated by Azure AI** and **Open models from the Hugging Face hub**, some of these can be deployed with a managed compute option, and some of these are available to be deployed using standard deployments with pay-as-you-go billing. These models can be discovered, compared, evaluated, fine-tuned (when supported) and deployed at scale and integrated into your Generative AI applications with enterprise-grade security and data governance.
* **Discover:** Review model cards, try sample inference and browse code samples to evaluate, fine-tune, or deploy the model.
* **Compare:** Compare benchmarks across models and datasets available in the industry to assess which one meets your business scenario.
* **Evaluate:** Evaluate if the model is suited for your specific workload by providing your own test data. Evaluation metrics make it easy to visualize how well the selected model performed in your scenario.
* **Fine-tune:** Customize fine-tunable models using your own training data and pick the best model by comparing metrics across all your fine-tuning jobs. Built-in optimizations speed up fine-tuning and reduce the memory and compute needed for fine-tuning.
* **Deploy:** Deploy pretrained models or fine-tuned models seamlessly for inference. Models that can be deployed to managed compute can also be downloaded.
-
## Model deployment: Managed compute and standard deployment
+
## Model deployment: Managed compute and standard deployment with pay-as-you-go billing
Model Catalog offers two distinct ways to deploy models from the catalog for your use: managed compute and standard deployments. The deployment options available for each model vary; learn more about the features of the deployment options, and the options available for specific models, in the tables below. Learn more about [data processing](concept-data-privacy.md) with the deployment options.
-
Features | Managed compute | Standard deployment
+
Features | Managed compute | Standard deployment (pay-as-you-go)
--|--|--
Deployment experience and billing | Model weights are deployed to dedicated Virtual Machines with managed online endpoints. The managed online endpoint, which can have one or more deployments, makes available a REST API for inference. You're billed for the Virtual Machine core hours used by the deployments. | Access to models is through a deployment that provisions an API to access the model. The API provides access to the model hosted in a central GPU pool, managed by Microsoft, for inference. This mode of access is referred to as "standard deployment". You're billed for inputs and outputs to the APIs, typically in tokens; pricing information is provided before you deploy.
| API authentication | Keys and Microsoft Entra ID authentication. [Learn more.](concept-endpoints-online-auth.md) | Keys only.
@@ -106,15 +106,15 @@ Prompt flow offers capabilities for prototyping, experimenting, iterating, and d
For models not available in the model catalog, Azure Machine Learning provides an open and extensible platform for working with models of your choice. You can bring a model with any framework or runtime using Azure Machine Learning's open and extensible platform capabilities such as [Azure Machine Learning environments](concept-environments.md) for containers that can package frameworks and runtimes and [Azure Machine Learning pipelines](concept-ml-pipelines.md) for code to evaluate or fine-tune the models. Refer to this notebook for sample reference to import models and work with the [built-in runtimes and pipelines](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/system/import/import_model_into_registry.ipynb).
-
## Standard deployments
+
## Standard deployments with pay-as-you-go billing
-
Certain models in the model catalog can be deployed as standard deployments; this method of deployment is called standard deployment. Models available through standard deployment are hosted in infrastructure managed by Microsoft, which enables API-based access to the model provider's model. API-based access can dramatically reduce the cost of accessing a model and significantly simplify the provisioning experience. Most standard deployment models come with token-based pricing.
+
Certain models in the model catalog can be deployed as standard deployments with pay-as-you-go billing; this method of deployment is called standard deployment. Models available through standard deployment are hosted in infrastructure managed by Microsoft, which enables API-based access to the model provider's model. API-based access can dramatically reduce the cost of accessing a model and significantly simplify the provisioning experience. Most standard deployment models come with token-based pricing.
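
To make the billing difference concrete, the following is a minimal sketch of how the two charges could be estimated. It is not an Azure API: the class `TokenPricing`, both functions, and every price used with them are hypothetical names and values chosen for illustration. Actual prices are set by the model provider and shown before you deploy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Hypothetical per-1,000-token prices (illustrative, not real rates)."""
    input_per_1k: float
    output_per_1k: float


def standard_deployment_cost(pricing: TokenPricing,
                             input_tokens: int,
                             output_tokens: int) -> float:
    """Estimate a token-based charge: inputs and outputs are metered separately."""
    input_charge = (input_tokens / 1000) * pricing.input_per_1k
    output_charge = (output_tokens / 1000) * pricing.output_per_1k
    return input_charge + output_charge


def managed_compute_cost(cores: int, hours: float,
                         price_per_core_hour: float) -> float:
    """Estimate a core-hour charge: billed for the VM cores used by the deployment."""
    return cores * hours * price_per_core_hour


if __name__ == "__main__":
    # All numbers below are made up for the example.
    pricing = TokenPricing(input_per_1k=0.5, output_per_1k=1.5)
    print(standard_deployment_cost(pricing, input_tokens=2000, output_tokens=1000))
    print(managed_compute_cost(cores=4, hours=10, price_per_core_hour=0.25))
```

Under these assumptions, the token-based estimate scales with traffic, while the core-hour estimate scales with how long the virtual machines stay deployed.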
### How are third-party models made available in standard deployment?
:::image type="content" source="media/concept-model-catalog/model-publisher-cycle.png" alt-text="A diagram showing model publisher service cycle." lightbox="media/concept-model-catalog/model-publisher-cycle.png":::
-
Models that are available for deployment as standard deployments are offered by the model provider but hosted in Microsoft-managed Azure infrastructure and accessed via API. Model providers define the license terms and set the price for use of their models, while Azure Machine Learning service manages the hosting infrastructure, makes the inference APIs available, and acts as the data processor for prompts submitted and content output by models deployed via standard deployment. Learn more about data processing for standard deployment at the [data privacy](concept-data-privacy.md) article.
+
Models that are available for deployment as standard deployments with pay-as-you-go billing are offered by the model provider but hosted in Microsoft-managed Azure infrastructure and accessed via API. Model providers define the license terms and set the price for use of their models, while Azure Machine Learning service manages the hosting infrastructure, makes the inference APIs available, and acts as the data processor for prompts submitted and content output by models deployed via standard deployment. Learn more about data processing for standard deployment at the [data privacy](concept-data-privacy.md) article.
### Pay for model usage in standard deployment
@@ -134,7 +134,7 @@ Azure AI Foundry enables users to make use of Vector Indexes and Retrieval Augme
### Regional availability of offers and models
-
Standard deployment is available only to users whose Azure subscription belongs to a billing account in a country/region where the model provider has made the offer available. If the offer is available in the relevant region, the user then must have a Hub/Project in the Azure region where the model is available for deployment or fine-tuning, as applicable. See [Region availability for models in standard deployment](concept-endpoint-serverless-availability.md) for detailed information.
+
Pay-as-you-go billing is available only to users whose Azure subscription belongs to a billing account in a country/region where the model provider has made the offer available. If the offer is available in the relevant region, the user then must have a Hub/Project in the Azure region where the model is available for deployment or fine-tuning, as applicable. See [Region availability for models in standard deployment](concept-endpoint-serverless-availability.md) for detailed information.
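
The two conditions above can be read as a sequential check: first the billing account's country/region against the provider's offer, then the Hub/Project region against the model's availability. The sketch below illustrates that order only. The function name `check_availability` and the sample country and region values are hypothetical; the linked region availability article remains the source for real data.

```python
def check_availability(billing_country: str,
                       project_region: str,
                       offer_countries: set,
                       model_regions: set) -> tuple:
    """Return (eligible, reason) for a hypothetical offer and model."""
    # Condition 1: the provider must offer the model in the billing country/region.
    if billing_country not in offer_countries:
        return False, "offer not available for the billing account's country/region"
    # Condition 2: the Hub/Project must be in an Azure region that hosts the model.
    if project_region not in model_regions:
        return False, "model not available in the Hub/Project region"
    return True, "eligible"


if __name__ == "__main__":
    # Illustrative values only; not actual availability data.
    offer_countries = {"US", "DE"}
    model_regions = {"eastus2", "swedencentral"}
    print(check_availability("US", "eastus2", offer_countries, model_regions))
    print(check_availability("US", "westus", offer_countries, model_regions))
    print(check_availability("BR", "eastus2", offer_countries, model_regions))
```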