Commit 5b9caaf

name changes models as a service
1 parent 3b6964b

9 files changed: +13 -13 lines changed

articles/ai-foundry/concepts/deployments-overview.md
Lines changed: 2 additions & 2 deletions

@@ -20,7 +20,7 @@ The model catalog in Azure AI Foundry portal is the hub to discover and use a wi
 Deployment options vary depending on the model offering:
 
 * **Azure OpenAI in Azure AI Foundry Models:** The latest OpenAI models that have enterprise features from Azure with flexible billing options.
-* **Standard deployment:** These models don't require compute quota from your subscription and are billed per token in a pay-as-you-go fashion.
+* **Standard deployment:** These models don't require compute quota from your subscription and are billed per token in a serverless pay-per-token offer.
 * **Open and custom models:** The model catalog offers access to a large variety of models across modalities, including models of open access. You can host open models in your own subscription with a managed infrastructure, virtual machines, and the number of instances for capacity management.
 
 Azure AI Foundry offers four different deployment options:

@@ -39,7 +39,7 @@ Azure AI Foundry offers four different deployment options:
 | Billing bases | Token usage & [provisioned throughput units](../../ai-services/openai/concepts/provisioned-throughput.md) | Token usage | Token usage<sup>1</sup> | Compute core hours<sup>2</sup> |
 | Deployment instructions | [Deploy to Azure OpenAI](../how-to/deploy-models-openai.md) | [Deploy to Azure AI model inference](../model-inference/how-to/create-model-deployments.md) | [Deploy to Standard deployment](../how-to/deploy-models-serverless.md) | [Deploy to Managed compute](../how-to/deploy-models-managed.md) |
 
-<sup>1</sup> A minimal endpoint infrastructure is billed per minute. You aren't billed for the infrastructure that hosts the model in pay-as-you-go. After you delete the endpoint, no further charges accrue.
+<sup>1</sup> A minimal endpoint infrastructure is billed per minute. You aren't billed for the infrastructure that hosts the model in standard deployment. After you delete the endpoint, no further charges accrue.
 
 <sup>2</sup> Billing is on a per-minute basis, depending on the product tier and the number of instances used in the deployment since the moment of creation. After you delete the endpoint, no further charges accrue.

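The per-token billing this file describes is straightforward to estimate. A minimal sketch, assuming hypothetical per-1,000-token prices; actual rates are shown in the Azure AI Foundry portal before you deploy:

```python
# Back-of-the-envelope cost estimate for a serverless pay-per-token offer.
# Prices below are hypothetical placeholders, not real Azure rates.
INPUT_PRICE_PER_1K = 0.003   # USD per 1,000 input tokens (assumed)
OUTPUT_PRICE_PER_1K = 0.009  # USD per 1,000 output tokens (assumed)

def estimate_request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD charge for a single inference request."""
    return (input_tokens / 1000) * INPUT_PRICE_PER_1K \
         + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K

# Example: a 1,200-token prompt with a 350-token completion.
print(f"${estimate_request_cost(1200, 350):.4f}")  # ~ $0.0068
```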
articles/ai-foundry/concepts/models-featured.md
Lines changed: 2 additions & 2 deletions

@@ -250,8 +250,8 @@ See [the Microsoft model collection in Azure AI Foundry portal](https://ai.azure
 
 Mistral AI offers two categories of models, namely:
 
-- _Premium models_: These include Mistral Large, Mistral Small, Mistral-OCR-2503, and Ministral 3B models, and are available as standard deployments with pay-as-you-go token-based billing.
-- _Open models_: These include Mistral-small-2503, Codestral, and Mistral Nemo (that are available as standard deployments with pay-as-you-go token-based billing), and [Mixtral-8x7B-Instruct-v01, Mixtral-8x7B-v01, Mistral-7B-Instruct-v01, and Mistral-7B-v01](../how-to/deploy-models-mistral-open.md) (that are available to download and run on self-hosted managed endpoints).
+- _Premium models_: These include Mistral Large, Mistral Small, Mistral-OCR-2503, and Ministral 3B models, and are available as standard deployments with a serverless pay-per-token offer.
+- _Open models_: These include Mistral-small-2503, Codestral, and Mistral Nemo (that are available as standard deployments with a serverless pay-per-token offer), and [Mixtral-8x7B-Instruct-v01, Mixtral-8x7B-v01, Mistral-7B-Instruct-v01, and Mistral-7B-v01](../how-to/deploy-models-mistral-open.md) (that are available to download and run on self-hosted managed endpoints).
 
 
 | Model | Type | Capabilities |

articles/ai-foundry/faq.yml
Lines changed: 1 addition & 1 deletion

@@ -53,7 +53,7 @@ sections:
   - question: |
       What is the billing model for standard deployments?
    answer: |
-      Azure AI Foundry offers pay-as-you-go inference APIs and hosted fine-tuning for [Llama 2 family models](how-to/deploy-models-llama.md). Currently, there's no extra charge for Azure AI Foundry outside of typical AI services and other Azure resource charges.
+      Azure AI Foundry offers standard deployment models and hosted fine-tuning for [Llama 2 family models](how-to/deploy-models-llama.md). Currently, there's no extra charge for Azure AI Foundry outside of typical AI services and other Azure resource charges.
   - question: |
       Can all models be secured with content filtering?
    answer: |

articles/ai-foundry/how-to/concept-data-privacy.md
Lines changed: 1 addition & 1 deletion

@@ -41,7 +41,7 @@ Although containers for **Curated by Azure AI** models are scanned for vulnerabi
 
 ## Generation of inferencing outputs as a standard deployment
 
-When you deploy a model from the model catalog (base or fine-tuned) by using standard deployments with pay-as-you-go billing for inferencing, an API is provisioned. The API gives you access to the model that the Azure Machine Learning service hosts and manages. Learn more about standard deployments in [Model catalog and collections](./model-catalog-overview.md).
+When you deploy a model from the model catalog (base or fine-tuned) by using standard deployments with a serverless pay-per-token offer for inferencing, an API is provisioned. The API gives you access to the model that the Azure Machine Learning service hosts and manages. Learn more about standard deployments in [Model catalog and collections](./model-catalog-overview.md).
 
 The model processes your input prompts and generates outputs based on its functionality, as described in the model details. Your use of the model (along with the provider's accountability for the model and its outputs) is subject to the license terms for the model. Microsoft provides and manages the hosting infrastructure and API endpoint. The models hosted in this *standard deployment* scenario are subject to Azure data, privacy, and security commitments. [Learn more about Azure compliance offerings applicable to Azure AI Foundry](https://servicetrust.microsoft.com/DocumentPage/7adf2d9e-d7b5-4e71-bad8-713e6a183cf3).

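As background for the change above: once a standard deployment provisions its API, calls look like ordinary chat-completion requests. A minimal sketch using the `azure-ai-inference` package, assuming placeholder endpoint and key values copied from the deployment's details page:

```python
# Minimal sketch: call the API provisioned for a standard deployment.
# The endpoint URL and key are placeholders; use your deployment's values.
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint="https://<your-deployment>.<region>.models.ai.azure.com",
    credential=AzureKeyCredential("<your-api-key>"),
)

response = client.complete(
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="In one sentence, what is a standard deployment?"),
    ],
)
print(response.choices[0].message.content)
```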
articles/ai-foundry/how-to/configure-private-link.md
Lines changed: 1 addition & 1 deletion

@@ -374,7 +374,7 @@ If you need to configure custom DNS server without DNS forwarding, use the foll
 * `<instance-name>-22.<region>.instances.azureml.ms` - Only used by the `az ml compute connect-ssh` command to connect to computers in a managed virtual network. Not needed if you aren't using a managed network or SSH connections.
 
 * `<managed online endpoint name>.<region>.inference.ml.azure.com` - Used by managed online endpoints
-* `models.ai.azure.com` - Used for deploying Models as a Service
+* `models.ai.azure.com` - Used for standard deployments
 
 To find the private IP addresses for your A records, see the [Azure Machine Learning custom DNS](/azure/machine-learning/how-to-custom-dns#find-the-ip-addresses) article.

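After creating the A records, it can help to confirm from a machine inside the virtual network that `models.ai.azure.com` resolves to your private A record rather than a public address. A quick standard-library check; the expected IP below is a placeholder for your record's value:

```python
# Sanity check from inside the virtual network: does models.ai.azure.com
# resolve to your private DNS A record? Replace the placeholder IP.
import socket

EXPECTED_PRIVATE_IP = "10.0.0.5"  # placeholder: your A record's private IP

resolved_ips = {info[4][0] for info in socket.getaddrinfo("models.ai.azure.com", 443)}
print("resolved:", resolved_ips)
if EXPECTED_PRIVATE_IP not in resolved_ips:
    print("warning: FQDN did not resolve to the expected private address")
```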
articles/ai-foundry/how-to/model-catalog-overview.md
Lines changed: 1 addition & 1 deletion

@@ -82,7 +82,7 @@ To view a list of supported models for standard deployment or Managed Compute, g
 
 <!-- docutune:enable -->
 
-:::image type="content" source="../media/explore/platform-service-cycle.png" alt-text="Diagram that shows models as a service and the service cycle of managed computes." lightbox="../media/explore/platform-service-cycle.png":::
+:::image type="content" source="../media/explore/platform-service-cycle.png" alt-text="Diagram that shows a standard deployment model and the service cycle of managed computes." lightbox="../media/explore/platform-service-cycle.png":::
 
 ## Model lifecycle: deprecation and retirement
 AI models evolve fast, and when a new version or a new model with updated capabilities in the same model family become available, older models may be retired in the AI Foundry model catalog. To allow for a smooth transition to a newer model version, some models provide users with the option to enable automatic updates. To learn more about the model lifecycle of different models, upcoming model retirement dates, and suggested replacement models and versions, see:

articles/ai-foundry/model-inference/how-to/quickstart-ai-project.md
Lines changed: 2 additions & 2 deletions

@@ -16,7 +16,7 @@ recommendations: false
 
 If you already have an AI project in Azure AI Foundry, the model catalog deploys models from third-party model providers as stand-alone endpoints in your project by default. Each model deployment has its own set of URI and credentials to access it. On the other hand, Azure OpenAI models are deployed to Azure AI Services resource or to the Azure OpenAI Service resource.
 
-You can change this behavior and deploy both types of models to Azure AI Foundry resources (formerly known Azure AI Services). Once configured, **deployments of Models as a Service models supporting pay-as-you-go billing happen to the connected Azure AI Services resource** instead to the project itself, giving you a single set of endpoint and credential to access all the models deployed in Azure AI Foundry. You can manage Azure OpenAI and third-party model providers models in the same way.
+You can change this behavior and deploy both types of models to Azure AI Foundry resources (formerly known as Azure AI Services). Once configured, **standard deployments of models happen to the connected Azure AI Services resource** instead of to the project itself, giving you a single endpoint and credential to access all the models deployed in Azure AI Foundry. You can manage Azure OpenAI and third-party providers' models in the same way.
 
 Additionally, deploying models to Azure AI Foundry Models brings the extra benefits of:

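Once deployments land on the connected Azure AI Services resource as described above, one endpoint and credential serves every model, and the model is selected per request. A minimal sketch using the `azure-ai-inference` package; the resource URL, key, and model names are placeholders:

```python
# Sketch: one connected Azure AI Services resource serves several model
# deployments; the `model` parameter routes each request. Names are placeholders.
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint="https://<your-resource>.services.ai.azure.com/models",
    credential=AzureKeyCredential("<your-api-key>"),
)

for model_name in ("mistral-large-2407", "Phi-3.5-mini-instruct"):
    response = client.complete(
        model=model_name,  # same endpoint and credential, different deployment
        messages=[UserMessage(content="Say hello in five words or fewer.")],
    )
    print(f"{model_name}: {response.choices[0].message.content}")
```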
@@ -188,7 +188,7 @@ For each model deployed as standard deployments, follow these steps:
 
 Consider the following limitations when configuring your project to use Azure AI model inference:
 
-* Only models supporting pay-as-you-go billing (Models as a Service) are available for deployment to Azure AI model inference. Models requiring compute quota from your subscription (Managed Compute), including custom models, can only be deployed within a given project as Managed Online Endpoints and continue to be accessible using their own set of endpoint URI and credentials.
+* Only models supporting pay-as-you-go billing (standard deployment) are available for deployment to Azure AI model inference. Models requiring compute quota from your subscription (Managed Compute), including custom models, can only be deployed within a given project as Managed Online Endpoints and continue to be accessible using their own endpoint URI and credentials.
 * Models available as both pay-as-you-go billing and managed compute offerings are, by default, deployed to Azure AI model inference in Azure AI services resources. Azure AI Foundry portal doesn't offer a way to deploy them to Managed Online Endpoints. You have to turn off the feature mentioned at [Configure the project to use Azure AI model inference](#configure-the-project-to-use-azure-ai-model-inference) or use the Azure CLI/Azure ML SDK/ARM templates to perform the deployment.
 
 ## Next steps

articles/machine-learning/concept-model-catalog.md
Lines changed: 2 additions & 2 deletions

@@ -47,7 +47,7 @@ Model Catalog offers two distinct ways to deploy models from the catalog for yo
 
 Features | Managed compute | standard deployment (pay-as-you-go)
 --|--|--
-Deployment experience and billing | Model weights are deployed to dedicated Virtual Machines with managed online endpoints. The managed online endpoint, which can have one or more deployments, makes available a REST API for inference. You're billed for the Virtual Machine core hours used by the deployments. | Access to models is through a deployment that provisions an API to access the model. The API provides access to the model hosted in a central GPU pool, managed by Microsoft, for inference. This mode of access is referred to as "Models as a Service". You're billed for inputs and outputs to the APIs, typically in tokens; pricing information is provided before you deploy.
+Deployment experience and billing | Model weights are deployed to dedicated Virtual Machines with managed online endpoints. The managed online endpoint, which can have one or more deployments, makes available a REST API for inference. You're billed for the Virtual Machine core hours used by the deployments. | Access to models is through a deployment that provisions an API to access the model. The API provides access to the model hosted in a central GPU pool, managed by Microsoft, for inference. This mode of access is referred to as "standard deployment". You're billed for inputs and outputs to the APIs, typically in tokens; pricing information is provided before you deploy.
 API authentication | Keys and Microsoft Entra ID authentication. [Learn more.](concept-endpoints-online-auth.md) | Keys only.
 Content safety | Use Azure Content Safety service APIs. | Azure AI Content Safety filters are available integrated with inference APIs. Azure AI Content Safety filters may be billed separately.
 Network isolation | Managed Virtual Network with Online Endpoints. [Learn more.](how-to-network-isolation-model-catalog.md) |
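The API authentication row distinguishes the offerings: managed compute accepts keys and Microsoft Entra ID, while standard deployment accepts keys only. A minimal sketch of constructing a client with each credential type, assuming the `azure-ai-inference` and `azure-identity` packages and placeholder endpoints; whether a given endpoint accepts Entra ID tokens depends on the offering, per the table:

```python
# Sketch of the two auth modes from the table. Endpoints are placeholders.
from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential
from azure.identity import DefaultAzureCredential

# Key-based authentication: the only option for standard deployments.
key_client = ChatCompletionsClient(
    endpoint="https://<endpoint>.<region>.inference.ml.azure.com",
    credential=AzureKeyCredential("<your-api-key>"),
)

# Microsoft Entra ID authentication: supported by managed compute
# (managed online endpoints), per the table above.
entra_client = ChatCompletionsClient(
    endpoint="https://<endpoint>.<region>.inference.ml.azure.com",
    credential=DefaultAzureCredential(),
)
```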
@@ -64,7 +64,7 @@ Phi-3 family models | Phi-3-mini-4k-Instruct <br> Phi-3-mini-128k-Instruct <br>
 Nixtla | Not available | TimeGEN-1
 Other models | Available | Not available
 
-:::image type="content" source="./media/concept-model-catalog/platform-service-cycle.png" alt-text="A diagram showing models as a service and Real time end points service cycle." lightbox="media/concept-model-catalog/platform-service-cycle.png":::
+:::image type="content" source="./media/concept-model-catalog/platform-service-cycle.png" alt-text="A diagram showing standard deployment and real-time endpoints service cycle." lightbox="media/concept-model-catalog/platform-service-cycle.png":::
 
 ## Managed compute

articles/machine-learning/how-to-custom-dns.md
Lines changed: 1 addition & 1 deletion

@@ -143,7 +143,7 @@ The following FQDNs are for Microsoft Azure operated by 21Vianet regions:
 
 * `<instance-name>-22.<region>.instances.azureml.cn` - Only used by the `az ml compute connect-ssh` command to connect to computes in a private virtual network. Not needed if you aren't using a managed network or SSH connections.
 * `<managed online endpoint name>.<region>.inference.ml.azure.cn` - Used by managed online endpoints
-* `models.ai.azure.com` - Used for deploying Models as a Service
+* `models.ai.azure.com` - Used for standard deployments
 
 #### Azure US Government
