
Commit 043b431

Merge pull request #2440 from eric-urban/eur/model-inference-PR-4

final set of new docs for model inference

2 parents 5d2598c + b80789d

17 files changed: +815 -10 lines changed
Lines changed: 33 additions & 0 deletions

@@ -0,0 +1,33 @@
---
title: 'How to configure content filters (preview) for models in Azure AI services'
titleSuffix: Azure AI Foundry
description: Learn how to use and configure the content filters that come with Azure AI Services, including getting approval for gated modifications.
manager: nitinme
ms.service: azure-ai-model-inference
ms.topic: how-to
ms.date: 1/21/2025
author: mrbullwinkle
ms.author: mbullwin
recommendations: false
ms.custom: ignite-2024, github-universe-2024
zone_pivot_groups: azure-ai-models-deployment
---

# How to configure content filters (preview) for models in Azure AI services

::: zone pivot="ai-foundry-portal"
[!INCLUDE [portal](../includes/configure-content-filters/portal.md)]
::: zone-end

::: zone pivot="programming-language-cli"
[!INCLUDE [cli](../includes/configure-content-filters/cli.md)]
::: zone-end

::: zone pivot="programming-language-bicep"
[!INCLUDE [bicep](../includes/configure-content-filters/bicep.md)]
::: zone-end

## Next steps

- Read more about [content filtering categories and severity levels](../concepts/content-filter.md) with Azure OpenAI Service.
- Learn more about red teaming from our [Introduction to red teaming large language models (LLMs)](../../../ai-services/openai/concepts/red-teaming.md) article.
Lines changed: 32 additions & 0 deletions

@@ -0,0 +1,32 @@
---
title: Configure a connection to use Azure AI model inference in your AI project
titleSuffix: Azure AI Foundry
description: Learn how to configure a connection to use Azure AI model inference in your project.
ms.service: azure-ai-model-inference
ms.topic: how-to
ms.date: 1/21/2025
ms.custom: ignite-2024, github-universe-2024
manager: nitinme
author: mrbullwinkle
ms.author: fasantia
recommendations: false
zone_pivot_groups: azure-ai-models-deployment
---

# Configure a connection to use Azure AI model inference in your AI project

::: zone pivot="ai-foundry-portal"
[!INCLUDE [portal](../includes/configure-project-connection/portal.md)]
::: zone-end

::: zone pivot="programming-language-cli"
[!INCLUDE [cli](../includes/configure-project-connection/cli.md)]
::: zone-end

::: zone pivot="programming-language-bicep"
[!INCLUDE [bicep](../includes/configure-project-connection/bicep.md)]
::: zone-end

## Next steps

* [Develop applications using Azure AI model inference service in Azure AI services](../supported-languages.md)
Lines changed: 32 additions & 0 deletions

@@ -0,0 +1,32 @@
---
title: Add and configure models to Azure AI services
titleSuffix: Azure AI Foundry
description: Learn how to add and configure new models to the Azure AI model inference endpoint in Azure AI services.
ms.service: azure-ai-model-inference
ms.topic: how-to
ms.date: 1/21/2025
ms.custom: ignite-2024, github-universe-2024
manager: nitinme
author: mrbullwinkle
ms.author: fasantia
recommendations: false
zone_pivot_groups: azure-ai-models-deployment
---

# Add and configure models to Azure AI model inference

::: zone pivot="ai-foundry-portal"
[!INCLUDE [portal](../includes/create-model-deployments/portal.md)]
::: zone-end

::: zone pivot="programming-language-cli"
[!INCLUDE [cli](../includes/create-model-deployments/cli.md)]
::: zone-end

::: zone pivot="programming-language-bicep"
[!INCLUDE [bicep](../includes/create-model-deployments/bicep.md)]
::: zone-end

## Next steps

* [Develop applications using Azure AI model inference service in Azure AI services](../supported-languages.md)
Lines changed: 57 additions & 0 deletions

@@ -0,0 +1,57 @@
---
title: Add and configure models to Azure AI model inference
titleSuffix: Azure AI Foundry for GitHub
description: Learn how to add and configure new models to the Azure AI model inference endpoint in Azure AI Foundry for GitHub.
ms.service: azure-ai-model-inference
ms.topic: how-to
ms.date: 1/21/2025
ms.custom: ignite-2024, github-universe-2024
manager: nitinme
author: mrbullwinkle
ms.author: fasantia
recommendations: false
---

# Add and configure models to Azure AI services

You can decide and configure which models are available for inference in the Azure AI model inference endpoint of your Azure AI services resource. When a given model is configured, you can then generate predictions from it by indicating its model name or deployment name in your requests. No further changes are required in your code to use it.

In this article, you learn how to add a new model to Azure AI model inference.

## Prerequisites

To complete this article, you need:

* An Azure subscription. If you're using [GitHub Models](https://docs.github.com/en/github-models/), you can upgrade your experience and create an Azure subscription in the process. Read [Upgrade from GitHub Models to Azure AI model inference](../quickstart-github-models.md) if that applies to you.
* An Azure AI services resource. For more information, see [Create an Azure AI Services resource](../../../../ai-services/multi-service-resource.md?context=/azure/ai-services/model-inference/context/context).

## Add a model

[!INCLUDE [add-model-deployments](../../includes/github/add-model-deployments.md)]

## Use the model

Deployed models in Azure AI services can be consumed using the [Azure AI model inference endpoint](../../concepts/endpoints.md) for the resource.

To use it:

1. Get the Azure AI model inference endpoint URL and keys from the **deployment page** or the **Overview** page. If you're using Microsoft Entra ID authentication, you don't need a key.

2. When constructing your request, indicate the parameter `model` and insert the model deployment name you created.

    [!INCLUDE [code-create-chat-completion](../../includes/code-create-chat-completion.md)]

3. When using the endpoint, you can change the `model` parameter to any available model deployment in your resource.
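As a quick illustration of these steps, here's a minimal sketch using the `azure-ai-inference` Python package. The environment variable names and the deployment name `my-deployment` are placeholders for this example, not values defined by the service.

```python
import os

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

# Placeholder values: point the client at your resource's Azure AI model
# inference endpoint and authenticate with its key (not needed with Entra ID).
client = ChatCompletionsClient(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_CREDENTIAL"]),
)

# `model` is the deployment name you created; pointing your code at another
# deployment only requires changing this value.
response = client.complete(
    model="my-deployment",
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="Explain how large language models work in one paragraph."),
    ],
)

print(response.choices[0].message.content)
```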
Additionally, Azure OpenAI models can be consumed using the [Azure OpenAI service endpoint](../../../../ai-services/openai/supported-languages.md) in the resource. This endpoint is exclusive to each model deployment and has its own URL.

## Model deployment customization

When creating model deployments, you can configure additional settings, including content filtering and rate limits. Select the option **Customize** in the deployment wizard to configure them. A scripted alternative is sketched after the following note.

> [!NOTE]
> Configurations may vary depending on the model you're deploying.
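If you prefer to script these customizations instead of using the wizard, the following sketch uses the `azure-mgmt-cognitiveservices` Python package. The subscription, resource, model, and policy names are all placeholders, and the example assumes a content filter policy named `my-content-filter` already exists on the resource.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.cognitiveservices import CognitiveServicesManagementClient
from azure.mgmt.cognitiveservices.models import (
    Deployment,
    DeploymentModel,
    DeploymentProperties,
    Sku,
)

client = CognitiveServicesManagementClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",  # placeholder
)

# Placeholders throughout: resource group, account, and model details must
# match your own resource. `rai_policy_name` attaches a content filter
# policy; `sku.capacity` controls the deployment's rate limit.
poller = client.deployments.begin_create_or_update(
    resource_group_name="<resource-group>",
    account_name="<ai-services-account>",
    deployment_name="Mistral-large",
    deployment=Deployment(
        properties=DeploymentProperties(
            model=DeploymentModel(
                format="Mistral AI",
                name="Mistral-large",
                version="2407",
            ),
            rai_policy_name="my-content-filter",  # assumed to exist
        ),
        sku=Sku(name="GlobalStandard", capacity=1),
    ),
)
deployment = poller.result()  # waits for the deployment to finish
```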
## Next steps

* [Develop applications using Azure AI model inference service in Azure AI services](../../supported-languages.md)
Lines changed: 59 additions & 0 deletions

@@ -0,0 +1,59 @@
---
title: How to use the Azure AI model inference endpoint to consume models
titleSuffix: Azure AI Foundry
description: Learn how to use the Azure AI model inference endpoint to consume models.
manager: scottpolly
author: msakande
reviewer: santiagxf
ms.service: azure-ai-model-inference
ms.topic: how-to
ms.date: 1/21/2025
ms.author: mopeakande
ms.reviewer: fasantia
---

# Use the Azure AI model inference endpoint to consume models

Azure AI model inference in Azure AI services allows customers to consume the most powerful models from flagship model providers using a single endpoint and credentials. This means that you can switch between models and consume them from your application without changing a single line of code.

This article explains how to use the inference endpoint to invoke the models deployed in your resource.

## Endpoints

Azure AI services expose multiple endpoints depending on the type of work you're looking for:

> [!div class="checklist"]
> * Azure AI model inference endpoint
> * Azure OpenAI endpoint

The **Azure AI inference endpoint** allows customers to use a single endpoint with the same authentication and schema to generate inference for the deployed models in the resource. All the models support this capability. This endpoint follows the [Azure AI model inference API](../../../ai-studio/reference/reference-model-inference-api.md).

**Azure OpenAI** models deployed to AI services also support the Azure OpenAI API. This endpoint exposes the full capabilities of OpenAI models and supports more features like assistants, threads, files, and batch inference.

To learn more about how to apply the **Azure OpenAI endpoint**, see the [Azure OpenAI service documentation](../../../ai-services/openai/overview.md).
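To contrast the two endpoints, here's a sketch that constructs a client for each one in Python. It assumes a resource placeholder named `<resource>`, and the URL formats shown are the ones these endpoints typically use; verify the exact URLs on your resource's **Overview** page.

```python
import os

from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential
from openai import AzureOpenAI

key = os.environ["AZURE_AI_SERVICES_KEY"]

# Azure AI model inference endpoint: one URL for every deployed model;
# the target deployment is selected per request via the `model` parameter.
inference_client = ChatCompletionsClient(
    endpoint="https://<resource>.services.ai.azure.com/models",
    credential=AzureKeyCredential(key),
)

# Azure OpenAI endpoint: used for Azure OpenAI deployments when you need
# OpenAI-specific features such as assistants, files, or batch inference.
openai_client = AzureOpenAI(
    azure_endpoint="https://<resource>.openai.azure.com",
    api_key=key,
    api_version="2024-10-21",
)
```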
## Using the routing capability in the Azure AI model inference endpoint

The inference endpoint routes requests to a given deployment by matching the parameter `model` inside of the request to the name of the deployment. This means that *deployments work as an alias of a given model under certain configurations*. This flexibility allows you to deploy a given model multiple times in the service, but under different configurations if needed.

:::image type="content" source="../media/endpoint/endpoint-routing.png" alt-text="An illustration showing how routing works for a Meta-llama-3.2-8b-instruct model by indicating such name in the parameter 'model' inside of the payload request." lightbox="../media/endpoint/endpoint-routing.png":::

For example, if you create a deployment named `Mistral-large`, then such deployment can be invoked as:

[!INCLUDE [code-create-chat-client](../includes/code-create-chat-client.md)]

For a chat model, you can create a request as follows:

[!INCLUDE [code-create-chat-completion](../includes/code-create-chat-completion.md)]

If you specify a model name that doesn't match any given model deployment, you get an error that the model doesn't exist. You can control which models are available for users by creating model deployments as explained in [add and configure model deployments](create-model-deployments.md).
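To make the routing behavior concrete, here's a minimal Python sketch using the `azure-ai-inference` package. The deployment names `Mistral-large` and `Mistral-large-hifilter` are illustrative placeholders; only the `model` value changes between the two calls.

```python
import os

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential
from azure.core.exceptions import HttpResponseError

client = ChatCompletionsClient(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_CREDENTIAL"]),
)

messages = [UserMessage(content="Summarize the plot of Hamlet in two sentences.")]

# Same client, same endpoint: the `model` value routes each request to the
# deployment with a matching name, so switching is a one-line change.
for deployment in ["Mistral-large", "Mistral-large-hifilter"]:
    try:
        response = client.complete(model=deployment, messages=messages)
        print(deployment, "->", response.choices[0].message.content)
    except HttpResponseError as ex:
        # A `model` value with no matching deployment returns an error.
        print(deployment, "failed:", ex.message)
```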
## Limitations

* Azure OpenAI Batch can't be used with the Azure AI model inference endpoint. You have to use the dedicated deployment URL as explained in [Batch API support in Azure OpenAI documentation](../../../ai-services/openai/how-to/batch.md#api-support).
* The Realtime API isn't supported in the inference endpoint. Use the dedicated deployment URL.

## Next steps

* [Use embedding models](use-embeddings.md)
* [Use chat completion models](use-chat-completions.md)
