articles/ai-foundry/model-inference/concepts/endpoints.md (22 additions & 28 deletions)
@@ -1,5 +1,5 @@
---
- title: Endpoint for Azure AI Foundry Models
+ title: Endpoints for Azure AI Foundry Models
titleSuffix: Azure AI Foundry
description: Learn about the Azure AI Foundry Models endpoint
author: santiagxf
@@ -11,7 +11,7 @@ ms.author: fasantia
ms.custom: ignite-2024, github-universe-2024
---

- # Endpoint for Azure AI Foundry Models
+ # Endpoints for Azure AI Foundry Models

Azure AI Foundry Models allows customers to consume the most powerful models from flagship model providers using a single endpoint and credentials. This means that you can switch between models and consume them from your application without changing a single line of code.
@@ -36,19 +36,21 @@ An Azure AI Foundry resource can have as many model deployments as needed and th
To learn more about how to create deployments see [Add and configure model deployments](../how-to/create-model-deployments.md).

- ## Foundry Models inference endpoint
+ ## Endpoints

- The Foundry Models inference endpoint allows customers to use a single endpoint with the same authentication and schema to generate inference for the deployed models in the resource. This endpoint follows the [Foundry Models API](.././reference/reference-model-inference-api.md) which all the models in Foundry Models support. It supports the following modalities:
+ Azure AI Foundry Services (formerly known as Azure AI Services) expose multiple endpoints, depending on the type of work you're looking for:

- * Text embeddings
- * Image embeddings
- * Chat completions
+ > [!div class="checklist"]
+ > * Azure AI inference endpoint (usually with the form `https://<resource-name>.services.ai.azure.com/models`)
+ > * Azure OpenAI endpoint (usually with the form `https://<resource-name>.openai.azure.com`)
+
+ The **Azure AI inference endpoint** allows customers to use a single endpoint with the same authentication and schema to generate inference for the deployed models in the resource. All the models support this capability. This endpoint follows the [Azure AI Model Inference API](.././reference/reference-model-inference-api.md).
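
As a hedged illustration of this single-endpoint pattern, a minimal sketch with the `azure-ai-inference` Python package; the resource name, environment variable, and the `Mistral-large` deployment name are placeholders:

```python
import os

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

# One endpoint and credential for every model deployment in the resource.
client = ChatCompletionsClient(
    endpoint="https://<resource-name>.services.ai.azure.com/models",
    credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_CREDENTIAL"]),
)

# `model` selects the deployment; switching models is a one-line change.
response = client.complete(
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="How many languages are in the world?"),
    ],
    model="Mistral-large",
)
print(response.choices[0].message.content)
```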
- You can see the endpoint URL and credentials in the **Overview** section:
+ The **Azure OpenAI API** exposes the full capabilities of OpenAI models and supports more features like assistants, threads, files, and batch inference. Non-OpenAI models may also be exposed in this route.

- :::image type="content" source="../media/overview/overview-endpoint-and-key.png" alt-text="Screenshot showing how to get the URL and key associated with the resource." lightbox="../media/overview/overview-endpoint-and-key.png":::
+ To learn more about how to apply the **Azure OpenAI endpoint**, see the [Azure OpenAI in Azure AI Foundry Models documentation](../../../ai-services/openai/overview.md).
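
As a companion sketch for the Azure OpenAI route, using the `openai` package's `AzureOpenAI` client; the `api_version` value and deployment name are assumptions:

```python
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<resource-name>.openai.azure.com",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",  # assumption: use a version your resource supports
)

# `model` is the name of your model deployment, not the base model name.
response = client.chat.completions.create(
    model="<deployment-name>",
    messages=[{"role": "user", "content": "How many languages are in the world?"}],
)
print(response.choices[0].message.content)
```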
- ### Routing
+ ## Using the Azure AI inference endpoint

The inference endpoint routes requests to a given deployment by matching the parameter `name` inside of the request to the name of the deployment. This means that *deployments work as an alias of a given model under certain configurations*. This flexibility allows you to deploy a given model multiple times in the service but under different configurations if needed.
@@ -58,32 +60,24 @@ For example, if you create a deployment named `Mistral-large`, then such deploym
For a chat model, you can create a request as follows:
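
The sample referenced here is collapsed in the diff. As a hedged sketch, a raw HTTP call against the inference route might look like the following; the `api-version` value is an assumption:

```python
import os

import requests

# Hypothetical request against the chat completions route of the
# Azure AI inference endpoint; `Mistral-large` must match a deployment name.
response = requests.post(
    "https://<resource-name>.services.ai.azure.com/models/chat/completions",
    params={"api-version": "2024-05-01-preview"},  # assumption
    headers={"api-key": os.environ["AZURE_INFERENCE_CREDENTIAL"]},
    json={
        "model": "Mistral-large",
        "messages": [{"role": "user", "content": "How many languages are in the world?"}],
    },
)
print(response.json()["choices"][0]["message"]["content"])
```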
- ### SDKs
-
- The Foundry Models endpoint is supported by multiple SDKs, including the **Azure AI Inference SDK**, the **Azure AI Foundry SDK**, and the **Azure OpenAI SDK**; which are available in multiple languages. Multiple integrations are also supported in popular frameworks like LangChain, LangGraph, Llama-Index, Semantic Kernel, and AG2. See [supported programming languages and SDKs](../supported-languages.md) for details.
-
- ## Azure OpenAI inference endpoint
-
- Azure OpenAI models deployed to AI services also support the Azure OpenAI API. This API exposes the full capabilities of OpenAI models and supports additional features like assistants, threads, files, and batch inference.
- Azure OpenAI inference endpoints work at the deployment level and they have their own URL that is associated with each of them. However, the same authentication mechanism can be used to consume them. Learn more in the reference page for [Azure OpenAI API](../../../ai-services/openai/reference.md)
+ If you specify a model name that doesn't match any given model deployment, you get an error that the model doesn't exist. You can control which models are available for users by creating model deployments as explained at [add and configure model deployments](../how-to/create-model-deployments.md).
- :::image type="content" source="../media/endpoint/endpoint-openai.png" alt-text="An illustration showing how Azure OpenAI deployments contain a single URL for each deployment." lightbox="../media/endpoint/endpoint-openai.png":::
+ ## Key-less authentication

- Each deployment has a URL that is the concatenations of the **Azure OpenAI** base URL and the route `/deployments/<model-deployment-name>`.
+ Models deployed to Azure AI Foundry Models in Azure AI Services support key-less authorization using Microsoft Entra ID. Key-less authorization enhances security, simplifies the user experience, reduces operational complexity, and provides robust compliance support for modern development, making it a strong choice for organizations adopting secure and scalable identity management solutions.

- > [!IMPORTANT]
- > There's no routing mechanism for the Azure OpenAI endpoint, as each URL is exclusive for each model deployment.
+ To use key-less authentication, [configure your resource and grant access to users](../how-to/configure-entra-id.md) to perform inference. Once configured, you can authenticate as follows:
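
The authentication sample is collapsed in the diff; a minimal key-less sketch with `azure-identity`, assuming the client accepts token credentials with the Cognitive Services scope:

```python
from azure.ai.inference import ChatCompletionsClient
from azure.identity import DefaultAzureCredential

# Key-less: tokens are acquired and refreshed through Microsoft Entra ID,
# so no API key is stored in the application.
client = ChatCompletionsClient(
    endpoint="https://<resource-name>.services.ai.azure.com/models",
    credential=DefaultAzureCredential(),
    credential_scopes=["https://cognitiveservices.azure.com/.default"],  # assumption
)

response = client.complete(
    messages=[{"role": "user", "content": "How many languages are in the world?"}],
    model="Mistral-large",
)
print(response.choices[0].message.content)
```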
- The Azure OpenAI endpoint is supported by the **OpenAI SDK (`AzureOpenAI` class)** and **Azure OpenAI SDKs**, which are available in multiple languages. See [supported languages](../supported-languages.md#azure-openai-models) for details.
+ ## Limitations

+ * Azure OpenAI Batch can't be used with the Foundry Models endpoint. You have to use the dedicated deployment URL as explained at [Batch API support in Azure OpenAI documentation](../../../ai-services/openai/how-to/batch.md#api-support).
+ * Real-time API isn't supported in the inference endpoint. Use the dedicated deployment URL.
articles/ai-foundry/model-inference/faq.yml (2 additions & 2 deletions)
@@ -17,9 +17,9 @@ sections:
- name: General
  questions:
  - question: |
-     What's the difference between Azure OpenAI and Foundry Models?
+     What's the difference between Azure OpenAI and Azure AI Foundry Models?
    answer: |
-     Azure OpenAI gives customers access to advanced language models from OpenAI. Foundry Models extends such capability giving customers access to all the flagship models in Azure AI Foundry under the same service, endpoint, and credentials. It includes Azure OpenAI, Cohere, Mistral AI, Meta Llama, AI21 labs, etc. Customers can seamlessly switch between models without changing their code.
+     Azure OpenAI gives customers access to advanced language models from OpenAI. Azure AI Foundry Models extends such capability giving customers access to all the flagship models in Azure AI Foundry under the same service, endpoint, and credentials. It includes Azure OpenAI, Cohere, Mistral AI, Meta Llama, AI21 labs, etc. Customers can seamlessly switch between models without changing their code.

      Azure OpenAI is included in the Models Sold Directly by Azure family in Foundry Models.
articles/ai-foundry/model-inference/how-to/inference.md (21 additions & 27 deletions)
@@ -1,5 +1,5 @@
---
- title: How to use the Azure AI Foundry Models inference endpoint to consume models
+ title: How to use the Azure AI Foundry Models inference endpoints to consume models
titleSuffix: Azure AI Foundry
description: Learn how to use the Azure AI Foundry Models inference endpoint to consume models
manager: scottpolly
@@ -12,27 +12,19 @@ ms.author: mopeakande
ms.reviewer: fasantia
---

- # Use the Azure AI Foundry Models inference endpoints
+ # Use Foundry Models

- Azure AI Foundry Models allows customers to consume the most powerful models from flagship model providers using a single endpoint and credentials. This means that you can switch between models and consume them from your application without changing a single line of code.
+ Once you have [deployed a model in Azure AI Foundry](create-model-deployments.md), you can consume its capabilities via Azure AI Foundry APIs. There are two different endpoints and APIs to use models in Azure AI Foundry Models.

- This article explains how to use the inference endpoint to invoke them.
+ ## Models inference endpoint

- ## Endpoints
+ The models inference endpoint (usually with the form `https://<resource-name>.services.ai.azure.com/models`) allows customers to use a single endpoint with the same authentication and schema to generate inference for the deployed models in the resource. This endpoint follows the [Azure AI Model Inference API](.././reference/reference-model-inference-api.md), which all the models in Foundry Models support. It supports the following modalities:

- Azure AI Foundry Services (formerly known Azure AI Services) expose multiple endpoints depending on the type of work you're looking for:
+ * Text embeddings
+ * Image embeddings
+ * Chat completions
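
For instance, the embeddings modality uses the same endpoint and credentials with a different client; a minimal sketch with the `azure-ai-inference` package, where the deployment name is a placeholder:

```python
import os

from azure.ai.inference import EmbeddingsClient
from azure.core.credentials import AzureKeyCredential

# Same endpoint and credential as chat; only the client type changes.
client = EmbeddingsClient(
    endpoint="https://<resource-name>.services.ai.azure.com/models",
    credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_CREDENTIAL"]),
)

response = client.embed(
    input=["The quick brown fox jumps over the lazy dog"],
    model="<embedding-deployment-name>",  # placeholder deployment name
)
print(len(response.data[0].embedding))
```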
- > [!div class="checklist"]
- > * Foundry Models endpoint
- > * Azure OpenAI endpoint
-
- The **Azure AI inference endpoint** (usually with the form `https://<resource-name>.services.ai.azure.com/models`) allows customers to use a single endpoint with the same authentication and schema to generate inference for the deployed models in the resource. All the models support this capability. This endpoint follows the [Foundry Models API](.././reference/reference-model-inference-api.md).
-
- **Azure OpenAI** models deployed to AI services also support the Azure OpenAI API (usually with the form `https://<resource-name>.openai.azure.com`). This endpoint exposes the full capabilities of OpenAI models and supports more features like assistants, threads, files, and batch inference.
-
- To learn more about how to apply the **Azure OpenAI endpoint** see [Azure OpenAI in Azure AI Foundry Models documentation](../../../ai-services/openai/overview.md).
-
- ## Using the routing capability in the Foundry Models endpoint
+ ### Routing

The inference endpoint routes requests to a given deployment by matching the parameter `name` inside of the request to the name of the deployment. This means that *deployments work as an alias of a given model under certain configurations*. This flexibility allows you to deploy a given model multiple times in the service but under different configurations if needed.
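
As a sketch of this alias behavior, assuming two hypothetical deployments of the same underlying model under different configurations:

```python
import os

from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint="https://<resource-name>.services.ai.azure.com/models",
    credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_CREDENTIAL"]),
)

# Hypothetical: the same model deployed twice, e.g. with different
# content-filter configurations; the `model` value picks the deployment.
for deployment in ("mistral-large-strict", "mistral-large-default"):
    response = client.complete(
        messages=[{"role": "user", "content": "Hello!"}],
        model=deployment,
    )
    print(deployment, "->", response.choices[0].message.content[:60])
```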
@@ -42,24 +34,26 @@ For example, if you create a deployment named `Mistral-large`, then such deploym
If you specify a model name that doesn't match any given model deployment, you get an error that the model doesn't exist. You can control which models are available for users by creating model deployments as explained at [add and configure model deployments](create-model-deployments.md).

+ > [!TIP]
+ > Deployment routing isn't case sensitive.
+
+ ## Azure OpenAI inference endpoint
+
+ Azure AI Foundry also supports the Azure OpenAI API. This API exposes the full capabilities of OpenAI models and supports additional features like assistants, threads, files, and batch inference. Non-OpenAI models can also be used for compatible functionalities.
- ## Key-less authentication
+ Azure OpenAI endpoints (usually with the form `https://<resource-name>.openai.azure.com`) work at the deployment level, and each has its own URL associated with it. However, the same authentication mechanism can be used to consume them. Learn more in the reference page for the [Azure OpenAI API](../../../ai-services/openai/reference.md).

- Models deployed to Azure AI Foundry Models in Azure AI Services support key-less authorization using Microsoft Entra ID. Key-less authorization enhances security, simplifies the user experience, reduces operational complexity, and provides robust compliance support for modern development. It makes it a strong choice for organizations adopting secure and scalable identity management solutions.
+ :::image type="content" source="../media/endpoint/endpoint-openai.png" alt-text="An illustration showing how Azure OpenAI deployments contain a single URL for each deployment." lightbox="../media/endpoint/endpoint-openai.png":::

- To use key-less authentication, [configure your resource and grant access to users](configure-entra-id.md) to perform inference. Once configured, then you can authenticate as follows:
+ Each deployment has a URL that is the concatenation of the **Azure OpenAI** base URL and the route `/deployments/<model-deployment-name>`.
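
As a hedged sketch of consuming such a deployment-scoped URL with the same Entra ID mechanism, using `azure-identity`'s bearer-token provider; the API version is an assumption:

```python
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

# Requests are routed under the resource's base URL at
# /deployments/<model-deployment-name>; the SDK builds that path for you.
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)
client = AzureOpenAI(
    azure_endpoint="https://<resource-name>.openai.azure.com",
    azure_ad_token_provider=token_provider,
    api_version="2024-06-01",  # assumption
)

response = client.chat.completions.create(
    model="<model-deployment-name>",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```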
- * Azure OpenAI Batch can't be used with the Foundry Models endpoint. You have to use the dedicated deployment URL as explained at [Batch API support in Azure OpenAI documentation](../../../ai-services/openai/how-to/batch.md#api-support).
- * Real-time API isn't supported in the inference endpoint. Use the dedicated deployment URL.
> * [Key-less authentication](configure-entra-id.md) with role-based access control.
@@ -79,7 +79,7 @@ To configure the project to use the Foundry Models capability in Azure AI Foundr
:::image type="content" source="../media/quickstart-ai-project/overview-endpoint-and-key.png" alt-text="Screenshot of the landing page for the project, highlighting the location of the connected resource and the associated inference endpoint." lightbox="../media/quickstart-ai-project/overview-endpoint-and-key.png":::

> [!TIP]
- > Each Azure AI Foundry Services resource has a single **Foundry Models endpoint** which can be used to access any model deployment on it. The same endpoint serves multiple models depending on which ones are configured. Learn about [how the endpoint works](../concepts/endpoints.md#azure-openai-inference-endpoint).
+ > Each Azure AI Foundry Services resource has a single **Foundry Models endpoint**, which can be used to access any model deployment on it. The same endpoint serves multiple models depending on which ones are configured. Learn about [how the endpoint works](inference.md#azure-openai-inference-endpoint).

5. Take note of the endpoint URL and credentials.
@@ -136,7 +136,7 @@ Generate your first chat completion:
- Use the parameter `model="<deployment-name>` to route your request to this deployment. *Deployments work as an alias of a given model under certain configurations*. See [Routing](../concepts/endpoints.md#routing)concept page to learn how Azure AI Services route deployments.
+ Use the parameter `model="<deployment-name>"` to route your request to this deployment. *Deployments work as an alias of a given model under certain configurations*. See the [Routing](inference.md#routing) page to learn how Azure AI Foundry Models routes deployments.
## Move from standard deployments to Foundry Models
- Use the parameter `model="<deployment-name>` to route your request to this deployment. *Deployments work as an alias of a given model under certain configurations*. See [Routing](../concepts/endpoints.md#routing) concept page to learn how Azure AI Services route deployments.
+ Use the parameter `model="<deployment-name>"` to route your request to this deployment. *Deployments work as an alias of a given model under certain configurations*. See the [Routing](inference.md#routing) concept page to learn how Azure AI Services route deployments.

> [!IMPORTANT]
> As opposed to GitHub Models, where all the models are already configured, the Azure AI Services resource allows you to control which models are available in your endpoint and under which configuration. Add as many models as you plan to use before indicating them in the `model` parameter. Learn how to [add more models](create-model-deployments.md) to your resource.