You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Azure AI inference service in Azure AI services allow customers to consume the most powerful models from flagship model providers using a single endpoint and credentials. This means that you can switch between models and consume them from your application without changing a single line of code.
17
+
Azure AI inference service in Azure AI services allows customers to consume the most powerful models from flagship model providers using a single endpoint and credentials. This means that you can switch between models and consume them from your application without changing a single line of code.
18
18
19
19
The article explains how models are organized inside of the service and how to use the inference endpoint to invoke them.
20
20
21
21
## Deployments
22
22
23
-
Azure AI model inference service make models available using the **deployment** concept. **Deployments** are a way to give a model a name under certain configurations. Then, you can invoke such model configuration by indicating its name on your requests.
23
+
Azure AI model inference service makes models available using the **deployment** concept. **Deployments** are a way to give a model a name under certain configurations. Then, you can invoke such model configuration by indicating its name on your requests.
24
24
25
25
Deployments capture:
26
26
@@ -39,15 +39,15 @@ To learn more about how to create deployments see [Add and configure model deplo
39
39
40
40
## Azure AI inference endpoint
41
41
42
-
The Azure AI inference endpoint allow customers to use a single endpoint with the same authentication and schema to generate inference for the deployed models in the resource. This endpoint follows the [Azure AI model inference API](../../reference/reference-model-inference-api.md) which is supported by all the models in Azure AI model inference service.
42
+
The Azure AI inference endpoint allows customers to use a single endpoint with the same authentication and schema to generate inference for the deployed models in the resource. This endpoint follows the [Azure AI model inference API](../../reference/reference-model-inference-api.md) which is supported by all the models in Azure AI model inference service.
43
43
44
-
You can see the endpoint URL and credentials in the **Overview** section:
44
+
You can see the endpoint URL and credentials in the **Overview** section. The endpoint usually has the form `https://<resource-name>.services.ai.azure.com/models`:
45
45
46
46
:::image type="content" source="../../media/ai-services/overview/overview-endpoint-and-key.png" alt-text="An screenshot showing how to get the URL and key associated with the resource." lightbox="../../media/ai-services/overview/overview-endpoint-and-key.png":::
47
47
48
48
### Routing
49
49
50
-
The inference endpoint routes requests to a given deployment by matching the parameter `name` inside of the request to the name of the deployment. This means that *deployments work as an alias of a given model under certain configurations*. This flexibility allow you to deploy a given model multiple times in the service but under different configurations if needed.
50
+
The inference endpoint routes requests to a given deployment by matching the parameter `name` inside of the request to the name of the deployment. This means that *deployments work as an alias of a given model under certain configurations*. This flexibility allows you to deploy a given model multiple times in the service but under different configurations if needed.
51
51
52
52
:::image type="content" source="../../media/ai-services/endpoint/endpoint-routing.png" alt-text="An illustration showing how routing works for a Meta-llama-3.2-8b-instruct model by indicating such name in the parameter 'model' inside of the payload request." lightbox="../../media/ai-services/endpoint/endpoint-routing.png":::
53
53
@@ -71,9 +71,9 @@ All models deployed in Azure AI model inference service support the [Azure AI mo
71
71
72
72
## Azure OpenAI inference endpoint
73
73
74
-
Azure OpenAI models also support the Azure OpenAI API. This API exposes the full capabilities of OpenAI models and support additional features like assistants, threads, files, and batch inference.
74
+
Azure OpenAI models also support the Azure OpenAI API. This API exposes the full capabilities of OpenAI models and supports additional features like assistants, threads, files, and batch inference.
75
75
76
-
Azure OpenAI inference endpoints are used per-deployment and they have they own URL that is associated with only one deployment. However, the same authentication mechanism can be used to consume it. Learn more in the reference page for [Azure OpenAI API](../../../ai-services/openai/reference.md)
76
+
Each OpenAI model deployment has its own URL associated with such deployment under the Azure OpenAI inference endpoint. However, the same authentication mechanism can be used to consume it. URLs are usually in the form of `https://<resource-name>.openai.azure.com/openai/deployments/<model-deployment-name>`. Learn more in the reference page for [Azure OpenAI API](../../../ai-services/openai/reference.md)
77
77
78
78
:::image type="content" source="../../media/ai-services/endpoint/endpoint-openai.png" alt-text="An illustration showing how Azure OpenAI deployments contain a single URL for each deployment." lightbox="../../media/ai-services/endpoint/endpoint-openai.png":::
Copy file name to clipboardExpand all lines: articles/ai-studio/ai-services/faq.yml
+3-3Lines changed: 3 additions & 3 deletions
Original file line number
Diff line number
Diff line change
@@ -40,7 +40,7 @@ sections:
40
40
- question: |
41
41
What's the difference between Azure AI model inference service and Serverless API model deployments in Azure AI studio?
42
42
answer: |
43
-
Both technologies allow you to deploy models without requiring compute resources as they are based on the Models as a Service idea. [Serverless API model deployments](../how-to/deploy-models-serverless.md) allow you to deploy a single models under a unique endpoint and credentials. You need to create a different endpoint for each model you want to deploy. On top of that, they are always created in the context of the project and while they can be shared by creating connections from other projects, they live in the context of a given project.
43
+
Both technologies allow you to deploy models without requiring compute resources as they are based on the Models as a Service idea. [Serverless API model deployments](../how-to/deploy-models-serverless.md) allow you to deploy a single model under a unique endpoint and credentials. You need to create a different endpoint for each model you want to deploy. On top of that, they are always created in the context of the project and while they can be shared by creating connections from other projects, they live in the context of a given project.
44
44
45
45
Azure AI model inference service allows you to deploy multiple models under the same endpoint and credentials. You can switch between models without changing your code. They are also in the context of a shared resource, the Azure AI Services resource, which implies you can connect the resource to any project or hub that requires to consume the models you made available. Azure AI model inference service comes with a built-in model routing capability that routes the request to the right model based on the model name you pass in the request.
46
46
@@ -79,7 +79,7 @@ sections:
79
79
- question: |
80
80
I'm making a request for a model that Azure AI model inference service supports, but I'm getting a 404 error. What should I do?
81
81
answer: |
82
-
Ensure you created a deployment for the given model and that the deployment name matches **exactly** the value you're passing in `model` parameter. Although routing isn't case sensitive, ensure there's no special punctuation or spaces as they're common mistakes.
82
+
Ensure you created a deployment for the given model and that the deployment name matches **exactly** the value you're passing in `model` parameter. Although routing isn't case sensitive, ensure there's no special punctuation or spaces typos.
83
83
- question: |
84
84
I'm using the `azure-ai-inference` package for Python and I get a 401 error when I try to authenticate using keys. What should I do?
85
85
answer: |
@@ -115,5 +115,5 @@ sections:
115
115
- question: |
116
116
Do you use my company data to train any of the models?
117
117
answer: |
118
-
Azure AI model inference don't use customer data to retrain models. Your data is never shared with model providers.
118
+
Azure AI model inference doesn't use customer data to retrain models. Your data is never shared with model providers.
Copy file name to clipboardExpand all lines: articles/ai-studio/ai-services/how-to/create-model-deployments.md
+1-1Lines changed: 1 addition & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -40,7 +40,7 @@ To use it:
40
40
41
41
:::image type="content" source="../../media/ai-services/add-model-deployments/models-deploy-endpoint-url.png" alt-text="An screenshot showing how to get the URL and key associated with the deployment." lightbox="../../media/ai-services/add-model-deployments/models-deploy-endpoint-url.png":::
42
42
43
-
2. Use the model inference endpoint URL and the keys from before when constructing your client. The following examples uses the Azure AI Inference package:
43
+
2. Use the model inference endpoint URL and the keys from before when constructing your client. The following example uses the Azure AI Inference package:
Copy file name to clipboardExpand all lines: articles/ai-studio/ai-services/how-to/quickstart-github-models.md
+8-8Lines changed: 8 additions & 8 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -14,7 +14,7 @@ recommendations: false
14
14
15
15
# Upgrade from GitHub Models to the Azure AI model inference service
16
16
17
-
If you want to develop a generative AI application, you can use [GitHub Models](https://docs.github.com/en/github-models/) to find and experiment with AI models for free. The playground and free API usage are [rate limited](https://docs.github.com/en/github-models/prototyping-with-ai-models#rate-limits) by requests per minute, requests per day, tokens per request, and concurrent requests. If you get rate limited, you will need to wait for the rate limit that you hit to reset before you can make more requests.
17
+
If you want to develop a generative AI application, you can use [GitHub Models](https://docs.github.com/en/github-models/) to find and experiment with AI models for free. The playground and free API usage are [rate limited](https://docs.github.com/en/github-models/prototyping-with-ai-models#rate-limits) by requests per minute, requests per day, tokens per request, and concurrent requests. If you get rate limited, you need to wait for the rate limit that you hit to reset before you can make more requests.
18
18
19
19
Once you're ready to bring your application to production, you can upgrade your experience by deploying an Azure AI Services resource in an Azure subscription and start using the Azure AI model inference service. You don't need to change anything else in your code.
20
20
@@ -52,18 +52,18 @@ To obtain the key and endpoint:
52
52
53
53
:::image type="content" source="../../media/ai-services/add-model-deployments/models-deploy-endpoint-url.png" alt-text="An screenshot showing how to get the URL and key associated with the deployment." lightbox="../../media/ai-services/add-model-deployments/models-deploy-endpoint-url.png":::
54
54
55
-
At this point, the model you selected will be ready to consume.
55
+
At this point, the model you selected is ready to consume.
56
56
57
57
> [!TIP]
58
58
> Use the parameter `model="<deployment-name>` to route your request to this deployment. *Deployments work as an alias of a given model under certain configurations*. See [Routing](../concepts/endpoints.md#routing) concept page to learn how Azure AI Services route deployments.
59
59
60
60
## Upgrade your code to use the new endpoint
61
61
62
-
Once your Azure AI Services resource is configured, you can start consuming it from your code. You will need the endpoint URL and key for it, which can be found in the **Overview** section:
62
+
Once your Azure AI Services resource is configured, you can start consuming it from your code. You need the endpoint URL and key for it, which can be found in the **Overview** section:
63
63
64
64
:::image type="content" source="../../media/ai-services/overview/overview-endpoint-and-key.png" alt-text="An screenshot showing how to get the URL and key associated with the resource." lightbox="../../media/ai-services/overview/overview-endpoint-and-key.png":::
65
65
66
-
You can use any of the supported SDKs to get predictions out from the endpoint. The following SDKs are officially supported:
66
+
You can use any of the supported SDK's to get predictions out from the endpoint. The following SDK's are officially supported:
67
67
68
68
* OpenAI SDK
69
69
* Azure OpenAI SDK
@@ -77,14 +77,14 @@ Generate your first chat completion:
0 commit comments