
Commit 46067ad

fix: review
1 parent 4025a6c commit 46067ad

File tree: 13 files changed (+43, -38 lines)

articles/ai-foundry/model-inference/how-to/inference.md
Lines changed: 1 addition & 1 deletion

@@ -34,7 +34,7 @@ To learn more about how to apply the **Azure OpenAI endpoint** see [Azure OpenAI
 
 ## Using the routing capability in the Azure AI model inference endpoint
 
-The inference endpoint routes requests to a given deployment by matching the parameter `name` inside of the request to the name of the deployment. This means that *deployments work as an alias of a given model under certain configurations*. This flexibility allows you to deploy a given model multiple times in the service but under different configurations if needed.
+The inference endpoint routes requests to a given deployment by matching the parameter `name` inside of the request to the name of the deployment. This means that *deployments work as an alias of a given model under certain configurations*. This flexibility allows you to deploy a given model multiple times in the service but under different configurations if needed. The inference endpoint usually has the form `https://<resource-name>.services.ai.azure.com/models`.
 
 :::image type="content" source="../media/endpoint/endpoint-routing.png" alt-text="An illustration showing how routing works for a Meta-llama-3.2-8b-instruct model by indicating such name in the parameter 'model' inside of the payload request." lightbox="../media/endpoint/endpoint-routing.png":::
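The routing behavior this hunk documents can be sketched locally. The helper below is illustrative only: the resource name `my-resource` and the deployment name are hypothetical placeholders, not values from the commit, and the function simply builds the URL and payload that the service would match the `model` field against.

```python
import json

def build_chat_request(resource: str, deployment: str, prompt: str) -> tuple[str, str]:
    """Build the request URL and JSON body; the service routes the call
    to the deployment whose name matches the "model" field."""
    url = (f"https://{resource}.services.ai.azure.com/models"
           f"/chat/completions?api-version=2024-05-01-preview")
    body = json.dumps({
        "model": deployment,  # deployment name acts as the model alias
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, body

url, body = build_chat_request("my-resource", "Meta-Llama-3.2-8B-Instruct", "Hello")
print(url)
```

Because deployments act as aliases, the same underlying model deployed twice under different names would be addressed by two different `model` values against the same URL.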

articles/ai-foundry/model-inference/includes/code-create-chat-client-entra.md
Lines changed: 5 additions & 5 deletions

@@ -26,7 +26,7 @@ from azure.ai.inference import ChatCompletionsClient
 from azure.identity import AzureDefaultCredential
 
 model = ChatCompletionsClient(
-    endpoint=os.environ["AZUREAI_ENDPOINT_URL"],
+    endpoint="https://<resource>.services.ai.azure.com/models",
     credential=AzureDefaultCredential(),
     model="mistral-large-2407",
 )

@@ -48,7 +48,7 @@ import { isUnexpected } from "@azure-rest/ai-inference";
 import { AzureDefaultCredential } from "@azure/identity";
 
 const client = new ModelClient(
-    process.env.AZUREAI_ENDPOINT_URL,
+    "https://<resource>.services.ai.azure.com/models",
     new AzureDefaultCredential(),
     "mistral-large-2407"
 );

@@ -80,7 +80,7 @@ Then, you can use the package to consume the model. The following example shows
 
 ```csharp
 ChatCompletionsClient client = new ChatCompletionsClient(
-    new Uri(Environment.GetEnvironmentVariable("AZURE_INFERENCE_ENDPOINT")),
+    new Uri("https://<resource>.services.ai.azure.com/models"),
     new AzureDefaultCredential(includeInteractiveCredentials: true),
     "mistral-large-2407"
 );

@@ -108,7 +108,7 @@ Then, you can use the package to consume the model. The following example shows
 ```java
 ChatCompletionsClient client = new ChatCompletionsClientBuilder()
     .credential(new DefaultAzureCredential()))
-    .endpoint("{endpoint}")
+    .endpoint("https://<resource>.services.ai.azure.com/models")
     .model("mistral-large-2407")
     .buildClient();
 ```

@@ -122,7 +122,7 @@ Use the reference section to explore the API design and which parameters are ava
 __Request__
 
 ```HTTP/1.1
-POST models/chat/completions?api-version=2024-04-01-preview
+POST https://<resource>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
 Authorization: Bearer <bearer-token>
 Content-Type: application/json
 ```

articles/ai-foundry/model-inference/includes/code-create-chat-client.md
Lines changed: 4 additions & 4 deletions

@@ -26,7 +26,7 @@ from azure.ai.inference import ChatCompletionsClient
 from azure.core.credentials import AzureKeyCredential
 
 model = ChatCompletionsClient(
-    endpoint=os.environ["AZUREAI_ENDPOINT_URL"],
+    endpoint="https://<resource>.services.ai.azure.com/models",
     credential=AzureKeyCredential(os.environ["AZUREAI_ENDPOINT_KEY"]),
 )
 ```

@@ -49,7 +49,7 @@ import { isUnexpected } from "@azure-rest/ai-inference";
 import { AzureKeyCredential } from "@azure/core-auth";
 
 const client = new ModelClient(
-    process.env.AZUREAI_ENDPOINT_URL,
+    "https://<resource>.services.ai.azure.com/models",
     new AzureKeyCredential(process.env.AZUREAI_ENDPOINT_KEY)
 );
 ```

@@ -76,7 +76,7 @@ Then, you can use the package to consume the model. The following example shows
 
 ```csharp
 ChatCompletionsClient client = new ChatCompletionsClient(
-    new Uri(Environment.GetEnvironmentVariable("AZURE_INFERENCE_ENDPOINT")),
+    new Uri("https://<resource>.services.ai.azure.com/models"),
    new AzureKeyCredential(Environment.GetEnvironmentVariable("AZURE_INFERENCE_CREDENTIAL"))
 );
 ```

@@ -114,7 +114,7 @@ Use the reference section to explore the API design and which parameters are ava
 __Request__
 
 ```HTTP/1.1
-POST models/chat/completions?api-version=2024-04-01-preview
+POST https://<resource>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
 Authorization: Bearer <bearer-token>
 Content-Type: application/json
 ```
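The two credential styles shown across these client files (key-based here, Microsoft Entra ID in the previous file) differ only in which auth header accompanies the request. A minimal sketch, with a hypothetical helper name not taken from the commit, of how the two header sets would be assembled:

```python
def auth_headers(key=None, token=None):
    """Return request headers for either key-based or Microsoft Entra ID
    authentication. Exactly one of `key` or `token` must be given."""
    headers = {"Content-Type": "application/json"}
    if key is not None:
        headers["api-key"] = key  # key-based authentication
    elif token is not None:
        headers["Authorization"] = f"Bearer {token}"  # Entra ID access token
    else:
        raise ValueError("provide either a key or a token")
    return headers

print(auth_headers(key="abc123"))
```

The SDKs above hide this distinction behind `AzureKeyCredential` versus `DefaultAzureCredential`; the sketch only makes the underlying wire difference visible.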

articles/ai-foundry/model-inference/includes/code-create-chat-completion.md
Lines changed: 1 addition & 1 deletion

@@ -77,7 +77,7 @@ for (ChatChoice choice : chatCompletions.getChoices()) {
 __Request__
 
 ```HTTP/1.1
-POST models/chat/completions?api-version=2024-04-01-preview
+POST https://<resource>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
 Authorization: Bearer <bearer-token>
 Content-Type: application/json
 ```

articles/ai-foundry/model-inference/includes/code-create-embeddings-client.md
Lines changed: 7 additions & 7 deletions

@@ -26,7 +26,7 @@ from azure.ai.inference import EmbeddingsClient
 from azure.core.credentials import AzureKeyCredential
 
 client = EmbeddingsClient(
-    endpoint=os.environ["AZUREAI_ENDPOINT_URL"],
+    endpoint="https://<resource>.services.ai.azure.com/models",
     credential=AzureKeyCredential(os.environ["AZUREAI_ENDPOINT_KEY"]),
 )
 ```

@@ -39,7 +39,7 @@ from azure.ai.inference import EmbeddingsClient
 from azure.identity import AzureDefaultCredential
 
 client = EmbeddingsClient(
-    endpoint=os.environ["AZUREAI_ENDPOINT_URL"],
+    endpoint="https://<resource>.services.ai.azure.com/models",
     credential=AzureDefaultCredential(),
 )
 ```

@@ -62,7 +62,7 @@ import { isUnexpected } from "@azure-rest/ai-inference";
 import { AzureKeyCredential } from "@azure/core-auth";
 
 const client = new ModelClient(
-    process.env.AZUREAI_ENDPOINT_URL,
+    "https://<resource>.services.ai.azure.com/models",
     new AzureKeyCredential(process.env.AZUREAI_ENDPOINT_KEY)
 );
 ```

@@ -75,7 +75,7 @@ import { isUnexpected } from "@azure-rest/ai-inference";
 import { AzureDefaultCredential } from "@azure/identity";
 
 const client = new ModelClient(
-    process.env.AZUREAI_ENDPOINT_URL,
+    "https://<resource>.services.ai.azure.com/models",
     new AzureDefaultCredential()
 );
 ```

@@ -108,7 +108,7 @@ Then, you can use the package to consume the model. The following example shows
 
 ```csharp
 EmbeddingsClient client = new EmbeddingsClient(
-    new Uri(Environment.GetEnvironmentVariable("AZURE_INFERENCE_ENDPOINT")),
+    new Uri("https://<resource>.services.ai.azure.com/models"),
     new AzureKeyCredential(Environment.GetEnvironmentVariable("AZURE_INFERENCE_CREDENTIAL"))
 );
 ```

@@ -117,7 +117,7 @@ For endpoint with support for Microsoft Entra ID (formerly Azure Active Director
 
 ```csharp
 EmbeddingsClient client = new EmbeddingsClient(
-    new Uri(Environment.GetEnvironmentVariable("AZURE_INFERENCE_ENDPOINT")),
+    new Uri("https://<resource>.services.ai.azure.com/models"),
     new DefaultAzureCredential(includeInteractiveCredentials: true)
 );
 ```

@@ -131,7 +131,7 @@ Use the reference section to explore the API design and which parameters are ava
 __Request__
 
 ```HTTP/1.1
-POST models/embeddings?api-version=2024-04-01-preview
+POST https://<resource>.services.ai.azure.com/models/embeddings?api-version=2024-05-01-preview
 Authorization: Bearer <bearer-token>
 Content-Type: application/json
 ```

articles/ai-foundry/model-inference/includes/code-create-embeddings.md
Lines changed: 1 addition & 1 deletion

@@ -53,7 +53,7 @@ Console.WriteLine($"Response: {response.Data.Embeddings}");
 __Request__
 
 ```HTTP/1.1
-POST models/embeddings?api-version=2024-04-01-preview
+POST https://<resource>.services.ai.azure.com/models/embeddings?api-version=2024-05-01-preview
 Authorization: Bearer <bearer-token>
 Content-Type: application/json
 ```

articles/ai-foundry/model-inference/includes/code-manage-content-filtering.md
Lines changed: 1 addition & 1 deletion

@@ -122,7 +122,7 @@ try {
 __Request__
 
 ```HTTP/1.1
-POST /chat/completions?api-version=2024-04-01-preview
+POST /chat/completions?api-version=2024-05-01-preview
 Authorization: Bearer <bearer-token>
 Content-Type: application/json
 ```
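When the content filter rejects a request on this route, the service answers with an HTTP 400 and an error payload rather than a completion. The payload shape used below (an `error` object with a `content_filter` code) is an assumption for illustration, not something shown in this commit:

```python
import json

def is_content_filtered(status: int, body: str) -> bool:
    """Heuristic check for a content-filter rejection.
    Assumes the error payload carries {"error": {"code": "content_filter"}}."""
    if status != 400:
        return False
    try:
        error = json.loads(body).get("error", {})
    except ValueError:
        return False  # body was not JSON at all
    return error.get("code") == "content_filter"
```

A client would branch on this check to surface a moderation message instead of retrying the request verbatim.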

articles/ai-foundry/model-inference/includes/how-to-prerequisites.md
Lines changed: 1 addition & 1 deletion

@@ -9,7 +9,7 @@ author: santiagxf
 
 * An Azure subscription. If you're using [GitHub Models](https://docs.github.com/en/github-models/), you can upgrade your experience and create an Azure subscription in the process. Read [Upgrade from GitHub Models to Azure AI model inference](../how-to/quickstart-github-models.md) if that's your case.
 
-* An Azure AI services resource. For more information, see [Create an Azure AI Services resource](../../../ai-services/multi-service-resource.md?context=/azure/ai-services/model-inference/context/context).
+* An Azure AI services resource. For more information, see [Create an Azure AI Services resource](../how-to/quickstart-create-resources.md).
 
 * The endpoint URL and key.
articles/ai-foundry/model-inference/includes/use-chat-completions/rest.md
Lines changed: 6 additions & 7 deletions

@@ -28,18 +28,18 @@ To use chat completion models in your application, you need:
 
 ## Use chat completions
 
-To use the text embeddings, use the route `/chat/completions` along with your credential indicated in `api-key`. `Authorization` header is also supported with the format `Bearer <key>`.
+To use chat completions, use the route `/chat/completions` appended to the base URL, along with your credential indicated in `api-key`. The `Authorization` header is also supported with the format `Bearer <key>`.
 
 ```http
-POST /chat/completions
+POST https://<resource>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
 Content-Type: application/json
 api-key: <key>
 ```
 
 If you have configured the resource with **Microsoft Entra ID** support, pass your token in the `Authorization` header:
 
 ```http
-POST /chat/completions
+POST https://<resource>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
 Content-Type: application/json
 Authorization: Bearer <token>
 ```

@@ -287,8 +287,7 @@ Some models can create JSON outputs. Set `response_format` to `json_object` to e
 The Azure AI Model Inference API allows you to pass extra parameters to the model. The following code example shows how to pass the extra parameter `logprobs` to the model.
 
 ```http
-POST /chat/completions HTTP/1.1
-Host: <ENDPOINT_URI>
+POST https://<resource>.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
 Authorization: Bearer <TOKEN>
 Content-Type: application/json
 extra-parameters: pass-through

@@ -565,7 +564,7 @@ Now, create a chat completion request with the image:
 
 ```json
 {
-    "model": "mistral-large-2407",
+    "model": "phi-3.5-vision-instruct",
     "messages": [
         {
             "role": "user",

@@ -597,7 +596,7 @@ The response is as follows, where you can see the model's usage statistics:
     "id": "0a1234b5de6789f01gh2i345j6789klm",
     "object": "chat.completion",
     "created": 1718726686,
-    "model": "mistral-large-2407",
+    "model": "phi-3.5-vision-instruct",
     "choices": [
         {
             "index": 0,

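The `extra-parameters: pass-through` header shown in this file tells the service to forward payload fields it doesn't recognize (such as `logprobs`) on to the model instead of rejecting them. A local sketch of assembling such a request; the helper name and sample values are illustrative, not from the commit:

```python
import json

def request_with_extras(base_payload: dict, extras: dict) -> tuple[dict, str]:
    """Merge model-specific extra parameters into the payload and set the
    header asking the service to pass unknown fields through to the model."""
    headers = {
        "Content-Type": "application/json",
        "extra-parameters": "pass-through",  # forward unknown fields as-is
    }
    return headers, json.dumps({**base_payload, **extras})

headers, body = request_with_extras(
    {"model": "mistral-large-2407", "messages": []},
    {"logprobs": True},
)
print(headers["extra-parameters"])
```

Without the header, an unknown field in the payload would be handled by the service's default behavior rather than reaching the model.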
articles/ai-foundry/model-inference/includes/use-embeddings/rest.md
Lines changed: 3 additions & 3 deletions

@@ -28,18 +28,18 @@ To use embedding models in your application, you need:
 
 ## Use embeddings
 
-To use the text embeddings, use the route `/embeddings` along with your credential indicated in `api-key`. `Authorization` header is also supported with the format `Bearer <key>`.
+To use the text embeddings, use the route `/embeddings` appended to the base URL, along with your credential indicated in `api-key`. The `Authorization` header is also supported with the format `Bearer <key>`.
 
 ```http
-POST /embeddings
+POST https://<resource>.services.ai.azure.com/models/embeddings?api-version=2024-05-01-preview
 Content-Type: application/json
 api-key: <key>
 ```
 
 If you have configured the resource with **Microsoft Entra ID** support, pass your token in the `Authorization` header:
 
 ```http
-POST /embeddings
+POST https://<resource>.services.ai.azure.com/models/embeddings?api-version=2024-05-01-preview
 Content-Type: application/json
 Authorization: Bearer <token>
 ```
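The embeddings route above returns the vectors in a `data` array, each entry carrying an `index` and an `embedding`. That response shape is assumed from the standard embeddings API, not shown in this commit; a sketch of extracting the vectors in request order:

```python
import json

def extract_embeddings(response_body: str) -> list:
    """Pull the embedding vectors out of an embeddings response,
    ordered by each item's "index" field."""
    data = json.loads(response_body)["data"]
    return [item["embedding"] for item in sorted(data, key=lambda d: d["index"])]

# Hypothetical response with items arriving out of order.
sample = json.dumps({"data": [
    {"index": 1, "embedding": [0.3, 0.4]},
    {"index": 0, "embedding": [0.1, 0.2]},
]})
print(extract_embeddings(sample))  # → [[0.1, 0.2], [0.3, 0.4]]
```

Sorting by `index` matters because the order of items in the array is not guaranteed to match the order of the inputs.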
