Commit 5c731d5

Merge branch 'main' into release-blob-backup

2 parents 64937c2 + 6d1aa20

279 files changed: +5630 −2653 lines changed

.openpublishing.redirection.azure-monitor.json

Lines changed: 35 additions & 0 deletions

```diff
@@ -5564,6 +5564,41 @@
       "redirect_url": "/azure/azure-monitor/essentials/resource-manager-diagnostic-settings#diagnostic-setting-for-activity-log",
       "redirect_document_id": false
     },
+    {
+      "source_path_from_root": "/articles/azure-monitor/app/availability-overview.md",
+      "redirect_url": "/azure/azure-monitor/app/availability",
+      "redirect_document_id": false
+    },
+    {
+      "source_path_from_root": "/articles/azure-monitor/app/availability-standard-tests.md",
+      "redirect_url": "/azure/azure-monitor/app/availability",
+      "redirect_document_id": false
+    },
+    {
+      "source_path_from_root": "/articles/azure-monitor/app/availability-azure-functions.md",
+      "redirect_url": "/azure/azure-monitor/app/availability",
+      "redirect_document_id": false
+    },
+    {
+      "source_path_from_root": "/articles/azure-monitor/app/availability-private-test.md",
+      "redirect_url": "/azure/azure-monitor/app/availability",
+      "redirect_document_id": false
+    },
+    {
+      "source_path_from_root": "/articles/azure-monitor/app/availability-alerts.md",
+      "redirect_url": "/azure/azure-monitor/app/availability",
+      "redirect_document_id": false
+    },
+    {
+      "source_path_from_root": "/articles/azure-monitor/app/availability-test-migration.md",
+      "redirect_url": "/azure/azure-monitor/app/availability",
+      "redirect_document_id": false
+    },
+    {
+      "source_path_from_root": "/articles/azure-monitor/app/sla-report.md",
+      "redirect_url": "/azure/azure-monitor/app/availability",
+      "redirect_document_id": false
+    },
     {
       "source_path_from_root": "/articles/azure-monitor/app/tutorial-alert.md",
       "redirect_url": "/azure/azure-monitor/app/availability-standard-tests",
```

articles/ai-services/openai/how-to/monitoring.md

Lines changed: 3 additions & 1 deletion

```diff
@@ -6,7 +6,7 @@ ms.author: mbullwin
 ms.service: azure-ai-openai
 ms.topic: how-to
 ms.custom: subject-monitoring
-ms.date: 04/16/2024
+ms.date: 07/12/2024
 ---

 # Monitoring Azure OpenAI Service
@@ -56,6 +56,7 @@ The following table summarizes the current subset of metrics available in Azure
 |Metric|Category|Aggregation|Description|Dimensions|
 |---|---|---|---|---|
 |`Azure OpenAI Requests`|HTTP|Count|Total number of calls made to the Azure OpenAI API over a period of time. Applies to PayGo, PTU, and PTU-managed SKUs.| `ApiName`, `ModelDeploymentName`, `ModelName`, `ModelVersion`, `OperationName`, `Region`, `StatusCode`, `StreamType`|
+| `Active Tokens` | Usage | Sum | Total tokens minus cached tokens over a period of time. Applies to PTU and PTU-managed deployments. Use this metric to understand your TPS- or TPM-based utilization for PTUs and compare it to your benchmarks for target TPS or TPM for your scenarios. | `ModelDeploymentName`, `ModelName`, `ModelVersion` |
 | `Generated Completion Tokens` | Usage | Sum | Number of generated tokens (output) from an Azure OpenAI model. Applies to PayGo, PTU, and PTU-managed SKUs. | `ApiName`, `ModelDeploymentName`, `ModelName`, `Region`|
 | `Processed FineTuned Training Hours` | Usage | Sum | Number of training hours processed on an Azure OpenAI fine-tuned model. | `ApiName`, `ModelDeploymentName`, `ModelName`, `Region`|
 | `Processed Inference Tokens` | Usage | Sum | Number of inference tokens processed by an Azure OpenAI model. Calculated as prompt tokens (input) + generated tokens. Applies to PayGo, PTU, and PTU-managed SKUs. |`ApiName`, `ModelDeploymentName`, `ModelName`, `Region`|
@@ -64,6 +65,7 @@ The following table summarizes the current subset of metrics available in Azure
 |`Prompt Token Cache Match Rate` | HTTP | Average | **Provisioned-managed only**. The prompt token cache hit ratio expressed as a percentage. | `ModelDeploymentName`, `ModelVersion`, `ModelName`, `Region`|
 |`Time to Response` | HTTP | Average | Recommended latency (responsiveness) measure for streaming requests. **Applies to PTU and PTU-managed deployments.** This metric doesn't apply to standard pay-as-you-go deployments. Calculated as the time taken for the first response to appear after a user sends a prompt, as measured by the API gateway. This number increases as prompt size increases and/or cache hit size decreases. Note: this metric is an approximation, because measured latency depends heavily on multiple factors, including concurrent calls and overall workload pattern. In addition, it doesn't account for any client-side latency between your client and the API endpoint. Refer to your own logging for optimal latency tracking. | `ModelDeploymentName`, `ModelName`, `ModelVersion` |

+
 ## Configure diagnostic settings

 All of the metrics are exportable with [diagnostic settings in Azure Monitor](/azure/azure-monitor/essentials/diagnostic-settings). To analyze logs and metrics data with Azure Monitor Log Analytics queries, you need to configure diagnostic settings for your Azure OpenAI resource and your Log Analytics workspace.
```
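The new `Active Tokens` metric added in this diff is defined as total tokens minus cached tokens. A toy illustration of how you might reason about PTU utilization from it; the numbers and helper functions are hypothetical, not part of Azure Monitor:

```python
def active_tokens(total_tokens: int, cached_tokens: int) -> int:
    # Active Tokens, as defined for PTU/PTU-managed deployments:
    # total tokens minus tokens served from the prompt cache.
    return total_tokens - cached_tokens

def tokens_per_minute(total_tokens: int, cached_tokens: int, window_minutes: float) -> float:
    # TPM-based utilization derived from active tokens over an observation window.
    return active_tokens(total_tokens, cached_tokens) / window_minutes

# Hypothetical 5-minute window for one deployment:
tpm = tokens_per_minute(total_tokens=60_000, cached_tokens=15_000, window_minutes=5)
print(tpm)  # 9000.0
```

Comparing a value like this against your benchmarked target TPM is the comparison the metric description above suggests.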

articles/ai-studio/how-to/model-catalog-overview.md

Lines changed: 2 additions & 1 deletion

```diff
@@ -149,7 +149,8 @@ Phi-3-mini-128k-instruct <br> Phi-3-medium-4k-instruct <br> Phi-3-medium-128k-in

 [!INCLUDE [Feature preview](~/reusable-content/ce-skilling/azure/includes/ai-studio/includes/feature-preview.md)]

-Azure AI Studio implements a default configuration of [Azure AI Content Safety](../../ai-services/content-safety/overview.md) text moderation filters for harmful content (hate, self-harm, sexual, and violence) in language models deployed with MaaS. To learn more about content filtering (preview), see [harm categories in Azure AI Content Safety](../../ai-services/content-safety/concepts/harm-categories.md). Content filtering (preview) occurs synchronously as the service processes prompts to generate content, and you may be billed separately as per [AACS pricing](https://azure.microsoft.com/pricing/details/cognitive-services/content-safety/) for such use. You can disable content filtering for individual serverless endpoints when you first deploy a language model or in the deployment details page by clicking the content filtering toggle. You may be at higher risk of exposing users to harmful content if you turn off content filters.
+[!INCLUDE [content-safety-serverless-models](../includes/content-safety-serverless-models.md)]

 ### Network isolation for models deployed via Serverless APIs
```

articles/ai-studio/includes/content-safety-serverless-models.md

Lines changed: 24 additions & 0 deletions

```diff
@@ -0,0 +1,24 @@
+---
+title: include file
+description: include file
+ms.service: azure-ai-studio
+ms.topic: include
+ms.date: 07/12/2024
+ms.author: mopeakande
+author: msakande
+ms.reviewer: osiotugo
+reviewer: ositanachi
+ms.custom: include file
+
+# Also used in Azure Machine Learning documentation
+---
+
+For language models deployed via serverless APIs, Azure AI implements a default configuration of [Azure AI Content Safety](/azure/ai-services/content-safety/overview) text moderation filters that detect harmful content such as hate, self-harm, sexual, and violent content. To learn more about content filtering (preview), see [harm categories in Azure AI Content Safety](/azure/ai-services/content-safety/concepts/harm-categories).
+
+> [!TIP]
+> Content filtering (preview) isn't available for certain model types that are deployed via serverless APIs. These model types include embedding models and time series models.
+
+Content filtering (preview) occurs synchronously as the service processes prompts to generate content, and you might be billed separately as per [AACS pricing](https://azure.microsoft.com/pricing/details/cognitive-services/content-safety/) for such use. You can disable content filtering (preview) for individual serverless endpoints either when you first deploy a language model or later on the deployment details page by selecting the content filtering toggle.
+
+Suppose you decide to use an API other than the [Azure AI Model Inference API](/azure/ai-studio/reference/reference-model-inference-api) to work with a model that's deployed via a serverless API. In such a situation, content filtering (preview) isn't enabled unless you implement it separately by using Azure AI Content Safety. To learn more about getting started with Azure AI Content Safety, see [Quickstart: Analyze text content](/azure/ai-services/content-safety/quickstart-text). If you don't use content filtering (preview) when working with models that are deployed via serverless APIs, you run a higher risk of exposing users to harmful content.
```

articles/ai-studio/reference/reference-model-inference-api.md

Lines changed: 39 additions & 4 deletions

````diff
@@ -103,6 +103,19 @@ model = ChatCompletionsClient(
 )
 ```

+If you're using an endpoint with Microsoft Entra ID support, you can create your client as follows:
+
+```python
+import os
+from azure.ai.inference import ChatCompletionsClient
+from azure.identity import DefaultAzureCredential
+
+model = ChatCompletionsClient(
+    endpoint=os.environ["AZUREAI_ENDPOINT_URL"],
+    credential=DefaultAzureCredential(),
+)
+```
+
 # [JavaScript](#tab/javascript)

 Install the package `@azure-rest/ai-inference` using npm:
@@ -124,6 +137,19 @@ const client = new ModelClient(
 );
 ```

+For endpoints with Microsoft Entra ID support, you can create your client as follows:
+
+```javascript
+import ModelClient from "@azure-rest/ai-inference";
+import { isUnexpected } from "@azure-rest/ai-inference";
+import { DefaultAzureCredential } from "@azure/identity";
+
+const client = new ModelClient(
+    process.env.AZUREAI_ENDPOINT_URL,
+    new DefaultAzureCredential()
+);
+```
+
 # [REST](#tab/rest)

 Use the reference section to explore the API design and which parameters are available. For example, the reference section for [Chat completions](reference-model-inference-chat-completions.md) details how to use the route `/chat/completions` to generate predictions based on chat-formatted instructions:
@@ -143,11 +169,13 @@ The Azure AI Model Inference API specifies a set of modalities and parameters th

 By setting a header `extra-parameters: pass-through`, the API will attempt to pass any unknown parameter directly to the underlying model. If the model can handle that parameter, the request completes.

-The following example shows a request passing the parameter `safe_prompt` supported by Mistral-Large, which isn't specified in the Azure AI Model Inference API:
+The following example shows a request passing the parameter `safe_prompt` supported by Mistral-Large, which isn't specified in the Azure AI Model Inference API.

 # [Python](#tab/python)

 ```python
+from azure.ai.inference.models import SystemMessage, UserMessage
+
 response = model.complete(
     messages=[
         SystemMessage(content="You are a helpful assistant."),
@@ -157,8 +185,13 @@ response = model.complete(
         "safe_mode": True
     }
 )
+
+print(response.choices[0].message.content)
 ```

+> [!TIP]
+> When you use the Azure AI Inference SDK, passing extra parameters with `model_extras` configures the request with `extra-parameters: pass-through` automatically for you.
+
 # [JavaScript](#tab/javascript)

 ```javascript
@@ -174,6 +207,8 @@ var response = await client.path("/chat/completions").post({
         safe_mode: true
     }
 });
+
+console.log(response.body.choices[0].message.content);
 ```

 # [REST](#tab/rest)
@@ -208,8 +243,8 @@ extra-parameters: pass-through

 ---

-> [!TIP]
-> The default value for `extra-parameters` is `error` which returns an error if an extra parameter is indicated in the payload. Alternatively, you can set `extra-parameters: ignore` to drop any unknown parameter in the request. Use this capability in case you happen to be sending requests with extra parameters that you know the model won't support but you want the request to completes anyway. A typical example of this is indicating `seed` parameter.
+> [!NOTE]
+> The default value for `extra-parameters` is `error`, which returns an error if an extra parameter is indicated in the payload. Alternatively, you can set `extra-parameters: drop` to drop any unknown parameter in the request. Use this capability when you send requests with extra parameters that you know the model won't support but you want the request to complete anyway. A typical example is indicating the `seed` parameter.

 ### Models with disparate set of capabilities

@@ -220,7 +255,7 @@ The following example shows the response for a chat completion request indicatin

 # [Python](#tab/python)

 ```python
-from azure.ai.inference.models import ChatCompletionsResponseFormat
+from azure.ai.inference.models import SystemMessage, UserMessage, ChatCompletionsResponseFormat
 from azure.core.exceptions import HttpResponseError
 import json
```
````
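The `extra-parameters` header behavior described in this file's diff can be sketched locally. The endpoint URL and payload below are hypothetical and the helper is illustrative, not part of any SDK; it only shows which header value produces which service behavior:

```python
import json

def build_chat_request(endpoint: str, payload: dict, extra_parameters: str = "pass-through"):
    """Assemble a chat-completions REST call with the extra-parameters header.

    'pass-through' forwards unknown payload keys to the underlying model,
    'error' (the service default) rejects them, and 'drop' discards them.
    """
    url = endpoint.rstrip("/") + "/chat/completions"
    headers = {
        "Content-Type": "application/json",
        "extra-parameters": extra_parameters,
    }
    return url, headers, json.dumps(payload)

url, headers, body = build_chat_request(
    "https://example-endpoint.inference.ai.azure.com",  # hypothetical endpoint
    {
        "messages": [{"role": "user", "content": "How many languages are in the world?"}],
        "safe_prompt": True,  # Mistral-specific parameter, unknown to the common API
    },
)
print(headers["extra-parameters"])  # pass-through
```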

articles/aks/TOC.yml

Lines changed: 12 additions & 2 deletions

```diff
@@ -337,7 +337,17 @@
       - name: Use the Azure portal
         href: virtual-nodes-portal.md
   - name: Workloads
-    items:
+    items:
+      - name: Stateful workloads
+        items:
+          - name: Deploy a highly available PostgreSQL database
+            items:
+              - name: Overview
+                href: postgresql-ha-overview.md
+              - name: Create infrastructure resources
+                href: create-postgresql-ha.md
+              - name: Deploy and test PostgreSQL
+                href: deploy-postgresql-ha.md
   - name: GPU workloads
     items:
       - name: Use GPUs
@@ -973,4 +983,4 @@
   - name: Support options for AKS
     href: aks-support-help.md
   - name: Troubleshooting documentation for AKS
-    href: /troubleshoot/azure/azure-kubernetes/welcome-azure-kubernetes
+    href: /troubleshoot/azure/azure-kubernetes/welcome-azure-kubernetes
```
