Skip to content

Commit 82db19e

Browse files
authored
Merge pull request #263101 from MicrosoftDocs/main
1/12 11:00 AM IST Publish
2 parents 2c2e523 + 5454f38 commit 82db19e

File tree

65 files changed

+1047
-536
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

65 files changed

+1047
-536
lines changed

articles/ai-services/content-safety/includes/severity-levels.md

Lines changed: 18 additions & 18 deletions
Large diffs are not rendered by default.

articles/ai-services/openai/concepts/content-filter.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -796,6 +796,10 @@ data: {"id":"","object":"","created":0,"model":"","choices":[{"index":0,"finish_
796796
797797
data: [DONE]
798798
```
799+
800+
> [!IMPORTANT]
801+
> When content filtering is triggered for a prompt and a `"status": 400` is received as part of the response there may be a charge for this request as the prompt was evaluated by the service. [Charges will also occur](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) when a `"status":200` is received with `"finish_reason": "content_filter"`. In this case the prompt did not have any issues, but the completion generated by the model was detected to violate the content filtering rules which results in the completion being filtered.
802+
799803
## Best practices
800804
801805
As part of your application design, consider the following best practices to deliver a positive experience with your application while minimizing potential harms:

articles/ai-services/openai/concepts/models.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -101,7 +101,7 @@ See [model versions](../concepts/model-versions.md) to learn about how Azure Ope
101101
**<sup>2</sup>** GPT-4 Turbo with Vision Preview = `gpt-4` (vision-preview). To deploy this model, under **Deployments** select model **gpt-4**. For **Model version** select **vision-preview**.
102102

103103
> [!CAUTION]
104-
> We don't recommend using these models in production. We will upgrade all deployments of these models to a future stable version. Models designated preview do not follow the standard Azure OpenAI model lifecycle.
104+
> We don't recommend using preview models in production. We will upgrade all deployments of preview models to a future stable version. Models designated preview do not follow the standard Azure OpenAI model lifecycle.
105105
106106
> [!NOTE]
107107
> Regions where GPT-4 (0314) & (0613) are listed as available have access to both the 8K and 32K versions of the model
@@ -110,8 +110,8 @@ See [model versions](../concepts/model-versions.md) to learn about how Azure Ope
110110

111111
| Model Availability | gpt-4 (0314) | gpt-4 (0613) | gpt-4 (1106-preview) | gpt-4 (vision-preview) |
112112
|---|:---|:---|:---|:---|
113-
| Available to all subscriptions with Azure OpenAI access | | Australia East <br> Canada East <br> France Central <br> Sweden Central <br> Switzerland North | Australia East <br> Canada East <br> East US 2 <br> France Central <br> Norway East <br> South India <br> Sweden Central <br> UK South <br> West US | Switzerland North <br> West US |
114-
| Available to subscriptions with current access to the model version in the region | East US <br> France Central <br> South Central US <br> UK South | East US <br> East US 2 <br> Japan East <br> UK South | | Australia East <br>Sweden Central|
113+
| Available to all subscriptions with Azure OpenAI access | | Australia East <br> Canada East <br> France Central <br> Sweden Central <br> Switzerland North | Australia East <br> Canada East <br> East US 2 <br> France Central <br> Norway East <br> South India <br> Sweden Central <br> UK South <br> West US | Sweden Central <br> Switzerland North <br> West US |
114+
| Available to subscriptions with current access to the model version in the region | East US <br> France Central <br> South Central US <br> UK South | East US <br> East US 2 <br> Japan East <br> UK South | | Australia East |
115115

116116
### GPT-3.5 models
117117

articles/ai-services/openai/includes/dall-e-python.md

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -64,8 +64,6 @@ Open a command prompt and browse to your project folder. Create a new python fil
6464

6565
## Install the Python SDK
6666

67-
> [!IMPORTANT]
68-
> The latest release of the [OpenAI Python library](https://pypi.org/project/openai/) does not currently support DALL-E when used with Azure OpenAI. To access DALL-E with Azure OpenAI use version `0.28.1`.
6967

7068
Install the OpenAI Python SDK by using the following command:
7169

@@ -77,6 +75,9 @@ pip install openai
7775

7876
#### [DALL-E 2](#tab/dalle2)
7977

78+
> [!IMPORTANT]
79+
> The latest release of the [OpenAI Python library](https://pypi.org/project/openai/) does not currently support DALL-E 2 when used with Azure OpenAI. To access DALL-E 2 with Azure OpenAI use version `0.28.1`. Or, follow the [migration guide](/azure/ai-services/openai/how-to/migration?tabs=python%2Cdalle-fix) to use DALL-E 2 with OpenAI 1.x.
80+
8081
```bash
8182
pip install openai==0.28.1
8283
```

articles/aks/TOC.yml

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -346,7 +346,11 @@
346346
- name: Proximity placement groups
347347
href: reduce-latency-ppg.md
348348
- name: Cluster Autoscaler
349-
href: cluster-autoscaler.md
349+
items:
350+
- name: Cluster Autoscaler overview
351+
href: cluster-autoscaler-overview.md
352+
- name: Use the Cluster Autoscaler on AKS
353+
href: cluster-autoscaler.md
350354
- name: Node autoprovision
351355
href: node-autoprovision.md
352356
- name: Availability Zones

articles/aks/availability-zones.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -68,6 +68,9 @@ AKS clusters deployed using availability zones can distribute nodes across multi
6868

6969
If a single zone becomes unavailable, your applications continue to run on clusters configured to spread across multiple zones.
7070

71+
> [!NOTE]
72+
> When implementing **availability zones with the [cluster autoscaler](./cluster-autoscaler-overview.md)**, we recommend using a single node pool for each zone. You can set the `--balance-similar-node-groups` parameter to `True` to maintain a balanced distribution of nodes across zones for your workloads during scale up operations. When this approach isn't implemented, scale down operations can disrupt the balance of nodes across zones.
73+
7174
## Create an AKS cluster across availability zones
7275

7376
When you create a cluster using the [az aks create][az-aks-create] command, the `--zones` parameter specifies the availability zones to deploy agent nodes into. The availability zones that the managed control plane components are deployed into are **not** controlled by this parameter. They are automatically spread across all availability zones (if present) in the region during cluster deployment.

articles/aks/best-practices-performance-scale-large.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -76,7 +76,7 @@ Keeping the above considerations in mind, customers are typically able to deploy
7676
Always upgrade your Kubernetes clusters to the latest version. Newer versions contain many improvements that address performance and throttling issues. If you're using an upgraded version of Kubernetes and still see throttling due to the actual load or the number of clients in the subscription, you can try the following options:
7777

7878
* **Analyze errors using AKS Diagnose and Solve Problems**: You can use [AKS Diagnose and Solve Problems](./aks-diagnostics.md) to analyze errors, identity the root cause, and get resolution recommendations.
79-
* **Increase the Cluster Autoscaler scan interval**: If the diagnostic reports show that [Cluster Autoscaler throttling has been detected](/troubleshoot/azure/azure-kubernetes/429-too-many-requests-errors#analyze-and-identify-errors-by-using-aks-diagnose-and-solve-problems), you can [increase the scan interval](./cluster-autoscaler.md#change-the-cluster-autoscaler-settings) to reduce the number of calls to Virtual Machine Scale Sets from the Cluster Autoscaler.
79+
* **Increase the Cluster Autoscaler scan interval**: If the diagnostic reports show that [Cluster Autoscaler throttling has been detected](/troubleshoot/azure/azure-kubernetes/429-too-many-requests-errors#analyze-and-identify-errors-by-using-aks-diagnose-and-solve-problems), you can [increase the scan interval](./cluster-autoscaler.md#update-the-cluster-autoscaler-settings) to reduce the number of calls to Virtual Machine Scale Sets from the Cluster Autoscaler.
8080
* **Reconfigure third-party applications to make fewer calls**: If you filter by *user agents* in the ***View request rate and throttle details*** diagnostic and see that [a third-party application, such as a monitoring application, makes a large number of GET requests](/troubleshoot/azure/azure-kubernetes/429-too-many-requests-errors#analyze-and-identify-errors-by-using-aks-diagnose-and-solve-problems), you can change the settings of these applications to reduce the frequency of the GET calls. Make sure the application clients use exponential backoff when calling Azure APIs.
8181
* **Split your clusters into different subscriptions or regions**: If you have a large number of clusters and node pools that use Virtual Machine Scale Sets, you can split them into different subscriptions or regions within the same subscription. Most Azure API limits are shared at the subscription-region level, so you can move or scale your clusters to different subscriptions or regions to get unblocked on Azure API throttling. This option is especially helpful if you expect your clusters to have high activity. There are no generic guidelines for these limits. If you want specific guidance, you can create a support ticket.
8282

articles/aks/best-practices-performance-scale.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -57,7 +57,7 @@ Implementing [vertical pod autoscaling](./vertical-pod-autoscaler.md) is useful
5757

5858
Implementing cluster autoscaling is useful if your existing nodes lack sufficient capacity, as it helps with scaling up and provisioning new nodes.
5959

60-
When considering cluster autoscaling, the decision of when to remove a node involves a tradeoff between optimizing resource utilization and ensuring resource availability. Eliminating underutilized nodes enhances cluster utilization but might result in new workloads having to wait for resources to be provisioned before they can be deployed. It's important to find a balance between these two factors that aligns with your cluster and workload requirements and [configure the cluster autoscaler profile settings accordingly](./cluster-autoscaler.md#change-the-cluster-autoscaler-settings).
60+
When considering cluster autoscaling, the decision of when to remove a node involves a tradeoff between optimizing resource utilization and ensuring resource availability. Eliminating underutilized nodes enhances cluster utilization but might result in new workloads having to wait for resources to be provisioned before they can be deployed. It's important to find a balance between these two factors that aligns with your cluster and workload requirements and [configure the cluster autoscaler profile settings accordingly](./cluster-autoscaler.md#update-the-cluster-autoscaler-settings).
6161

6262
The Cluster Autoscaler profile settings apply universally to all autoscaler-enabled node pools in your cluster. This means that any scaling actions occurring in one autoscaler-enabled node pool might impact the autoscaling behavior in another node pool. It's important to apply consistent and synchronized profile settings across all relevant node pools to ensure that the autoscaler behaves as expected.
6363

@@ -234,7 +234,7 @@ The following table provides a breakdown of suggested use cases for OS disks sup
234234

235235
#### IOPS and throughput
236236

237-
Input/output operations per second (IOPS) refers to the number of read and write operations that a disk can perform in a second. Throughout refers to the amount of data that can be transferred in a given time period.
237+
Input/output operations per second (IOPS) refers to the number of read and write operations that a disk can perform in a second. Throughput refers to the amount of data that can be transferred in a given time period.
238238

239239
OS disks are responsible for storing the operating system and its associated files, and the VMs are responsible for running the applications. When selecting a VM, ensure the size and performance of the OS disk and VM SKU don't have a large discrepancy. A discrepancy in size or performance can cause performance issues and resource contention. For example, if the OS disk is significantly smaller than the VMs, it can limit the amount of space available for application data and cause the system to run out of disk space. If the OS disk has lower performance than the VMs, it can become a bottleneck and limit the overall performance of the system. Make sure the size and performance are balanced to ensure optimal performance in Kubernetes.
240240

0 commit comments

Comments
 (0)