Commit fc61e6e

Merge branch 'MicrosoftDocs:main' into cosmos-update-diagnostics-settings-cli-script
2 parents 5db5a88 + 9df751c commit fc61e6e

127 files changed: +908 additions, −414 deletions

articles/advisor/advisor-reference-performance-recommendations.md

Lines changed: 38 additions & 118 deletions
Large diffs are not rendered by default.

articles/ai-services/openai/how-to/dynamic-quota.md

Lines changed: 2 additions & 2 deletions
@@ -7,7 +7,7 @@ author: mrbullwinkle
 manager: nitinme
 ms.service: azure-ai-openai
 ms.topic: how-to
-ms.date: 01/30/2024
+ms.date: 06/27/2024
 ms.author: mbullwin
 ---

@@ -34,7 +34,7 @@ For dynamic quota, consider scenarios such as:

 ### When does dynamic quota come into effect?

-The Azure OpenAI backend decides if, when, and how much extra dynamic quota is added or removed from different deployments. It isn't forecasted or announced in advance, and isn't predictable. Azure OpenAI lets your application know there's more quota available by responding with an HTTP 429 and not letting more API calls through. To take advantage of dynamic quota, your application code must be able to issue more requests as HTTP 429 responses become infrequent.
+The Azure OpenAI backend decides if, when, and how much extra dynamic quota is added or removed from different deployments. It isn't forecasted or announced in advance, and isn't predictable. To take advantage of dynamic quota, your application code must be able to issue more requests as HTTP 429 responses become infrequent. Azure OpenAI lets your application know when you've hit your quota limit by responding with an HTTP 429 and not letting more API calls through.

 ### How does dynamic quota change costs?
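The reworded paragraph in this hunk describes the client-side contract: keep issuing requests, back off on HTTP 429, and ramp back up as 429 responses become infrequent. As an illustration only (not part of the commit), a generic retry-with-exponential-backoff wrapper might look like the sketch below; the `RateLimitError` class and the delay constants are hypothetical stand-ins for whatever your SDK raises on a 429:

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error an SDK raises on HTTP 429."""


def call_with_backoff(send_request, max_retries=5, base_delay=1.0):
    """Call send_request, retrying with exponential backoff on HTTP 429.

    Delays grow 1x, 2x, 4x, ... of base_delay, plus a little jitter so
    many clients don't retry in lockstep.
    """
    for attempt in range(max_retries):
        try:
            return send_request()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # Out of retries; surface the 429 to the caller.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

In practice `send_request` would be a closure around your real API call; the retry budget and base delay should be tuned to your workload.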

articles/ai-services/openai/how-to/migration.md

Lines changed: 5 additions & 5 deletions
@@ -74,7 +74,7 @@ client = AzureOpenAI(
 )

 response = client.chat.completions.create(
-    model="gpt-35-turbo", # model = "deployment_name".
+    model="gpt-35-turbo", # model = "deployment_name"
     messages=[
         {"role": "system", "content": "You are a helpful assistant."},
         {"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},

@@ -135,7 +135,7 @@ deployment_name='REPLACE_WITH_YOUR_DEPLOYMENT_NAME' #This will correspond to the
 # Send a completion call to generate an answer
 print('Sending a test completion job')
 start_phrase = 'Write a tagline for an ice cream shop. '
-response = client.completions.create(model=deployment_name, prompt=start_phrase, max_tokens=10)
+response = client.completions.create(model=deployment_name, prompt=start_phrase, max_tokens=10) # model = "deployment_name"
 print(response.choices[0].text)
 ```

@@ -221,7 +221,7 @@ async def main():
         api_version = "2024-02-01",
         azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
     )
-    response = await client.chat.completions.create(model="gpt-35-turbo", messages=[{"role": "user", "content": "Hello world"}])
+    response = await client.chat.completions.create(model="gpt-35-turbo", messages=[{"role": "user", "content": "Hello world"}]) # model = model deployment name

     print(response.model_dump_json(indent=2))

@@ -246,7 +246,7 @@ client = AzureOpenAI(
 )

 completion = client.chat.completions.create(
-    model="deployment-name", # gpt-35-instant
+    model="deployment-name", # model = "deployment_name"
     messages=[
         {
             "role": "user",

@@ -281,7 +281,7 @@ client = openai.AzureOpenAI(
 )

 completion = client.chat.completions.create(
-    model=deployment,
+    model=deployment, # model = "deployment_name"
     messages=[
         {
             "role": "user",
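Every comment this commit touches in the migration article repeats the same point: with the 1.x client against Azure OpenAI, the `model` parameter takes your *deployment* name, not the underlying model name. A tiny lookup helper (an illustration only; the deployment names below are invented placeholders) can make that mapping explicit in application code:

```python
# Hypothetical mapping from OpenAI model names to Azure deployment names.
# The deployment names are placeholders; substitute your resource's own.
DEPLOYMENTS = {
    "gpt-35-turbo": "my-gpt35-deployment",
    "gpt-4": "my-gpt4-deployment",
}


def resolve_deployment(model_name: str) -> str:
    """Return the Azure deployment name to pass as `model=` in the 1.x client."""
    try:
        return DEPLOYMENTS[model_name]
    except KeyError:
        raise ValueError(f"no Azure deployment configured for {model_name!r}")
```

The result would then be passed as, for example, `model=resolve_deployment("gpt-4")` in `client.chat.completions.create(...)`.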

articles/ai-services/openai/how-to/provisioned-throughput-onboarding.md

Lines changed: 7 additions & 5 deletions
@@ -3,7 +3,7 @@ title: Azure OpenAI Service Provisioned Throughput Units (PTU) onboarding
 description: Learn about provisioned throughput units onboarding and Azure OpenAI.
 ms.service: azure-ai-openai
 ms.topic: conceptual
-ms.date: 05/02/2024
+ms.date: 06/25/2024
 manager: nitinme
 author: mrbullwinkle
 ms.author: mbullwin

@@ -44,11 +44,13 @@ The **Provisioned** option and the capacity planner are only available in certai
 |---|---|
 |Model | OpenAI model you plan to use. For example: GPT-4 |
 | Version | Version of the model you plan to use, for example 0614 |
-| Prompt tokens | Number of tokens in the prompt for each call |
-| Generation tokens | Number of tokens generated by the model on each call |
-| Peak calls per minute | Peak concurrent load to the endpoint measured in calls per minute|
+| Peak calls per min | The number of calls per minute that are expected to be sent to the model |
+| Tokens in prompt call | The number of tokens in the prompt for each call to the model. Calls with larger prompts utilize more of the PTU deployment. Currently this calculator assumes a single prompt value, so for workloads with wide variance, we recommend benchmarking your deployment on your traffic to determine the most accurate estimate of PTU needed for your deployment. |
+| Tokens in model response | The number of tokens generated from each call to the model. Calls with larger generation sizes utilize more of the PTU deployment. Currently this calculator assumes a single response value, so for workloads with wide variance, we recommend benchmarking your deployment on your traffic to determine the most accurate estimate of PTU needed for your deployment. |

-After you fill in the required details, select **Calculate** to view the suggested PTU for your scenario.
+After you fill in the required details, select the **Calculate** button in the output column.
+
+The values in the output column are the estimated PTU required for the provided workload inputs. The first output value represents the estimated PTU required for the workload, rounded to the nearest PTU scale increment. The second output value represents the raw estimated PTU required for the workload. The token totals are calculated using the following equation: `Total = Peak calls per minute * (Tokens in prompt call + Tokens in model response)`.

 :::image type="content" source="../media/how-to/provisioned-onboarding/capacity-calculator.png" alt-text="Screenshot of the Azure OpenAI Studio landing page." lightbox="../media/how-to/provisioned-onboarding/capacity-calculator.png":::
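The token equation added by this commit can be sketched in code. Only the equation itself comes from the changed text; the throughput factor (`tokens_per_ptu_per_min`), the minimum deployment size, and the 50-PTU scale increment below are illustrative assumptions, not official calculator values:

```python
import math

PTU_MIN = 100        # assumed minimum deployment size (illustrative)
PTU_INCREMENT = 50   # assumed scale increment (illustrative)


def estimate_ptu(peak_calls_per_min: int,
                 tokens_in_prompt_call: int,
                 tokens_in_model_response: int,
                 tokens_per_ptu_per_min: float = 300.0):
    """Return (rounded, raw) PTU estimates for a workload.

    Token totals follow the calculator's equation:
    Total = Peak calls per minute * (Tokens in prompt call + Tokens in model response)
    """
    total_tokens_per_min = peak_calls_per_min * (
        tokens_in_prompt_call + tokens_in_model_response
    )
    raw_ptu = total_tokens_per_min / tokens_per_ptu_per_min
    # Round up to the nearest increment, but never below the minimum size.
    rounded_ptu = max(PTU_MIN, math.ceil(raw_ptu / PTU_INCREMENT) * PTU_INCREMENT)
    return rounded_ptu, raw_ptu
```

Like the calculator itself, this assumes a single prompt/response size; for workloads with wide variance, benchmark against real traffic instead.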


articles/api-management/v2-service-tiers-overview.md

Lines changed: 5 additions & 0 deletions
@@ -53,7 +53,12 @@ The v2 tiers are available in the following regions:
 * France Central
 * Germany West Central
 * North Europe
+* West Europe
+* UK South
+* UK West
 * Central India
+* Brazil South
+* Australia Central
 * Australia East
 * Australia Southeast
 * East Asia

articles/application-gateway/application-gateway-faq.yml

Lines changed: 4 additions & 1 deletion
@@ -6,7 +6,7 @@ metadata:
   author: greg-lindsay
   ms.service: application-gateway
   ms.topic: faq
-  ms.date: 03/15/2024
+  ms.date: 06/27/2024
   ms.author: greglin
   ms.custom: references_regions, devx-track-azurepowershell
 title: Frequently asked questions about Application Gateway

@@ -557,6 +557,9 @@ sections:
       - question: Which ports are supported for TLS/TCP listeners?
        answer: The same list of [allowed port range and exceptions](application-gateway-components.md#ports) apply for the Layer 4 proxy too.

+      - question: How can I use the same port number for Public and Private TLS/TCP proxy listeners?
+        answer: The use of a common port for TLS/TCP listeners is currently not supported.
+
   - name: Configuration - ingress controller for AKS
     questions:
       - question: What is an ingress controller?

articles/application-gateway/configuration-frontend-ip.md

Lines changed: 4 additions & 1 deletion
@@ -5,7 +5,7 @@ services: application-gateway
 author: greg-lindsay
 ms.service: application-gateway
 ms.topic: conceptual
-ms.date: 09/14/2023
+ms.date: 06/27/2024
 ms.author: greglin
 ---

@@ -38,6 +38,9 @@ A frontend IP address is associated to a *listener*, which checks for incoming r

 You can create private and public listeners with the same port number. However, be aware of any network security group (NSG) associated with the Application Gateway subnet. Depending on your NSG's configuration, you might need an allow-inbound rule with **Destination IP addresses** as your application gateway's public and private frontend IPs. When you use the same port, your application gateway changes the **Destination** of the inbound flow to the frontend IPs of your gateway.

+> [!NOTE]
+> Currently, the use of the same port number for public and private TCP/TLS protocol or IPv6 listeners is not supported.
+
 **Inbound rule**:

 - **Source**: According to your requirement
