Commit d14b700

Merge branch 'main' of https://github.com/MicrosoftDocs/azure-docs-pr into rolyon-rbac-classic-admins-faq-update

2 parents c23770d + f567f28
78 files changed: +623 −2016 lines


.openpublishing.redirection.azure-monitor.json

Lines changed: 41 additions & 1 deletion
@@ -4131,7 +4131,7 @@
     },
     {
       "source_path_from_root": "/articles/azure-monitor/platform/alerts-using-migration-tool.md",
-      "redirect_url": "/azure/azure-monitor/alerts/alerts-using-migration-tool",
+      "redirect_url": "/previous-versions/azure/azure-monitor/alerts/alerts-using-migration-tool",
       "redirect_document_id": false
     },
     {
@@ -6259,6 +6259,46 @@
       "redirect_url": "/previous-versions/azure/azure-monitor/autoscale/tutorial-autoscale-performance-schedule",
       "redirect_document_id": false
     },
+    {
+      "source_path_from_root": "/articles/azure-monitor/alerts/alerts-automatic-migration.md",
+      "redirect_url": "/previous-versions/azure/azure-monitor/alerts/alerts-automatic-migration",
+      "redirect_document_id": false
+    },
+    {
+      "source_path_from_root": "/articles/azure-monitor/alerts/alerts-classic.overview.md",
+      "redirect_url": "/previous-versions/azure/azure-monitor/alerts/alerts-classic.overview",
+      "redirect_document_id": false
+    },
+    {
+      "source_path_from_root": "/articles/azure-monitor/alerts/alerts-classic-portal.md",
+      "redirect_url": "/previous-versions/azure/azure-monitor/alerts/alerts-classic-portal",
+      "redirect_document_id": false
+    },
+    {
+      "source_path_from_root": "/articles/azure-monitor/alerts/alerts-enable-template.md",
+      "redirect_url": "/previous-versions/azure/azure-monitor/alerts/alerts-enable-template",
+      "redirect_document_id": false
+    },
+    {
+      "source_path_from_root": "/articles/azure-monitor/alerts/alerts-prepare-migration.md",
+      "redirect_url": "/previous-versions/azure/azure-monitor/alerts/alerts-prepare-migration",
+      "redirect_document_id": false
+    },
+    {
+      "source_path_from_root": "/articles/azure-monitor/alerts/alerts-understand-migration.md",
+      "redirect_url": "/previous-versions/azure/azure-monitor/alerts/alerts-understand-migration",
+      "redirect_document_id": false
+    },
+    {
+      "source_path_from_root": "/articles/azure-monitor/alerts/alerts-webhooks.md",
+      "redirect_url": "/previous-versions/azure/azure-monitor/alerts/alerts-webhooks",
+      "redirect_document_id": false
+    },
+    {
+      "source_path_from_root": "/articles/azure-monitor/alerts/api-alerts.md",
+      "redirect_url": "/previous-versions/azure/azure-monitor/alerts/api-alerts",
+      "redirect_document_id": false
+    },
     {
       "source_path_from_root": "/articles/azure-monitor/essentials/metrics-supported.md",
       "redirect_url": "/azure/azure-monitor/reference/supported-metrics/metrics-index",
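All of the new entries above follow the same pattern: a retired alerts article redirected under `/previous-versions/` with `redirect_document_id` set to false. That invariant can be sanity-checked with a small sketch (a hypothetical helper, not part of the docs build; the sample entry mirrors one of the additions above):

```python
def is_previous_versions_redirect(entry):
    """True when a redirect entry sends a retired article to /previous-versions/."""
    return (
        entry["redirect_url"].startswith("/previous-versions/")
        and not entry["redirect_document_id"]
    )

entry = {
    "source_path_from_root": "/articles/azure-monitor/alerts/alerts-webhooks.md",
    "redirect_url": "/previous-versions/azure/azure-monitor/alerts/alerts-webhooks",
    "redirect_document_id": False,
}
```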

articles/ai-services/openai/concepts/use-your-data.md

Lines changed: 10 additions & 3 deletions
@@ -350,6 +350,10 @@ You can modify the following additional settings in the **Data parameters** sect
 |**Retrieved documents** | This parameter is an integer that can be set to 3, 5, 10, or 20, and controls the number of document chunks provided to the large language model for formulating the final response. By default, this is set to 5. The search process can be noisy and sometimes, due to chunking, relevant information might be spread across multiple chunks in the search index. Selecting a top-K number, like 5, ensures that the model can extract relevant information, despite the inherent limitations of search and chunking. However, increasing the number too high can potentially distract the model. Additionally, the maximum number of documents that can be effectively used depends on the version of the model, as each has a different context size and capacity for handling documents. If you find that responses are missing important context, try increasing this parameter. This is the `topNDocuments` parameter in the API, and is 5 by default. |
 | **Strictness** | Determines the system's aggressiveness in filtering search documents based on their similarity scores. The system queries Azure Search or other document stores, then decides which documents to provide to large language models like ChatGPT. Filtering out irrelevant documents can significantly enhance the performance of the end-to-end chatbot. Some documents are excluded from the top-K results if they have low similarity scores before forwarding them to the model. This is controlled by an integer value ranging from 1 to 5. Setting this value to 1 means that the system will minimally filter documents based on search similarity to the user query. Conversely, a setting of 5 indicates that the system will aggressively filter out documents, applying a very high similarity threshold. If you find that the chatbot omits relevant information, lower the filter's strictness (set the value closer to 1) to include more documents. Conversely, if irrelevant documents distract the responses, increase the threshold (set the value closer to 5). This is the `strictness` parameter in the API, and is set to 3 by default. |
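For reference, the two parameters above map to fields of the `data_sources` entry in a Chat Completions request. This is a sketch only: the field names `top_n_documents` and `strictness` follow the newer On Your Data request shape (older API versions spell the first one `topNDocuments`), and the endpoint and index values are placeholders:

```python
def build_data_source(endpoint, index_name, top_n_documents=5, strictness=3):
    """Build a data_sources entry carrying the Retrieved documents and Strictness settings."""
    return {
        "type": "azure_search",
        "parameters": {
            "endpoint": endpoint,
            "index_name": index_name,
            "top_n_documents": top_n_documents,  # "Retrieved documents" in the studio UI
            "strictness": strictness,            # 1 = minimal filtering, 5 = aggressive filtering
        },
    }

# Placeholder endpoint and index name for illustration only.
body = {
    "messages": [{"role": "user", "content": "What does my data say about X?"}],
    "data_sources": [build_data_source("https://contoso.search.windows.net", "my-index")],
}
```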
 
+### Uncited references
+
+It's possible for the model to return `"TYPE":"UNCITED_REFERENCE"` instead of `"TYPE":"CONTENT"` in the API for documents that are retrieved from the data source but not included in the citation. This can be useful for debugging, and you can control this behavior by modifying the **Strictness** and **Retrieved documents** runtime parameters described above.
+
 ### System message
 
 You can define a system message to steer the model's reply when using Azure OpenAI On Your Data. This message allows you to customize your replies on top of the retrieval augmented generation (RAG) pattern that Azure OpenAI On Your Data uses. The system message is used in addition to an internal base prompt to provide the experience. To support this, we truncate the system message after a specific [number of tokens](#token-usage-estimation-for-azure-openai-on-your-data) to ensure the model can answer questions using your data. If you are defining extra behavior on top of the default experience, ensure that your system prompt is detailed and explains the exact expected customization.
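The uncited-reference behavior described above can be inspected with a small sketch (the payload shape here is an assumption based on the `"TYPE"` field the section documents, and the sample list is hand-built, not live API output):

```python
def split_citations(citations):
    """Partition returned context documents into cited content and uncited references."""
    cited = [c for c in citations if c.get("TYPE") == "CONTENT"]
    uncited = [c for c in citations if c.get("TYPE") == "UNCITED_REFERENCE"]
    return cited, uncited

sample = [
    {"TYPE": "CONTENT", "content": "chunk the model cited"},
    {"TYPE": "UNCITED_REFERENCE", "content": "retrieved but not cited"},
]
cited, uncited = split_citations(sample)
```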
@@ -550,10 +554,9 @@ token_output = TokenEstimator.estimate_tokens(input_text)
 
 ## Troubleshooting
 
-### Failed ingestion jobs
-
-To troubleshoot a failed job, always look out for errors or warnings specified either in the API response or Azure OpenAI studio. Here are some of the common errors and warnings:
+To troubleshoot failed operations, always look out for errors or warnings specified either in the API response or Azure OpenAI studio. Here are some of the common errors and warnings:
 
+### Failed ingestion jobs
 
 **Quota Limitations Issues**
 
@@ -583,6 +586,10 @@ Resolution:
 
 This means the storage account isn't accessible with the given credentials. In this case, review the storage account credentials passed to the API and ensure the storage account isn't hidden behind a private endpoint (if a private endpoint isn't configured for this resource).
 
+### 503 errors when sending queries with Azure AI Search
+
+Each user message can translate to multiple search queries, all of which are sent to the search resource in parallel. This can produce throttling when the number of search replicas and partitions is low. The maximum number of queries per second that a single partition and single replica can support might not be sufficient. In this case, consider increasing your replicas and partitions, or adding sleep/retry logic in your application. See the [Azure AI Search documentation](../../../search/performance-benchmarks.md) for more information.
+
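The sleep/retry advice in the added section can be sketched as exponential backoff with jitter (a hypothetical helper; `TransientError` stands in for whatever exception your HTTP client raises on a 503):

```python
import random
import time

class TransientError(Exception):
    """Stand-in for an HTTP 503 throttling response from the search resource."""

def call_with_retries(fn, max_attempts=5, base_delay=1.0):
    """Retry fn on transient failures, backing off exponentially with jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            # Backoff: base_delay, 2*base_delay, 4*base_delay, ... plus jitter.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```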
 ## Regional availability and model support
 
 You can use Azure OpenAI On Your Data with an Azure OpenAI resource in the following regions:

articles/ai-services/openai/includes/chat-markup-language.md renamed to articles/ai-services/openai/how-to/chat-markup-language.md

Lines changed: 8 additions & 7 deletions
@@ -6,17 +6,20 @@ author: mrbullwinkle #dereklegenzoff
 ms.author: mbullwin #delegenz
 ms.service: azure-ai-openai
 ms.topic: include
-ms.date: 05/15/2023
+ms.date: 04/05/2024
 manager: nitinme
 keywords: ChatGPT
 ---
 
-## Working with the Chat models
+# Chat Markup Language ChatML (Preview)
 
 > [!IMPORTANT]
-> Using GPT-35-Turbo models with the completion endpoint as described in this article remains in preview and is only possible with gpt-35-turbo verision (0301). We strongly recommend using the GA Chat Completion API/endpoint. The Chat Completion API is the recommended method of interacting with the GPT-35-Turbo models. The Chat Completion API is also the only way to access the GPT-4 models.
+> Using GPT-3.5-Turbo models with the completion endpoint as described in this article remains in preview and is only possible with `gpt-35-turbo` version (0301), which is [slated for retirement as early as June 13th, 2024](../concepts/model-retirements.md#current-models). We strongly recommend using the [GA Chat Completion API/endpoint](./chatgpt.md). The Chat Completion API is the recommended method of interacting with the GPT-3.5-Turbo models, and it's the only way to access the GPT-4 models.
 
-The following code snippet shows the most basic way to use the GPT-35-Turbo models with ChatML. If this is your first time using these models programmatically we recommend starting with our [GPT-35-Turbo & GPT-4 Quickstart](../chatgpt-quickstart.md).
+The following code snippet shows the most basic way to use the GPT-3.5-Turbo models with ChatML. If this is your first time using these models programmatically, we recommend starting with our [GPT-35-Turbo & GPT-4 Quickstart](../chatgpt-quickstart.md).
+
+> [!NOTE]
+> In the Azure OpenAI documentation, we refer to GPT-3.5-Turbo and GPT-35-Turbo interchangeably. The official name of the model on OpenAI is `gpt-3.5-turbo`, but for Azure OpenAI, due to Azure-specific character constraints, the underlying model name is `gpt-35-turbo`.
 
 ```python
 import os
@@ -40,7 +43,7 @@ print(response['choices'][0]['text'])
 > [!NOTE]
 > The following parameters aren't available with the gpt-35-turbo model: `logprobs`, `best_of`, and `echo`. If you set any of these parameters, you'll get an error.
 
-The `<|im_end|>` token indicates the end of a message. We recommend including `<|im_end|>` token as a stop sequence to ensure that the model stops generating text when it reaches the end of the message.
+The `<|im_end|>` token indicates the end of a message. When using ChatML, we recommend including the `<|im_end|>` token as a stop sequence to ensure that the model stops generating text when it reaches the end of the message.
 
 Consider setting `max_tokens` to a slightly higher value than normal such as 300 or 500. This ensures that the model doesn't stop generating text before it reaches the end of the message.
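The ChatML pattern above can be sketched as a prompt builder (a minimal illustration; the `<|im_start|>`/`<|im_end|>` tokens are the documented ChatML delimiters, while the helper name is ours):

```python
def build_chatml_prompt(messages):
    """Format a list of {role, content} dicts as a ChatML prompt for the legacy completions endpoint."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}\n<|im_end|>" for m in messages]
    # The trailing "<|im_start|>assistant" cues the model to answer next;
    # pass stop=["<|im_end|>"] in the completion call so generation halts there.
    return "\n".join(parts) + "\n<|im_start|>assistant\n"

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is Azure OpenAI?"},
])
```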

@@ -53,8 +56,6 @@ Unlike previous GPT-3 and GPT-3.5 models, the `gpt-35-turbo` model as well as th
 
 You can find the model retirement dates for these models on our [models](../concepts/models.md) page.
 
-<a id="chatml"></a>
-
 ## Working with Chat Markup Language (ChatML)
 
 > [!NOTE]

articles/ai-services/openai/how-to/chatgpt.md

Lines changed: 5 additions & 23 deletions
@@ -7,35 +7,17 @@ ms.author: mbullwin #delegenz
 ms.service: azure-ai-openai
 ms.custom: build-2023, build-2023-dataai
 ms.topic: how-to
-ms.date: 03/29/2024
+ms.date: 04/05/2024
 manager: nitinme
 keywords: ChatGPT
 zone_pivot_groups: openai-chat
 ---
 
-# Learn how to work with the GPT-35-Turbo and GPT-4 models
+# Learn how to work with the GPT-3.5-Turbo and GPT-4 models
 
-The GPT-35-Turbo and GPT-4 models are language models that are optimized for conversational interfaces. The models behave differently than the older GPT-3 models. Previous models were text-in and text-out, meaning they accepted a prompt string and returned a completion to append to the prompt. However, the GPT-35-Turbo and GPT-4 models are conversation-in and message-out. The models expect input formatted in a specific chat-like transcript format, and return a completion that represents a model-written message in the chat. While this format was designed specifically for multi-turn conversations, you'll find it can also work well for non-chat scenarios too.
+The GPT-3.5-Turbo and GPT-4 models are language models that are optimized for conversational interfaces. The models behave differently than the older GPT-3 models. Previous models were text-in and text-out, meaning they accepted a prompt string and returned a completion to append to the prompt. However, the GPT-3.5-Turbo and GPT-4 models are conversation-in and message-out. The models expect input formatted in a specific chat-like transcript format, and return a completion that represents a model-written message in the chat. While this format was designed specifically for multi-turn conversations, it can also work well for non-chat scenarios.
 
-In Azure OpenAI there are two different options for interacting with these type of models:
+This article walks you through getting started with the GPT-3.5-Turbo and GPT-4 models. It's important to use the techniques described here to get the best results. If you try to interact with the models the same way you did with the older model series, the models will often be verbose and provide less useful responses.
 
-- Chat Completion API.
-- Completion API with Chat Markup Language (ChatML).
+[!INCLUDE [Chat Completions](../includes/chat-completion.md)]
 
-The Chat Completion API is a new dedicated API for interacting with the GPT-35-Turbo and GPT-4 models. This API is the preferred method for accessing these models. **It is also the only way to access the new GPT-4 models**.
-
-ChatML uses the same [completion API](../reference.md#completions) that you use for other models like text-davinci-002, it requires a unique token based prompt format known as Chat Markup Language (ChatML). This provides lower level access than the dedicated Chat Completion API, but also requires additional input validation, only supports gpt-35-turbo models, and **the underlying format is more likely to change over time**.
-
-This article walks you through getting started with the GPT-35-Turbo and GPT-4 models. It's important to use the techniques described here to get the best results. If you try to interact with the models the same way you did with the older model series, the models will often be verbose and provide less useful responses.
-
-::: zone pivot="programming-language-chat-completions"
-
-[!INCLUDE [Studio quickstart](../includes/chat-completion.md)]
-
-::: zone-end
-
-::: zone pivot="programming-language-chat-ml"
-
-[!INCLUDE [Python SDK quickstart](../includes/chat-markup-language.md)]
-
-::: zone-end

articles/ai-services/openai/how-to/use-web-app.md

Lines changed: 2 additions & 0 deletions
@@ -52,6 +52,8 @@ Sample source code for the web app is available on [GitHub](https://github.com/m
 
 We recommend pulling changes from the `main` branch for the web app's source code frequently to ensure you have the latest bug fixes, API version, and improvements. Additionally, the web app must be synchronized every time the API version being used is [retired](../api-version-deprecation.md#retiring-soon).
 
+Consider clicking the **watch** or **star** buttons on the web app's [GitHub](https://github.com/microsoft/sample-app-aoai-chatGPT) repo to be notified about changes and updates to the source code.
+
 **If you haven't customized the app:**
 * You can follow the synchronization steps below

articles/ai-services/openai/how-to/use-your-data-securely.md

Lines changed: 2 additions & 2 deletions
@@ -8,7 +8,7 @@ ms.service: azure-ai-openai
 ms.topic: how-to
 author: aahill
 ms.author: aahi
-ms.date: 02/13/2024
+ms.date: 04/05/2024
 recommendations: false
 ---
 
@@ -332,7 +332,7 @@ Make sure your sign-in credential has `Cognitive Services OpenAI Contributor` ro
 ### Ingestion API
 
-See the [ingestion API reference article](/azure/ai-services/openai/reference#start-an-ingestion-job) for details on the request and response objects used by the ingestion API.
+See the [ingestion API reference article](/rest/api/azureopenai/ingestion-jobs?context=/azure/ai-services/openai/context/context) for details on the request and response objects used by the ingestion API.
 
 More notes:

articles/ai-services/openai/includes/chat-completion.md

Lines changed: 16 additions & 3 deletions
@@ -6,7 +6,7 @@ author: mrbullwinkle #dereklegenzoff
 ms.author: mbullwin #delegenz
 ms.service: azure-ai-openai
 ms.topic: include
-ms.date: 11/02/2023
+ms.date: 04/05/2024
 manager: nitinme
 keywords: ChatGPT

@@ -16,6 +16,9 @@ keywords: ChatGPT
 
 The following code snippet shows the most basic way to use the GPT-3.5-Turbo and GPT-4 models with the Chat Completion API. If this is your first time using these models programmatically, we recommend starting with our [GPT-3.5-Turbo & GPT-4 Quickstart](../chatgpt-quickstart.md).
 
+> [!NOTE]
+> In the Azure OpenAI documentation, we refer to GPT-3.5-Turbo and GPT-35-Turbo interchangeably. The official name of the model on OpenAI is `gpt-3.5-turbo`, but for Azure OpenAI, due to Azure-specific character constraints, the underlying model name is `gpt-35-turbo`.
+
 # [OpenAI Python 1.x](#tab/python-new)
 
 ```python
@@ -173,7 +176,7 @@ Every response includes a `finish_reason`. The possible values for `finish_reaso
 * **stop**: API returned complete model output.
 * **length**: Incomplete model output due to max_tokens parameter or token limit.
 * **content_filter**: Omitted content due to a flag from our content filters.
-* **null**:API response still in progress or incomplete.
+* **null**: API response still in progress or incomplete.
 
 Consider setting `max_tokens` to a slightly higher value than normal such as 300 or 500. This ensures that the model doesn't stop generating text before it reaches the end of the message.
@@ -361,7 +364,7 @@ while True:
 
 ---
 
-When you run the code above you will get a blank console window. Enter your first question in the window and then hit enter. Once the response is returned, you can repeat the process and keep asking questions.
+When you run the code above, you'll get a blank console window. Enter your first question in the window and then hit enter. Once the response is returned, you can repeat the process and keep asking questions.
 
 ## Managing conversations

@@ -542,6 +545,16 @@ An alternative approach is to limit the conversation duration to the max token l
542545

543546
The token counting portion of the code demonstrated previously is a simplified version of one of [OpenAI's cookbook examples](https://github.com/openai/openai-cookbook/blob/main/examples/How_to_format_inputs_to_ChatGPT_models.ipynb).
544547

548+
## Troubleshooting
549+
550+
### Don't use ChatML syntax with the Chat Completions endpoint
551+
552+
We have found that some customers will try using the [legacy ChatML syntax](../how-to/chat-markup-language.md) with the chat completion endpoints and newer models. ChatML was a preview capability that only worked with the legacy completions endpoint with the `gpt-35-turbo` version 0301 model which is [slated for retirement](../concepts/model-retirements.md). Attempting to use ChatML syntax with newer models and the chat completions endpoint can result errors as well as unexpected model response behavior, and is not recommended.
553+
554+
| Error |Cause | Solution |
555+
|---|---|---|
556+
| 400 - *Failed to generate output due to special tokens in the input.* | Your prompt contains legacy ChatML tokens not recognized or supported by the model/endpoint. | Ensure that your prompt/messages array does not contain any legacy ChatML tokens. If you are upgrading from a legacy model, please exclude all special tokens before submitting an API request to the model.|
557+
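The solution in the table can be sketched as a scrub step that removes the ChatML delimiters before sending a request (a hypothetical helper; the token pattern covers the `<|im_start|>`/`<|im_end|>` delimiters discussed above):

```python
import re

CHATML_TOKENS = re.compile(r"<\|im_(start|end)\|>")

def strip_chatml_tokens(messages):
    """Remove legacy ChatML special tokens from a Chat Completions messages array."""
    return [
        {**m, "content": CHATML_TOKENS.sub("", m["content"]).strip()}
        for m in messages
    ]

cleaned = strip_chatml_tokens([
    {"role": "user", "content": "<|im_start|>user\nHello<|im_end|>"},
])
```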
 ## Next steps
 
 * [Learn more about Azure OpenAI](../overview.md).

articles/ai-services/openai/toc.yml

Lines changed: 4 additions & 0 deletions
@@ -111,6 +111,10 @@ items:
       href: ./how-to/integrate-synapseml.md
     - name: Deploy and use web apps
       href: ./how-to/use-web-app.md
+    - name: Legacy
+      items:
+      - name: Chat Markup Language (ChatML)
+        href: ./how-to/chat-markup-language.md
     - name: Content filtering
       href: ./how-to/content-filters.md
     - name: Use blocklists

articles/ai-studio/how-to/develop-in-vscode.md

Lines changed: 2 additions & 2 deletions
@@ -9,8 +9,8 @@ ms.custom:
 ms.topic: how-to
 ms.date: 1/10/2024
 ms.reviewer: eur
-ms.author: eur
-author: eric-urban
+ms.author: sgilley
+author: sdgilley
 ---
 
 # Get started with Azure AI projects in VS Code
