articles/ai-services/openai/concepts/assistants.md
+6 -6 (6 additions & 6 deletions)
Original file line number
Diff line number
Diff line change
@@ -3,11 +3,11 @@ title: Azure OpenAI Service Assistants API concepts
3
3
titleSuffix: Azure OpenAI Service
4
4
description: Learn about the concepts behind the Azure OpenAI Assistants API.
5
5
ms.topic: conceptual
6
-
ms.date: 03/04/2024
6
+
ms.date: 08/21/2024
7
7
ms.service: azure-ai-openai
8
8
manager: nitinme
9
-
author: mrbullwinkle
10
-
ms.author: mbullwin
9
+
author: aahill
10
+
ms.author: aahi
11
11
recommendations: false
12
12
---
13
13
@@ -71,21 +71,21 @@ The Assistants API has support for several parameters that let you customize the
71
71
72
72
## Context window management
73
73
74
-
Assistants automatically truncates text to ensure it stays within the model's maximum context length. You can customize this behavior by specifying the maximum tokens you'd like a run to utilize and/or the maximum number of recent messages you'd like to include in a run.
74
+
Assistants automatically truncate text to ensure it stays within the model's maximum context length. You can customize this behavior by specifying the maximum tokens you'd like a run to utilize and/or the maximum number of recent messages you'd like to include in a run.
75
75
76
76
### Max completion and max prompt tokens
77
77
78
78
To control the token usage in a single Run, set `max_prompt_tokens` and `max_completion_tokens` when you create the Run. These limits apply to the total number of tokens used in all completions throughout the Run's lifecycle.
79
79
80
-
For example, initiating a Run with `max_prompt_tokens` set to 500 and `max_completion_tokens` set to 1000 means the first completion will truncate the thread to 500 tokens and cap the output at 1000 tokens. If only 200 prompt tokens and 300 completion tokens are used in the first completion, the second completion will have available limits of 300 prompt tokens and 700 completion tokens.
80
+
For example, initiating a Run with `max_prompt_tokens` set to 500 and `max_completion_tokens` set to 1000 means the first completion will truncate the thread to 500 tokens and cap the output at 1,000 tokens. If only 200 prompt tokens and 300 completion tokens are used in the first completion, the second completion will have available limits of 300 prompt tokens and 700 completion tokens.
81
81
82
82
If a completion reaches the `max_completion_tokens` limit, the Run will terminate with a status of incomplete, and details will be provided in the `incomplete_details` field of the Run object.
83
83
84
84
When using the File Search tool, we recommend setting the `max_prompt_tokens` to no less than 20,000. For longer conversations or multiple interactions with File Search, consider increasing this limit to 50,000, or ideally, removing the `max_prompt_tokens` limits altogether to get the highest quality results.
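
The budget arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration of how the remaining limits are derived; `RunTokenBudget` and its methods are hypothetical names and are not part of the Assistants API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RunTokenBudget:
    """Hypothetical helper that mirrors the per-Run token limits described above."""

    max_prompt_tokens: int
    max_completion_tokens: int
    prompt_used: int = 0
    completion_used: int = 0

    def record_completion(self, prompt_tokens: int, completion_tokens: int) -> None:
        # Limits apply across all completions in the Run, so usage accumulates.
        self.prompt_used += prompt_tokens
        self.completion_used += completion_tokens

    @property
    def remaining(self) -> Tuple[int, int]:
        # Remaining (prompt, completion) tokens for the next completion.
        return (
            max(self.max_prompt_tokens - self.prompt_used, 0),
            max(self.max_completion_tokens - self.completion_used, 0),
        )


budget = RunTokenBudget(max_prompt_tokens=500, max_completion_tokens=1000)
budget.record_completion(prompt_tokens=200, completion_tokens=300)
print(budget.remaining)  # (300, 700)
```

The printed pair matches the 300 prompt tokens and 700 completion tokens that the text gives for the second completion.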
85
85
86
86
## Truncation strategy
87
87
88
-
You may also specify a truncation strategy to control how your thread should be rendered into the model's context window. Using a truncation strategy of type `auto` will use OpenAI's default truncation strategy. Using a truncation strategy of type `last_messages` will allow you to specify the number of the most recent messages to include in the context window.
88
+
You can also specify a truncation strategy to control how your thread should be rendered into the model's context window. Using a truncation strategy of type `auto` will use OpenAI's default truncation strategy. Using a truncation strategy of type `last_messages` will allow you to specify the number of the most recent messages to include in the context window.
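
As a rough sketch, the two strategy types could be expressed as a small options dictionary. `build_run_options` is a hypothetical helper, and the nested `last_messages` key is an assumption about how the message count is passed alongside the `last_messages` type; check the API reference for the exact request shape.

```python
from typing import Any, Dict, Optional


def build_run_options(
    max_prompt_tokens: Optional[int] = None,
    max_completion_tokens: Optional[int] = None,
    last_messages: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: collect context-window settings for a Run."""
    options: Dict[str, Any] = {}
    if max_prompt_tokens is not None:
        options["max_prompt_tokens"] = max_prompt_tokens
    if max_completion_tokens is not None:
        options["max_completion_tokens"] = max_completion_tokens

    if last_messages is None:
        # Fall back to the default truncation behavior.
        options["truncation_strategy"] = {"type": "auto"}
    else:
        if last_messages < 1:
            raise ValueError("last_messages must be at least 1")
        # Keep only the N most recent messages in the context window.
        # Assumption: the count is sent under a "last_messages" key.
        options["truncation_strategy"] = {
            "type": "last_messages",
            "last_messages": last_messages,
        }
    return options


options = build_run_options(max_prompt_tokens=20000, last_messages=10)
print(options["truncation_strategy"])
```

A dictionary like this would typically be unpacked as keyword arguments when creating the Run, assuming your client library accepts these parameter names.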
89
89
90
90
## See also
91
91
* Learn more about Assistants and [File Search](../how-to/file-search.md)
articles/ai-services/openai/concepts/use-your-data.md
+9 -9 (9 additions & 9 deletions)
@@ -345,7 +345,7 @@ You can deploy to a standalone Teams app directly from Azure OpenAI Studio. Foll
345
345
> [!NOTE]
346
346
> The citation experience is available in **Debug (Edge)** or **Debug (Chrome)** only.
347
347
348
-
1. After you've tested your copilot, you can provision, deploy, and publish your Teams app by selecting the **Teams Toolkit Extension** on the left pane in Visual Studio Code. Run the separate provision, deploy, and publish stages in the **Lifecycle** section. You may be asked to sign in to your Microsoft 365 account where you have permissions to upload custom apps and your Azure Account.
348
+
1. After you've tested your copilot, you can provision, deploy, and publish your Teams app by selecting the **Teams Toolkit Extension** on the left pane in Visual Studio Code. Run the separate provision, deploy, and publish stages in the **Lifecycle** section. You might be asked to sign in to your Microsoft 365 account where you have permissions to upload custom apps and your Azure Account.
349
349
350
350
1. Provision your app: (detailed instructions in [Provision cloud resources](/microsoftteams/platform/toolkit/provision))
351
351
@@ -428,26 +428,26 @@ Once you select add your dataset, you can use the **System message** section in
428
428
429
429
**Define a role**
430
430
431
-
You can define a role that you want your assistant. For example, if you are building a support bot, you can add *"You are an expert incident support assistant that helps users solve new issues."*.
431
+
You can define a role that you want your assistant. For example, if you are building a support bot, you can add *"You are an expert incident support assistant that helps users solve new issues."*
432
432
433
433
**Define the type of data being retrieved**
434
434
435
435
You can also add the nature of data you are providing to assistant.
436
-
* Define the topic or scope of your dataset, like "financial report", "academic paper", or "incident report". For example, for technical support you might add *"You answer queries using information from similar incidents in the retrieved documents."*.
437
-
* If your data has certain characteristics, you can add these details to the system message. For example, if your documents are in Japanese, you can add *"You retrieve Japanese documents and you should read them carefully in Japanese and answer in Japanese."*.
438
-
* If your documents include structured data like tables from a financial report, you can also add this fact into the system prompt. For example, if your data has tables, you might add *"You are given data in form of tables pertaining to financial results and you should read the table line by line to perform calculations to answer user questions."*.
436
+
* Define the topic or scope of your dataset, like "financial report," "academic paper," or "incident report." For example, for technical support you might add *"You answer queries using information from similar incidents in the retrieved documents."*
437
+
* If your data has certain characteristics, you can add these details to the system message. For example, if your documents are in Japanese, you can add *"You retrieve Japanese documents and you should read them carefully in Japanese and answer in Japanese."*
438
+
* If your documents include structured data like tables from a financial report, you can also add this fact into the system prompt. For example, if your data has tables, you might add *"You are given data in form of tables pertaining to financial results and you should read the table line by line to perform calculations to answer user questions."*
439
439
440
440
**Define the output style**
441
441
442
-
You can also change the model's output by defining a system message. For example, if you want to ensure that the assistant answers are in French, you can add a prompt like *"You are an AI assistant that helps users who understand French find information. The user questions can be in English or French. Please read the retrieved documents carefully and answer them in French. Please translate the knowledge from documents to French to ensure all answers are in French."*.
442
+
You can also change the model's output by defining a system message. For example, if you want to ensure that the assistant answers are in French, you can add a prompt like *"You are an AI assistant that helps users who understand French find information. The user questions can be in English or French. Please read the retrieved documents carefully and answer them in French. Please translate the knowledge from documents to French to ensure all answers are in French."*
443
443
444
444
**Reaffirm critical behavior**
445
445
446
-
Azure OpenAI On Your Data works by sending instructions to a large language model in the form of prompts to answer user queries using your data. If there is a certain behavior that is critical to the application, you can repeat the behavior in system message to increase its accuracy. For example, to guide the model to only answer from documents, you can add "*Please answer using retrieved documents only, and without using your knowledge. Please generate citations to retrieved documents for every claim in your answer. If the user question cannot be answered using retrieved documents, please explain the reasoning behind why documents are relevant to user queries. In any case, don't answer using your own knowledge."*.
446
+
Azure OpenAI On Your Data works by sending instructions to a large language model in the form of prompts to answer user queries using your data. If there is a certain behavior that is critical to the application, you can repeat the behavior in system message to increase its accuracy. For example, to guide the model to only answer from documents, you can add "*Please answer using retrieved documents only, and without using your knowledge. Please generate citations to retrieved documents for every claim in your answer. If the user question cannot be answered using retrieved documents, please explain the reasoning behind why documents are relevant to user queries. In any case, don't answer using your own knowledge."*
447
447
448
448
**Prompt Engineering tricks**
449
449
450
-
There are many tricks in prompt engineering that you can try to improve the output. One example is chain-of-thought prompting where you can add *"Let’s think step by step about information in retrieved documents to answer user queries. Extract relevant knowledge to user queries from documents step by step and form an answer bottom up from the extracted information from relevant documents."*.
450
+
There are many tricks in prompt engineering that you can try to improve the output. One example is chain-of-thought prompting where you can add *"Let’s think step by step about information in retrieved documents to answer user queries. Extract relevant knowledge to user queries from documents step by step and form an answer bottom up from the extracted information from relevant documents."*
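
One way to combine the pieces described in this section is to assemble the system message from separate parts. The sketch below is illustrative only: `build_system_message` is a hypothetical helper, and the sample strings are taken from the examples in the text.

```python
from typing import Optional


def build_system_message(
    role: str,
    data_description: Optional[str] = None,
    output_style: Optional[str] = None,
    critical_behavior: Optional[str] = None,
) -> str:
    """Hypothetical helper: join the non-empty parts into one system message."""
    parts = [role, data_description, output_style, critical_behavior]
    # Skip parts that were not provided and separate the rest with spaces.
    return " ".join(part.strip() for part in parts if part)


system_message = build_system_message(
    role=(
        "You are an expert incident support assistant that helps users "
        "solve new issues."
    ),
    data_description=(
        "You answer queries using information from similar incidents in "
        "the retrieved documents."
    ),
    critical_behavior=(
        "Please answer using retrieved documents only, and without using "
        "your knowledge."
    ),
)
print(system_message)
```

Keeping each part separate makes it easier to test one change at a time, for example adding an output-style instruction without touching the role.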
451
451
452
452
> [!NOTE]
453
453
> The system message is used to modify how GPT assistant responds to a user question based on retrieved documentation. It doesn't affect the retrieval process. If you'd like to provide instructions for the retrieval process, it is better to include them in the questions.
@@ -654,7 +654,7 @@ This means the storage account isn't accessible with the given credentials. In t
654
654
655
655
### 503 errors when sending queries with Azure AI Search
656
656
657
-
Each user message can translate to multiple search queries, all of which get sent to the search resource in parallel. This can produce throttling behavior when the number of search replicas and partitions is low. The maximum number of queries per second that a single partition and single replica can support may not be sufficient. In this case, consider increasing your replicas and partitions, or adding sleep/retry logic in your application. See the [Azure AI Search documentation](../../../search/performance-benchmarks.md) for more information.
657
+
Each user message can translate to multiple search queries, all of which get sent to the search resource in parallel. This can produce throttling behavior when the number of search replicas and partitions is low. The maximum number of queries per second that a single partition and single replica can support might not be sufficient. In this case, consider increasing your replicas and partitions, or adding sleep/retry logic in your application. See the [Azure AI Search documentation](../../../search/performance-benchmarks.md) for more information.
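
The sleep/retry logic mentioned above could look something like the following sketch, which uses exponential backoff with jitter. `ThrottledError` and `call_with_retry` are hypothetical names; substitute whatever exception your client raises for an HTTP 503 response, and tune the attempt count and delays for your workload.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ThrottledError(Exception):
    """Hypothetical stand-in for the error your client raises on HTTP 503."""


def call_with_retry(
    send_query: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send_query, backing off and retrying when it is throttled."""
    for attempt in range(max_attempts):
        try:
            return send_query()
        except ThrottledError:
            if attempt == max_attempts - 1:
                # Out of attempts: surface the error to the caller.
                raise
            # Double the wait each attempt and add jitter so parallel
            # callers don't all retry at the same moment.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter is a design choice that lets you replace the real wait with a no-op in tests.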
The links above reference the OpenAI API for Python. There is no Azure-specific OpenAI Python SDK. [Learn how to switch between the OpenAI services and Azure OpenAI services](/azure/ai-services/openai/how-to/switching-endpoints).
35
+
These links reference the OpenAI API for Python. There's no Azure-specific OpenAI Python SDK. [Learn how to switch between the OpenAI services and Azure OpenAI services](/azure/ai-services/openai/how-to/switching-endpoints).
36
36
37
37
::: zone-end
38
38
@@ -42,7 +42,7 @@ The links above reference the OpenAI API for Python. There is no Azure-specific
42
42
43
43
::: zone-end
44
44
45
-
In this quickstart you can use your own data with Azure OpenAI models. Using Azure OpenAI's models on your data can provide you with a powerful conversational AI platform that enables faster and more accurate communication.
45
+
In this quickstart, you can use your own data with Azure OpenAI models. Using Azure OpenAI's models on your data can provide you with a powerful conversational AI platform that enables faster and more accurate communication.
46
46
47
47
## Prerequisites
48
48
@@ -56,7 +56,7 @@ In this quickstart you can use your own data with Azure OpenAI models. Using Azu
56
56
57
57
::: zone pivot="programming-language-javascript"
58
58
59
-
-[LTS versions of Node.js](https://github.com/nodejs/release#release-schedule)
59
+
-[Long Term Support (LTS) versions of Node.js](https://github.com/nodejs/release#release-schedule)