---
title: Best practices for using Azure OpenAI On Your Data
titleSuffix: Azure OpenAI Service
description: Learn about the best practices for using Azure OpenAI On Your Data, along with how to fix common problems.
ms.service: azure-ai-openai
ms.topic: conceptual
ms.date: 04/08/2024
---

This article can help guide you through common problems in developing a solution by using Azure OpenAI On Your Data.

The workflow for Azure OpenAI On Your Data has two major parts:

* **Data ingestion**: This is the stage where you connect your data with Azure OpenAI On Your Data. In this stage, user documents are processed and broken down into smaller chunks and then indexed. The chunks are 1,024 tokens by default, but more chunking options are available.

    Also in this stage, you can choose an embedding model for creating embeddings, or a preferred search type. Embeddings are representations of values or objects (like text, images, and audio) that are designed to be consumed by machine learning models and semantic search algorithms.

    The output of this process is an index that will later be used for retrieving documents during inference.

* **Inferencing**: This is the stage where users chat with their data by using a studio, a deployed web app, or direct API calls. In this stage, users can set various model parameters (such as `temperature` and `top_P`) and system parameters (such as `strictness` and `topNDocuments`).

Think of ingestion as a separate process before inferencing. After the index is created, Azure OpenAI On Your Data goes through the following steps to generate a good response to user questions:

1. **Intent generation**: Azure OpenAI On Your Data generates multiple search intents by using user questions and conversation history. It generates multiple search intents to address any ambiguity in the users' questions, add more context by using the conversation history to retrieve holistic information in the retrieval stage, and provide any additional information to make the final response thorough and useful.
2. **Retrieval**: By using the search type provided during the ingestion, Azure OpenAI On Your Data retrieves a list of relevant document chunks that correspond to each of the search intents.
3. **Filtration**: Azure OpenAI On Your Data uses the strictness setting to filter out the retrieved documents that are considered irrelevant according to the strictness threshold. The `strictness` parameter controls how aggressive the filtration is.
4. **Re-ranking**: Azure OpenAI On Your Data re-ranks the remaining document chunks retrieved for each of the search intents. The purpose of re-ranking is to produce a combined list of the most relevant documents retrieved for all search intents.
5. **Parameter inclusion**: The top chunks from the re-ranked list, up to the number that the `topNDocuments` parameter specifies, are included in the prompt sent to the model, along with the question, the conversation history, and the role information or system message.
6. **Response generation**: The model uses the provided context to generate the final response along with citations.

## How to structure a debugging investigation

When you see an unfavorable response to a query, it might be the result of different problems in the workflow.

### Step 1: Check for retrieval problems

Use the REST API to check if the correct document chunks are present in the retrieved documents. In the API response, check the citations in the `tool` message.
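
As a rough sketch of that check, continuing from the request in the earlier example (the field names follow the extensions API response shape; treat them as assumptions to verify against your API version):

```python
import json

# The `tool` message content is a JSON string that carries the retrieved
# citations, plus the generated search intents used during retrieval.
for message in response.json()["choices"][0]["messages"]:
    if message["role"] == "tool":
        retrieval = json.loads(message["content"])
        for i, citation in enumerate(retrieval.get("citations", [])):
            # Verify that the document chunks you expect actually show up here.
            print(i, citation.get("title"), citation.get("filepath"))
        # `intent` (checked again in step 3) is a JSON-encoded list of intents.
        print("intents:", retrieval.get("intent"))
```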

### Step 2: Check for generation problems

If the correct document chunks appear in the retrieved documents, you're likely encountering a problem with content generation. Consider using a more powerful model.

You can also tune the finer aspects of the response by changing the role information or system message.

### Step 3: Check the rest of the funnel

If the correct document chunks don't appear in the retrieved documents, you need to dig farther down the funnel:

* It's possible that a correct document chunk was retrieved but was filtered out based on strictness. In this case, try reducing the `strictness` parameter.

* It's possible that a correct document chunk wasn't part of the top `topNDocuments` chunks. In this case, increase the parameter's value.

* It's possible that your index fields are incorrectly mapped, so retrieval might not work well. This mapping is particularly relevant if you're using a pre-existing data source. (That is, you didn't create the index by using the studio or offline scripts available on [GitHub](https://github.com/microsoft/sample-app-aoai-chatGPT/tree/main/scripts).) For more information on mapping index fields, see the [how-to article](../concepts/use-your-data.md?tabs=ai-search#index-field-mapping).

* It's possible that the intent generation step isn't working well. In the API response, check the `intents` fields in the `tool` message.
68
68
69
-
Some models are known to not work well for intent generation. For example, if you're using the `GPT-35-turbo-1106` model version, consider using a later model, such as `gpt-35-turbo` (0125) or `GPT-4-1106-preview`.
69
+
Some models don't work well for intent generation. For example, if you're using the `GPT-35-turbo-1106` model version, consider using a later model, such as `gpt-35-turbo` (0125) or `GPT-4-1106-preview`.
70
70
71
-
* Do you have semistructured data in your documents, such as numerous tables? There might be an ingestion problem. Your data might need special handling during ingestion:
71
+
* Do you have semistructured data in your documents, such as numerous tables? There might be an ingestion problem. Your data might need special handling during ingestion.

  * If the file format is PDF, we offer optimized ingestion for tables by using the offline scripts available on [GitHub](https://github.com/microsoft/sample-app-aoai-chatGPT/tree/main/scripts). To use the scripts, you need to have a [Document Intelligence](../../document-intelligence/overview.md) resource and use the [layout model](../../document-intelligence/concept-layout.md).

  * You can adjust your [chunk size](../concepts/use-your-data.md#chunk-size-preview) to make sure your largest table fits within it.

* Are you converting a semistructured data type, such as JSON or XML, to a PDF document? This conversion might cause an ingestion problem because structured information needs a chunking strategy that's different from purely text content.

* If none of the preceding items apply, you might be encountering a retrieval problem. Consider using a more powerful `query_type` value. Based on our benchmarking, `semantic` and `vectorSemanticHybrid` are preferred; see the sketch after this list.

## Common problems

The following sections list possible solutions to problems that you might encounter when you're developing a solution by using Azure OpenAI On Your Data.

### The information is correct, but the model responds with "The requested information isn't present in the retrieved documents. Please try a different query or topic."

See [step 3](#step-3-check-the-rest-of-the-funnel) in the preceding debugging process.

### A response is from your data, but it isn't relevant or correct in the context of the question

See the preceding debugging process, starting at [step 1](#step-1-check-for-retrieval-problems).

### The model isn't following the role information or system message

* Make sure that instructions in the role information are consistent with the [Responsible AI guidelines](/legal/cognitive-services/openai/overview?context=%2Fazure%2Fai-services%2Fopenai%2Fcontext%2Fcontext). The model likely won't follow role information if it contradicts those guidelines.

* Ensure that your role information follows the [established limits](../concepts/use-your-data.md#token-usage-estimation-for-azure-openai-on-your-data) for it. Each model has an implicit token limit for the role information. Beyond that limit, the information is truncated.

* Use the prompt engineering technique of repeating an important instruction at the end of the prompt. Putting a double asterisk (`**`) on both sides of the important information can also help.

* Upgrade to a newer GPT-4 model, because it's likely to follow your instructions better than GPT-3.5.

### Responses have inconsistencies

* Ensure that you're using a low `temperature` value. We recommend setting it to `0`.

* By using the REST API, check if the generated search intents are the same both times. If the intents are different, try a more powerful model such as GPT-4 to see if the chosen model affects the problem. If the intents are the same or similar, try reducing `strictness` or increasing `topNDocuments`. The sketch after the following note shows one way to compare intents.

> [!NOTE]
> Although the question is the same, the conversation history is added to the context and affects how the model responds to the same question over a long session.
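
One way to run that comparison is sketched below. It reuses the request shape from the earlier examples; the helper name and all placeholder values are illustrative assumptions, and the `intent` field is read on the assumption that it's a JSON-encoded list, as in the extensions API response.

```python
import json
import requests

def get_intents(question: str) -> list:
    # Hypothetical debugging helper; reuses the earlier request shape.
    # Temperature 0 removes sampling variance, so remaining differences point
    # at intent generation or retrieval rather than the model's sampling.
    body = {
        "temperature": 0,
        "messages": [{"role": "user", "content": question}],
        "dataSources": [{
            "type": "AzureCognitiveSearch",
            "parameters": {
                "endpoint": "https://<search-resource>.search.windows.net",
                "key": "<search-admin-key>",
                "indexName": "<index-name>",
            },
        }],
    }
    response = requests.post(
        "https://<aoai-resource>.openai.azure.com/openai/deployments/"
        "<chat-deployment-name>/extensions/chat/completions",
        params={"api-version": "2023-08-01-preview"},
        headers={"api-key": "<aoai-api-key>"},
        json=body,
    )
    for message in response.json()["choices"][0]["messages"]:
        if message["role"] == "tool":
            # The `intent` field is a JSON-encoded list of search intents.
            return json.loads(json.loads(message["content"]).get("intent", "[]"))
    return []

# Ask the same question twice and compare the generated search intents.
first = get_intents("What does the warranty cover?")
second = get_intents("What does the warranty cover?")
print("intents match:", first == second, first, second)
```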

### Intents are empty or wrong

* Refer to [step 3](#step-3-check-the-rest-of-the-funnel) in the preceding debugging process.

* If intents are irrelevant, the problem might be that the intent generation step lacks context. Intent generation considers only the user question and conversation history. It doesn't consider the role information or the document chunks. You might consider prefixing each user question with a short context string to help the intent generation step, as in the sketch after this list.

### You set `inScope=true` or selected the checkbox for restricting responses to data, but the model still responds to out-of-domain questions

* Consider increasing `strictness`.

* Add the following instruction in your role information or system message:

  `You are only allowed to respond to questions based on the retrieved documents.`

* Set the `inScope` parameter to `true`. The parameter isn't a hard switch, but setting it to `true` encourages the model to stay restricted. The sketch after this list shows these settings together.
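
Taken together, a hedged sketch of those settings in the data source parameters might look like the following; the role information wording comes from the suggestion above, and every placeholder is an assumption:

```python
# Sketch only: settings that encourage the model to stay within your data.
search_parameters = {
    "endpoint": "https://<search-resource>.search.windows.net",
    "key": "<search-admin-key>",
    "indexName": "<index-name>",
    "inScope": True,   # encourages, but doesn't guarantee, restriction
    "strictness": 4,   # raise to filter weakly relevant chunks more aggressively
    "roleInformation": (
        "You are a helpful assistant. "
        "You are only allowed to respond to questions based on the retrieved documents."
    ),
}
```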

### A response is correct but is occasionally missing document references or citations

* Consider upgrading to a GPT-4 model if you're not already using one. GPT-4 is generally more consistent with citation generation.

* Try to emphasize citation generation in the response by adding `You must generate citation based on the retrieved documents in the response` in the role information.

* Try adding a prefix in the user query: `You must generate citation to the retrieved documents in the response to the user question \n User Question: {actual user question}`. The sketch after this list shows both techniques.
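
A short sketch of both techniques together; the instruction strings are the ones suggested above, and where each one goes (role information versus a user-query prefix) is the point being illustrated:

```python
# Emphasize citations in the role information (system-level instruction).
role_information = (
    "You are a helpful assistant. "
    "You must generate citation based on the retrieved documents in the response."
)

# Or prefix the user question itself with the citation instruction.
actual_user_question = "What does the warranty cover?"
user_content = (
    "You must generate citation to the retrieved documents in the response "
    "to the user question \n User Question: " + actual_user_question
)
messages = [{"role": "user", "content": user_content}]
```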