
Commit dd041f8

edit pass: azure-openai-on-your-data
1 parent 2a92e14 commit dd041f8

1 file changed: 44 additions & 42 deletions

@@ -1,7 +1,7 @@
 ---
 title: Best practices for using Azure OpenAI On Your Data
 titleSuffix: Azure OpenAI Service
-description: Learn about the best practices for using Azure OpenAI On Your Data.
+description: Learn about the best practices for using Azure OpenAI On Your Data, along with how to fix common problems.
 ms.service: azure-ai-openai
 ms.topic: conceptual
 ms.date: 04/08/2024
@@ -20,22 +20,22 @@ This article can help guide you through the common problems in developing a solu
 
 The workflow for Azure OpenAI On Your Data has two major parts:
 
-* **Data ingestion**: This is the stage where you connect your data with Azure OpenAI On Your Data. In this stage, user documents are processed and broken down into smaller chunks (1,024 tokens by default, but there are more chunking options available) and then indexed.
+* **Data ingestion**: This is the stage where you connect your data with Azure OpenAI On Your Data. In this stage, user documents are processed and broken down into smaller chunks and then indexed. The chunks are 1,024 tokens by default, but more chunking options are available.
 
-  This is the stage where you can choose an embedding model to use for creation of embeddings or preferred search type. Embeddings are representations of values or objects (like text, images, and audio) that are designed to be consumed by machine learning models and semantic search algorithms.
+  Also in this stage, you can choose an embedding model to use for creation of embeddings or preferred search type. Embeddings are representations of values or objects (like text, images, and audio) that are designed to be consumed by machine learning models and semantic search algorithms.
 
-  The output of this process is an index that will later be used to retrieve documents from during inference.
+  The output of this process is an index that will later be used for retrieving documents during inference.
 
-* **Inferencing**: This is the stage where users chat with their data by using a studio, deployed web app, or direct API calls. In this stage, users can set various model parameters (such as `temperature`, or `top_P` ) and system parameters (such as `strictness` and `topNDocuments`).
+* **Inferencing**: This is the stage where users chat with their data by using a studio, a deployed web app, or direct API calls. In this stage, users can set various model parameters (such as `temperature` and `top_P`) and system parameters (such as `strictness` and `topNDocuments`). A sketch of such a request appears after the numbered steps below.
 
 Think of ingestion as a separate process before inferencing. After the index is created, Azure OpenAI On Your Data goes through the following steps to generate a good response to user questions:
 
 1. **Intent generation**: Azure OpenAI On Your Data generates multiple search intents by using user questions and conversation history. It generates multiple search intents to address any ambiguity in the users' questions, add more context by using the conversation history to retrieve holistic information in the retrieval stage, and provide any additional information to make the final response thorough and useful.
 2. **Retrieval**: By using the search type provided during the ingestion, Azure OpenAI On Your Data retrieves a list of relevant document chunks that correspond to each of the search intents.
 3. **Filtration**: Azure OpenAI On Your Data uses the strictness setting to filter out the retrieved documents that are considered irrelevant according to the strictness threshold. The `strictness` parameter controls how aggressive the filtration is.
 4. **Re-ranking**: Azure OpenAI On Your Data re-ranks the remaining document chunks retrieved for each of the search intents. The purpose of re-ranking is to produce a combined list of the most relevant documents retrieved for all search intents.
-5. **TopNDocuments**: The `topNDocuments` parameter from this reranked list is included in the prompt sent to the model, along with the question, the conversation history, and the role information/system message.
-6. **Response Generation**: The model uses the provided context to generate the final response along with citations.
+5. **Parameter inclusion**: The `topNDocuments` parameter from the re-ranked list is included in the prompt sent to the model, along with the question, the conversation history, and the role information or system message.
+6. **Response generation**: The model uses the provided context to generate the final response along with citations.
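For reference during debugging, here's a minimal sketch of an inferencing request that sets both kinds of parameters. It's a sketch under assumptions, not the definitive call: it assumes the `2023-08-01-preview` extensions API shape (newer API versions use `data_sources` and snake_case parameter names), and the endpoint, keys, deployment, and index values are placeholders.

```python
# Minimal sketch of an Azure OpenAI On Your Data inferencing call.
# Assumes the 2023-08-01-preview extensions API; all resource names,
# keys, and values here are placeholders.
import requests

AOAI_ENDPOINT = "https://contoso.openai.azure.com"  # placeholder
DEPLOYMENT = "gpt-4"                                # placeholder

response = requests.post(
    f"{AOAI_ENDPOINT}/openai/deployments/{DEPLOYMENT}/extensions/chat/completions",
    params={"api-version": "2023-08-01-preview"},
    headers={"api-key": "<your-azure-openai-key>"},
    json={
        # Model parameters
        "temperature": 0,
        "top_p": 1.0,  # the REST spelling of the top_P setting mentioned above
        "messages": [
            {"role": "user", "content": "What does our travel policy cover?"}
        ],
        # System parameters are set on the data source
        "dataSources": [
            {
                "type": "AzureCognitiveSearch",
                "parameters": {
                    "endpoint": "https://contoso-search.search.windows.net",
                    "key": "<your-search-admin-key>",
                    "indexName": "travel-policy-index",  # placeholder
                    "queryType": "semantic",
                    "strictness": 3,
                    "topNDocuments": 5,
                },
            }
        ],
    },
)
response.raise_for_status()
# In this API version, choices[0].messages holds a tool message (retrieval
# results) followed by the assistant message (the final answer).
print(response.json()["choices"][0]["messages"][-1]["content"])
```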

 ## How to structure debugging investigation

@@ -45,7 +45,7 @@ When you see an unfavorable response to a query, it might be the result of diffe
 
 Use the REST API to check if the correct document chunks are present in the retrieved documents. In the API response, check the citations in the `tool` message.
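As an illustration, here's a minimal sketch of that check. It assumes the `2023-08-01-preview` extensions API (where retrieval results arrive as a JSON string in a `tool` role message) and reuses the `response` object from the earlier request sketch:

```python
# Minimal sketch: list the citations that retrieval returned, so you can
# confirm the expected document chunks are present. Assumes the
# 2023-08-01-preview extensions API and the "response" from the earlier sketch.
import json

for message in response.json()["choices"][0]["messages"]:
    if message["role"] == "tool":
        tool_payload = json.loads(message["content"])
        for i, citation in enumerate(tool_payload.get("citations", [])):
            print(i, citation.get("filepath"), (citation.get("content") or "")[:120])
```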

-### Step 2: Check for generation problem
+### Step 2: Check for generation problems
 
 If the correct document chunks appear in the retrieved documents, you're likely encountering a problem with content generation. Consider using a more powerful model through one of these methods:

@@ -56,76 +56,78 @@ You can also tune the finer aspects of the response by changing the role informa
 
 ### Step 3: Check the rest of the funnel
 
-If the correct document chunks don't appear in the retrieved documents, you need to dig further down the funnel:
+If the correct document chunks don't appear in the retrieved documents, you need to dig farther down the funnel:
 
-* It's possible that a correct document chunk was retrieved but was filtered out based on `strictness`. In this case, try reducing the `strictness` parameter.
+* It's possible that a correct document chunk was retrieved but was filtered out based on strictness. In this case, try reducing the `strictness` parameter.
 
-* It's possible that a correct document chunk wasn't part of the `topNDocuments` paramater. In this case, increase the parameter.
+* It's possible that a correct document chunk wasn't part of the `topNDocuments` parameter. In this case, increase the parameter.
 
-* It's possible that your index fields are not correctly mapped, so retrieval might not work well. This mapping is particularly relevant if you're using a pre-existing data source (that is, you didn't create the index by using the studio or offline scripts available on [GitHub](https://github.com/microsoft/sample-app-aoai-chatGPT/tree/main/scripts). For more information on mapping index fields, see the [how-to article](../concepts/use-your-data.md?tabs=ai-search#index-field-mapping).
+* It's possible that your index fields are incorrectly mapped, so retrieval might not work well. This mapping is particularly relevant if you're using a pre-existing data source. (That is, you didn't create the index by using the studio or offline scripts available on [GitHub](https://github.com/microsoft/sample-app-aoai-chatGPT/tree/main/scripts).) For more information on mapping index fields, see the [how-to article](../concepts/use-your-data.md?tabs=ai-search#index-field-mapping).
 
 * It's possible that the intent generation step isn't working well. In the API response, check the `intents` field in the `tool` message. (A sketch of this check appears after this list.)
 
-  Some models are known to not work well for intent generation. For example, if you're using the `GPT-35-turbo-1106` model version, consider using a later model, such as `gpt-35-turbo` (0125) or `GPT-4-1106-preview`.
+  Some models don't work well for intent generation. For example, if you're using the `GPT-35-turbo-1106` model version, consider using a later model, such as `gpt-35-turbo` (0125) or `GPT-4-1106-preview`.
 
-* Do you have semistructured data in your documents, such as numerous tables? There might be an ingestion problem. Your data might need special handling during ingestion:
+* Do you have semistructured data in your documents, such as numerous tables? There might be an ingestion problem. Your data might need special handling during ingestion.
 
-  * If the file format is PDF, we offer optimized ingestion for tables using the offline scripts available on [GitHub](https://github.com/microsoft/sample-app-aoai-chatGPT/tree/main/scripts). to use the scripts, you need to have a [Document Intelligence](../../document-intelligence/overview.md) resource and use the `Layout` [model](../../document-intelligence/concept-layout.md). You can also:
-  * Adjust your chunk size to make sure your largest table fits within the specified [chunk size](../concepts/use-your-data.md#chunk-size-preview).
+  * If the file format is PDF, we offer optimized ingestion for tables by using the offline scripts available on [GitHub](https://github.com/microsoft/sample-app-aoai-chatGPT/tree/main/scripts). To use the scripts, you need to have a [Document Intelligence](../../document-intelligence/overview.md) resource and use the [layout model](../../document-intelligence/concept-layout.md).
+
+  * You can adjust your [chunk size](../concepts/use-your-data.md#chunk-size-preview) to make sure your largest table fits within it.
 
-* Are you converting a semistructured data type such as json/xml to a PDF document? This might cause an **ingestion issue** because structured information needs a chunking strategy that is different from purely text content.
+* Are you converting a semistructured data type, such as JSON or XML, to a PDF document? This conversion might cause an ingestion problem because structured information needs a chunking strategy that's different from purely text content.
 
-* If none of the above apply, you might be encountering a **retrieval issue**. Consider using a more powerful `query_type`. Based on our benchmarking, `semantic` and `vectorSemanticHybrid` are preferred.
+* If none of the preceding items apply, you might be encountering a retrieval problem. Consider using a more powerful `query_type` value. Based on our benchmarking, `semantic` and `vectorSemanticHybrid` are preferred.
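Here's a minimal sketch of the intent check mentioned in the list above. It again assumes the `2023-08-01-preview` extensions API and the `response` object from the earlier sketch; the exact field name for intents varies by API version, so inspect your raw `tool` message if the key below doesn't match.

```python
# Minimal sketch: inspect the generated search intents in the tool message.
# Assumes the 2023-08-01-preview extensions API; the "intent" key is an
# assumption and can differ by API version.
import json

tool_message = next(
    m for m in response.json()["choices"][0]["messages"] if m["role"] == "tool"
)
tool_payload = json.loads(tool_message["content"])
print(tool_payload.get("intent"))  # the search intents derived from the question
```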

 ## Common problems

-**Issue 1**: _The model responds with "The requested information isn't present in the retrieved documents. Please try a different query or topic" even though that's not the case._
+The following sections list possible solutions to problems that you might encounter when you're developing a solution by using Azure OpenAI Service On Your Data.
 
-See [Step 3](#step-3-check-the-rest-of-the-funnel) in the above debugging process.
+### The information is correct, but the model responds with "The requested information isn't present in the retrieved documents. Please try a different query or topic."
 
-**Issue 2**: _The response is from my data, but it isn't relevant/correct in the context of the question._
+See [step 3](#step-3-check-the-rest-of-the-funnel) in the preceding debugging process.
 
-See the debugging process starting at [Step 1](#step-1-check-for-retrieval-issues).
+### A response is from your data, but it isn't relevant or correct in the context of the question
 
-**Issue 3**: _The role information / system message isn't being followed by the model._
+See the preceding debugging process, starting at [step 1](#step-1-check-for-retrieval-problems).
 
-* Instructions in the role information might contradict with our [Responsible AI guidelines](/legal/cognitive-services/openai/overview?context=%2Fazure%2Fai-services%2Fopenai%2Fcontext%2Fcontext), in which case it won't likely be followed.
+### The model isn't following the role information or system message
 
-* For each model, there is an implicit token limit for the role information, beyond which it is truncated. Ensure your role information follows the established [limits](../concepts/use-your-data.md#token-usage-estimation-for-azure-openai-on-your-data).
+* Make sure that instructions in the role information are consistent with the [Responsible AI guidelines](/legal/cognitive-services/openai/overview?context=%2Fazure%2Fai-services%2Fopenai%2Fcontext%2Fcontext). The model likely won't follow role information if it contradicts those guidelines.
 
-* A prompt engineering technique you can use is to repeat an important instruction at the end of the prompt. Surrounding the important instruction with `**` on both side of it can also help.
+* Ensure that your role information follows the [established limits](../concepts/use-your-data.md#token-usage-estimation-for-azure-openai-on-your-data) for it. Each model has an implicit token limit for the role information. Beyond that limit, the information is truncated.
 
-* Upgrade to a newer GPT-4 model as it's likely to follow your instructions better than GPT-35.
+* Use the prompt engineering technique of repeating an important instruction at the end of the prompt. Putting a double asterisk (`**`) on both sides of the important information can also help. (See the sketch after this list.)
 
-**Issue 4**: _There are inconsistencies in responses._
+* Upgrade to a newer GPT-4 model, because it's likely to follow your instructions better than GPT-3.5.
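For example, here's an illustrative sketch of role information that repeats the critical instruction at the end and emphasizes it; the wording is a placeholder, not prescribed text:

```python
# Illustrative only: repeat the important instruction at the end of the role
# information and surround it with ** on both sides.
role_information = (
    "You are an assistant for Contoso travel policies. "
    "Answer by using only the retrieved documents. "
    "If the answer isn't in the documents, say that you don't know.\n\n"
    "**Answer by using only the retrieved documents.**"
)
```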

-* Ensure you're using a low `temperature`. We recommend setting it to `0`.
+### Responses have inconsistencies
 
-* Although the question is the same, the conversation history gets added to the context and affects how the model responds to same question over a long session.
+* Ensure that you're using a low `temperature` value. We recommend setting it to `0`.
 
-* Using the REST API, check if the search intents generated are the same both times or not. If they are very different, try a more powerful model such as GPT-4 to see if the problem is affected by the chosen model.
+* By using the REST API, check if the generated search intents are the same both times. If the intents are different, try a more powerful model such as GPT-4 to see if the chosen model affects the problem. If the intents are the same or similar, try reducing `strictness` or increasing `topNDocuments`.
 
-* If the intents are same or similar, try reducing `strictness` or increasing `topNDocuments`.
+> [!NOTE]
+> Although the question is the same, the conversation history is added to the context and affects how the model responds to the same question over a long session.
 
-**Issue 5**: _Intents are empty or wrong._
+### Intents are empty or wrong
 
-* Refer to [Step 3](#step-3-check-the-rest-of-the-funnel) in the above debugging process.
+* Refer to [step 3](#step-3-check-the-rest-of-the-funnel) in the preceding debugging process.
 
-* If intents are irrelevant, the issue might be that the intent generation step lacks context. It only considers the user question and conversation history. It does not look at the role information or the document chunks. You might want to consider adding a prefix to each user question with a short context string to help the intent generation step.
+* If intents are irrelevant, the problem might be that the intent generation step lacks context. Intent generation considers only the user question and conversation history. It doesn't consider the role information or the document chunks. You might consider adding a prefix to each user question with a short context string to help the intent generation step, as shown in the sketch after this list.
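For example (the prefix wording is illustrative only):

```python
# Illustrative only: prepend a short context string to each user question to
# give the intent generation step more context to work with.
context_prefix = "Question about Contoso's internal travel policy: "
user_question = "How many vacation days do I get?"
messages = [{"role": "user", "content": context_prefix + user_question}]
```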

-**Issue 6**: _I have set inScope=true or checked "Restrict responses to my data" but it still responds to Out-Of-Domain questions._
+### You set inScope=true or selected the checkbox for restricting responses to data, but the model still responds to out-of-domain questions
 
 * Consider increasing `strictness`.
 
-* Add the following instruction in your role information / system message:
+* Add the following instruction in your role information or system message:
 
-  "You are also allowed to respond to questions based on the retrieved documents."
-* The `inscope` parameter isn't a hard switch, but setting it to `true` encourages the model to stay restricted.
+  `You are also allowed to respond to questions based on the retrieved documents.`
+
+* Set the `inScope` parameter to `true`. The parameter isn't a hard switch, but setting it to `true` encourages the model to stay restricted. (See the sketch after this list.)
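Here's a minimal sketch that combines these mitigations in the data source parameters, again assuming the `2023-08-01-preview` extensions API; all values are placeholders:

```python
# Illustrative sketch: encourage in-domain answers. Assumes the
# 2023-08-01-preview extensions API; all values are placeholders.
data_source_parameters = {
    "endpoint": "https://contoso-search.search.windows.net",
    "key": "<your-search-admin-key>",
    "indexName": "travel-policy-index",
    "inScope": True,   # encourages, but doesn't guarantee, restriction
    "strictness": 4,   # raise to filter out-of-domain chunks more aggressively
    "roleInformation": (
        "You are a Contoso travel-policy assistant. "
        "You are also allowed to respond to questions based on the retrieved documents."
    ),
}
```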

-**Issue 7**: _The response is correct but occasionally missing document references/citations._
+### A response is correct but is occasionally missing document references or citations
 
 * Consider upgrading to a GPT-4 model if you're not already using it. GPT-4 is generally more consistent with citation generation.
 
-* You can try to emphasize citation generation in the response by adding `**You must generate citation based on the retrieved documents in the response**` in the role information.
+* Try to emphasize citation generation in the response by adding `You must generate citation based on the retrieved documents in the response` in the role information.
 
-* Or you can add a prefix in the user query `**You must generate citation to the retrieved documents in the response to the user question \n User Question: {actual user question}**`
+* Try adding a prefix in the user query: `You must generate citation to the retrieved documents in the response to the user question \n User Question: {actual user question}`.
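As an illustration, here's how that prefix might be applied programmatically. The helper function is hypothetical; the instruction text comes from the tip above:

```python
# Illustrative only: prepend the citation instruction from the tip above to
# the user question before sending it. The helper name is hypothetical.
def with_citation_prefix(user_question: str) -> str:
    return (
        "You must generate citation to the retrieved documents in the "
        "response to the user question \n User Question: " + user_question
    )

messages = [
    {"role": "user", "content": with_citation_prefix("What does our travel policy cover?")}
]
```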
