articles/search/search-get-started-retrieval-augmented-generation.md
+7 -9 (7 additions & 9 deletions)
@@ -97,15 +97,13 @@ In this quickstart:
 
 ## Chat with your data
 
-The playground gives you options for configuring and monitoring chat.
+The playground gives you options for configuring and monitoring chat. On the right, model configuration determines which model formulates an answer using the search results from Azure AI Search. The input token progress indicator keeps track of the token count of the question you submit.
 
-On the right, model configuration determines which model formulates an answer using the search results from Azure AI Search. The input token progress indicator keeps track of the token count of the question you submit.
+Advanced settings on the left determine how much flexibility the chat model has in supplementing the grounding data, and how many chunks are provided to the model to generate its response.
 
-On the left, advanced settings determine how much flexibility the chat model has in supplementing the grounding data, and how many chunks are provided to the model to generate its response. Strictness determines whether the model supplements its own information with the query response from Azure AI Search.
++ Strictness level 5 means no supplementation. Only your grounding data is used, which means the search engine plays a large role in the quality of the response. Semantic ranking can be helpful in this scenario because the ranking models do a better job of inferring the intent of the query. Lower levels of strictness produce more verbose answers, but might also include information that isn't in your index.
 
-+ 5 means no supplementation. Only your grounding data is used, which means the search engine plays a large role in the quality of the response. Semantic ranking can be helpful in this scenario because the ranking models do a better job of inferring the intent of the query.
-
-+ Lower levels of strictness produce more verbose answers, but might also include information that isn't in your index.
++ Retrieved documents are the number of matching search results used to answer the question. It's capped at 20 to minimize latency and to stay under the model input limits.
 
 :::image type="content" source="media/search-get-started-rag/azure-openai-studio-advanced-settings.png" alt-text="Screenshot of the advanced settings.":::
 
@@ -119,17 +117,17 @@ On the left, advanced settings determine how much flexibility the chat model has
 
 More complex queries, such as "why was Gettysburg important", perform better if the model has some latitude to answer (lower levels of strictness) or if semantic ranking is enabled.
 
-Queries that require deeper analysis or language understanding, such as "how many speeches are in the vector store" or "what's in this vector store", will probably fail to return a response. In RAG pattern chat scenarios, information retrieval is keyword and similarity search against the query string, where the search engine looks for chunks having exact or similar terms, phrases, or construction. The return payload might be insufficient for handling an open-ended question.
+Queries that require deeper analysis or language understanding, such as "how many speeches are in the vector store", will probably fail. Remember that the search engine looks for chunks having exact or similar terms, phrases, or construction to the query. And while the model might understand the question, if search results are chunks from speeches, it's not the right information to answer that kind of question.
 
-Finally, chats are constrained by the number of documents (chunks) returned in the response (limited to 3-20 in Azure OpenAI Studio playground). As you can imagine, posing a question about "all of the titles" requires a full scan of the entire vector store, which means adopting an approach that allows more than 20 chunks. You could modify the generated code (assuming you [deploy the solution](/azure/ai-services/openai/use-your-data-quickstart#deploy-your-model)) to allow for [service-side exhaustive search](vector-search-how-to-create-index.md#add-a-vector-search-configuration) on your queries.
+Finally, chats are constrained by the number of documents (chunks) returned in the response (limited to 3-20 in Azure OpenAI Studio playground). As you can imagine, posing a question about "all of the titles" requires a full scan of the entire vector store. You could modify the generated code (assuming you [deploy the solution](/azure/ai-services/openai/use-your-data-quickstart#deploy-your-model)) to allow for [service-side exhaustive search](vector-search-how-to-create-index.md#add-a-vector-search-configuration) on your queries.
 
 :::image type="content" source="media/search-get-started-rag/chat-results.png" lightbox="media/search-get-started-rag/chat-results.png" alt-text="Screenshot of a chat session.":::
 
 ## Next steps
 
 In the playground, it's easy to start over with different data and configurations and compare the results. If you didn't try **Hybrid + semantic** the first time, perhaps try again with [semantic ranking enabled](semantic-how-to-enable-disable.md).
 
-We also providecode samples that demonstrate the full range of APIs for RAG applications. Samples are available in [Python](https://github.com/Azure/azure-search-vector-samples/tree/main/demo-python), [C#](https://github.com/Azure/azure-search-vector-samples/tree/main/demo-dotnet), and [JavaScript](https://github.com/Azure/azure-search-vector-samples/tree/main/demo-javascript).
+If you need customization and tuning that the playground can't provide, take a look at code samples that demonstrate the full range of APIs for RAG applications based on Azure AI Search. Samples are available in [Python](https://github.com/Azure/azure-search-vector-samples/tree/main/demo-python), [C#](https://github.com/Azure/azure-search-vector-samples/tree/main/demo-dotnet), and [JavaScript](https://github.com/Azure/azure-search-vector-samples/tree/main/demo-javascript).
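
The strictness and retrieved-documents settings described in the revised text would likely carry over as request parameters once you move from the playground to code. The following is a minimal sketch, not the generated solution code. It assumes the data source entry uses the field names `type`, `parameters`, `endpoint`, `index_name`, `strictness`, and `top_n_documents`, which should be checked against the current Azure OpenAI API reference. The helper `build_data_source`, the example endpoint, the index name, and the default values are hypothetical. The upper strictness bound of 5 and the 3-20 document range come from the text above; the lower strictness bound of 1 is an assumption.

```python
import json

# Bounds taken from the article text where possible:
# strictness 5 is the strictest level, and retrieved documents are limited to 3-20.
# The lower strictness bound of 1 is an assumption.
STRICTNESS_RANGE = (1, 5)
TOP_N_RANGE = (3, 20)


def clamp(value, bounds):
    """Restrict value to the inclusive (low, high) range."""
    low, high = bounds
    return max(low, min(high, value))


def build_data_source(endpoint, index_name, strictness=3, top_n_documents=5):
    """Build a hypothetical Azure AI Search data source entry for a chat request.

    Field names are assumptions to verify against the API reference.
    Out-of-range values are clamped rather than rejected.
    """
    return {
        "type": "azure_search",
        "parameters": {
            "endpoint": endpoint,
            "index_name": index_name,
            "strictness": clamp(strictness, STRICTNESS_RANGE),
            "top_n_documents": clamp(top_n_documents, TOP_N_RANGE),
        },
    }


if __name__ == "__main__":
    # Placeholder endpoint and index name; requesting 50 documents is clamped to 20.
    source = build_data_source(
        "https://example.search.windows.net",
        "speeches-index",
        strictness=5,
        top_n_documents=50,
    )
    print(json.dumps(source, indent=2))
```

Clamping keeps the sketch inside the limits the article describes. A question that needs more than 20 chunks, such as "all of the titles", still calls for a different retrieval approach, as the text notes.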