articles/search/tutorial-rag-build-solution-query.md
Results from the first query `"how much of earth is covered by water"` should look similar to the following response.

:::image type="content" source="media/tutorial-rag-solution/chat-results-1.png" alt-text="Screenshot of an LLM response to a simple question using a single match from search results.":::

It's common for LLMs to return different answers, even if the prompt and queries are unchanged. Your result might look very different from the example.

> [!NOTE]
> In testing this tutorial, we saw a variety of responses, some more relevant than others. A few times, repeating the same request caused a deterioration in the response, most likely due to confusion in the chat history, possibly with the model registering the repeated requests as dissatisfaction with the generated answer. Managing chat history is out of scope for this tutorial, but including it in your application code should mitigate or even eliminate this behavior.

## Add a filter

Recall that you created a `locations` field using applied AI, populated with places recognized by the Entity Recognition skill. The field definition for `locations` includes the `filterable` attribute. Let's repeat the previous request, this time adding a filter that selects on the term *ice* in the `locations` field.

Replace the `search_results` definition with the following example that includes a filter:

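The filtered query isn't reproduced here, but a minimal sketch might look like the following. It assumes the `SearchClient` and index created in earlier tutorial steps, and uses `search.ismatch`, Azure AI Search's OData full-text matching function, to restrict results to chunks whose `locations` field contains a term starting with *ice*. The helper name is illustrative, not from the tutorial.

```python
# OData filter: full-text match for terms starting with "ice" inside the
# filterable locations field. search.ismatch is Azure AI Search's OData
# full-text matching function.
ICE_FILTER = "search.ismatch('ice*', 'locations', 'full', 'any')"

def filtered_search(search_client, query: str):
    """Run the query restricted to chunks whose locations mention ice.

    `search_client` is assumed to be the azure.search.documents.SearchClient
    created earlier in the tutorial; this helper name is hypothetical.
    """
    return search_client.search(
        search_text=query,
        top=1,
        filter=ICE_FILTER,
        select="title, chunk, locations",
    )
```

Everything else stays the same: the filtered results are passed to the chat completion call exactly as before; only the `filter` argument changes.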
Results from the filtered query should now look similar to the following response.

:::image type="content" source="media/tutorial-rag-solution/chat-results-filter.png" alt-text="Screenshot of an LLM response after a filter is added.":::

## Change the inputs

Increasing or decreasing the number of inputs to the LLM can have a large effect on the response. Try running the same query again after setting `top=3`. When you increase the inputs, the model returns different results each time, even if the query doesn't change.
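Assuming the tutorial's client setup, the change can be sketched as a small helper; the function name is hypothetical, and only the `top` argument differs from the earlier single-result query:

```python
def get_grounding_results(search_client, query: str, top: int = 3):
    """Return up to `top` matching chunks as grounding data for the LLM.

    Raising `top` from 1 to 3 sends more source chunks to the model,
    which tends to produce broader, more expansive answers.
    """
    return search_client.search(
        search_text=query,
        top=top,
        select="title, chunk, locations",
    )
```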

Here's one example of what the model returns after increasing the inputs to 3.

Because the model is bound to just the grounding data, the answer becomes more expansive as you increase the size of the input. You can use relevance tuning to potentially generate more focused answers.

## Change the prompt

You can also change the prompt to control the format of the output, the tone, and whether you want the model to supplement the answer with its own training data.
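For example, a variant of the grounded prompt might ask for a bulleted answer and explicitly forbid outside knowledge. The wording below is illustrative, not the tutorial's; `{query}` and `{sources}` are filled in with `str.format` before the chat completion call:

```python
# Hypothetical prompt variant: requests bulleted output and explicitly
# forbids the model from drawing on its own training data. {query} and
# {sources} are placeholders filled in by str.format at request time.
GROUNDED_PROMPT = """
You are an assistant that answers questions about the Earth at night.
Answer the query below using only the provided sources, as a short bulleted list.
Do not supplement the answer with your own training data.
If the sources don't contain the answer, say you don't know.
Query: {query}
Sources:
{sources}
"""
```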