Azure-Samples
diff --git a/docs/agentic_retrieval.md b/docs/agentic_retrieval.md
Lines changed: 56 additions & 0 deletions
diff --git a/docs/deploy_features.md b/docs/deploy_features.md
Lines changed: 7 additions & 0 deletions
diff --git a/docs/images/max-subqueries.png b/docs/images/max-subqueries.png
18.1 KB
diff --git a/docs/images/query-plan.png b/docs/images/query-plan.png
40.9 KB
@@ -0,0 +1,56 @@
+# RAG chat: Using agentic retrieval
+
+This repository includes an optional feature that uses agentic retrieval to find the most relevant content given a user's conversation history.
+
+## Using the feature
+
+### Supported Models
+
+See the [agentic retrieval documentation](https://learn.microsoft.com/en-us/azure/search/search-agentic-retrieval-how-to-create) for the list of supported models.
+
+### Prerequisites
+
+* A deployment of any of the supported agentic retrieval models in one of the [supported regions](https://learn.microsoft.com/azure/ai-services/openai/concepts/models#standard-deployment-model-availability). If you're not sure whether your region is supported, try creating a gpt-4o-mini deployment from your Azure OpenAI deployments page.
+
+### Deployment
+
+1. **Enable agentic retrieval:**
+
+   Set the `USE_AGENTIC_RETRIEVAL` environment variable to `true`:
+
+
+   ```shell
+   azd env set USE_AGENTIC_RETRIEVAL true
+   ```
+
+2. **(Optional) Set the agentic retrieval model:**
+
+   You can configure which model agentic retrieval uses. By default, gpt-4o-mini is used.
+
+   For example, to use gpt-4o:
+   ```shell
+   azd env set AZURE_OPENAI_SEARCHAGENT_DEPLOYMENT searchagent
+   azd env set AZURE_OPENAI_SEARCHAGENT_MODEL gpt-4o
+   azd env set AZURE_OPENAI_SEARCHAGENT_MODEL_VERSION 2024-11-20
+   ```
+
+3. **Update the infrastructure and application:**
+
+   Run `azd up` to provision the infrastructure changes (only the new model, if you ran `azd up` previously) and deploy the application code with the updated environment variables.
+
+4. **Try out the feature:**
+
+   Open the web app and start a new chat. Agentic retrieval is now used to find the sources for every answer.
+
+5. **Experiment with max subqueries:**
+
+   Select the developer options in the web app and change max subqueries to any value between 1 and 20. This controls the maximum number of subqueries that can be created in the query plan.
+
+   ![Max subqueries screenshot](./images/max-subqueries.png)
+
+6. **Review the query plan:**
+
+   Agentic retrieval uses additional billed tokens behind the scenes for the planning process.
+   To see the token usage, select the lightbulb icon on a chat answer. This opens the "Thought process" tab, which shows the number of tokens used by the planning process and the queries it produced.
+
+   ![Thought process token usage](./images/query-plan.png)
@@ -128,6 +128,13 @@ This process does *not* delete your previous model deployment. If you want to de
 This feature allows you to use reasoning models to generate responses based on retrieved content. These models spend more time processing and understanding the user's request.
 To enable reasoning models, follow the steps in [the reasoning models guide](./reasoning.md).
 
+## Using agentic retrieval
+
+⚠️ This feature is not currently compatible with [vision integration](./gpt4v.md).
+
+This feature allows you to use [agentic retrieval](https://learn.microsoft.com/en-us/azure/search/search-agentic-retrieval-concept) in place of the Search API. To enable agentic retrieval, follow the steps in [the agentic retrieval guide](./agentic_retrieval.md).
+
+
 ## Using different embedding models
 
 By default, the deployed Azure web app uses the `text-embedding-3-large` embedding model. If you want to use a different embedding model, you can do so by following these steps: