| 1 | +# RAG chat: Using agentic retrieval |
| 2 | + |
| 3 | +This repository includes an optional feature that uses agentic retrieval to find the most relevant content given a user's conversation history. |
| 4 | + |
| 5 | +## Using the feature |
| 6 | + |
| 7 | +### Supported Models |
| 8 | + |
| 9 | +See the [agentic retrieval documentation](https://learn.microsoft.com/en-us/azure/search/search-agentic-retrieval-how-to-create) for the list of supported models. |
| 10 | + |
| 11 | +### Prerequisites |
| 12 | + |
| 13 | +* A deployment of any of the supported agentic retrieval models in one of the [supported regions](https://learn.microsoft.com/azure/ai-services/openai/concepts/models#standard-deployment-model-availability). If you're not sure whether a model is available to you, try creating a gpt-4o-mini deployment from your Azure OpenAI deployments page. |
| 14 | + |
| 15 | +### Deployment |
| 16 | + |
| 17 | +1. **Enable agentic retrieval:** |
| 18 | + |
| 19 | + Set the `USE_AGENTIC_RETRIEVAL` azd environment variable to turn the feature on: |
| 20 | + |
| 21 | + |
| 22 | + ```shell |
| 23 | + azd env set USE_AGENTIC_RETRIEVAL true |
| 24 | + ``` |
| 25 | + |
| 26 | +2. **(Optional) Set the agentic retrieval model:** |
| 27 | + |
| 28 | + You can configure which model agentic retrieval uses. By default, gpt-4o-mini is used. |
| 29 | + |
| 30 | + For gpt-4o: |
| 31 | + ```shell |
| 32 | + azd env set AZURE_OPENAI_SEARCHAGENT_DEPLOYMENT searchagent |
| 33 | + azd env set AZURE_OPENAI_SEARCHAGENT_MODEL gpt-4o |
| 34 | + azd env set AZURE_OPENAI_SEARCHAGENT_MODEL_VERSION 2024-11-20 |
| 35 | + ``` |
| 36 | + |
| 37 | +3. **Update the infrastructure and application:** |
| 38 | + |
| 39 | + Run `azd up` to provision the infrastructure changes (only the new model deployment, if you ran `azd up` previously) and deploy the application code with the updated environment variables. |
| 40 | + |
| 41 | +4. **Try out the feature:** |
| 42 | + |
| 43 | + Open the web app and start a new chat. Agentic retrieval will be used to find all sources. |
| 44 | + |
| 45 | +5. **Experiment with max subqueries:** |
| 46 | + |
| 47 | + Select the developer options in the web app and change max subqueries to any value between 1 and 20. This controls the maximum number of subqueries that can be created in the query plan. |
| 48 | + |
| 49 | +  |
| 50 | + |
| 51 | +6. **Review the query plan:** |
| 52 | + |
| 53 | + Agentic retrieval uses additional billed tokens behind the scenes for the planning process. |
| 54 | + To see the token usage, select the lightbulb icon on a chat answer. This opens the "Thought process" tab, which shows the number of tokens used by the planning process and the queries it produced. |
| 55 | + |
| 56 | +  |
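
The configuration part of the deployment steps above (steps 1 and 2) can be collected into a small shell helper. This is only a sketch: the function name `configure_agentic_retrieval` and its two optional arguments are hypothetical and not part of the repository. It uses only the `azd env set` commands and variable names shown in the steps, and it assumes `azd` is installed and an azd environment is already selected.

```shell
# Hypothetical helper (not part of the repository): applies the azd
# environment settings from steps 1 and 2 in one call.
# Usage: configure_agentic_retrieval [MODEL [MODEL_VERSION]]
configure_agentic_retrieval() {
  # Both arguments are optional; empty means "keep the default model".
  local model="${1:-}"
  local version="${2:-}"

  # Step 1: enable agentic retrieval.
  azd env set USE_AGENTIC_RETRIEVAL true

  # Step 2 (optional): override the default model only when one is given.
  if [ -n "$model" ]; then
    azd env set AZURE_OPENAI_SEARCHAGENT_DEPLOYMENT searchagent
    azd env set AZURE_OPENAI_SEARCHAGENT_MODEL "$model"
    if [ -n "$version" ]; then
      azd env set AZURE_OPENAI_SEARCHAGENT_MODEL_VERSION "$version"
    fi
  fi
}
```

With the function defined, running `configure_agentic_retrieval gpt-4o 2024-11-20` and then `azd up` would mirror steps 1 to 3. Calling it with no arguments only enables the feature and leaves the default model in place.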