You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/search/search-agentic-retrieval-concept.md
+6-4Lines changed: 6 additions & 4 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -56,11 +56,11 @@ Agentic retrieval has these components:
56
56
|-----------|----------|-------|
57
57
| LLM (gpt-4o and gpt-4.1 series) | Azure OpenAI | Formulates subqueries for the query plan. You can use these models for other downstream operations. Specifically, you can send the unified response string to one of these models and ask it ground its answer on the string. |
58
58
| Search index | Azure AI Search | Contains plain text and vector content, a semantic configuration, and other elements as needed. |
59
-
|Agent | Azure AI Search | Connects to your model, providing parameters and inputs to build a query plan. |
59
+
|Search agent | Azure AI Search | Connects to your LLM, providing parameters and inputs to build a query plan. |
60
60
| Retrieval engine | Azure AI Search | Executes on the LLM-generated query plan and other parameters, returning a rich response that includes content and query plan metadata. Queries are keyword, vector, and hybrid. Results are merged and ranked. |
61
61
| Semantic ranker | Azure AI Search | Provides L2 reranking, promoting the most relevant matches. Semantic ranker is required for agentic retrieval. |
62
62
63
-
Your solution should include a tool or app that drives the pipeline. An agentic retrieval pipeline concludes with the response object that provides grounding data. Your solution should handle the response, including passing it to an LLM to generate an answer, which you render inline in the user conversation. For more information about this step, see [Build an agent-to-agent retrieval solution](search-agentic-retrieval-how-to-pipeline.md).
63
+
Your solution should include a tool or app that drives the pipeline. An agentic retrieval pipeline concludes with the response object that provides grounding data. Your solution should take it from there, handling the response by passing it to an LLM to generate an answer, which you render inline in the user conversation. For more information about this step, see [Build an agent-to-agent retrieval solution](search-agentic-retrieval-how-to-pipeline.md).
64
64
65
65
<!-- Insert multiquery pipeline diagram here -->
66
66
Agentic retrieval has these processes:
@@ -143,15 +143,17 @@ You must use the REST APIs or a prerelease Azure SDK page that provides the func
143
143
Choose any of these options for your next step.
144
144
145
145
<!-- + Watch this demo. -->
146
-
+ Quickstart. Learn the basic workflow using sample data and a prepared index and queries.
146
+
+[Quickstart](search-get-started-agentic-retrieval.md). Learn the basic workflow using sample data and a prepared index and queries.
147
+
148
+
+[(Sample code) Build an agentic retrieval pipeline using Azure AI Search and Azure AI Agent in the Foundry portal](https://github.com/Azure-Samples/azure-search-python-samples/agent-example)
147
149
148
150
+ How-to guides for a closer look at building an agentic retrieval pipeline:
149
151
150
152
+[Create an agent](search-agentic-retrieval-how-to-create.md)
151
153
+[Use an agent to retrieve data](search-agentic-retrieval-how-to-retrieve.md)
152
154
+[Build an agent-to-agent retrieval solution](search-agentic-retrieval-how-to-pipeline.md).
153
155
154
-
+ REST API reference, Agents.
156
+
+ REST API reference, [Agents](/rest/api/searchservice/knowledge-agents/create-or-update?view=rest-searchservice-2025-05-01-preview&preserve-view=true) and [retrieve](/rest/api/searchservice/knowledge-retrieval/retrieve?view=rest-searchservice-2025-05-01-preview&preserve-view=true).
155
157
156
158
+[Azure OpenAI Demo](https://github.com/Azure-Samples/azure-search-openai-demo), updated to use agentic retrieval.
This article describes an approach or pattern for building a solution that uses Azure AI Search for data retrieval and how to integrate the retrieval into a custom solution.
18
+
This article describes an approach or pattern for building a solution that uses Azure AI Search for data retrieval and how to integrate the retrieval into a custom solution that includes Azure AI Agent.
19
+
20
+
This article supports the [agent-example](https://github.com/Azure-Samples/azure-search-python-samples/agent-example
21
+
) Python sample on GitHub.
22
+
23
+
This exercise differs from the [Agentic Retrieval Quickstart](search-get-started-agentic-retrieval.md) in how it uses Azure AI Agent to determine whether to retrieve data from the index, and how it uses an agent tool for orchestration.
24
+
25
+
## Prerequisites
26
+
27
+
The following resources are required for this design pattern:
28
+
29
+
+ Azure AI Search, basic tier or higher, in a [region that provides semantic ranker](search-region-support.md).
30
+
31
+
+ A search index that satisfies the [index criteria for agentic retrieval](search-agentic-retrieval-how-to-index.md).
32
+
33
+
+ Azure OpenAI, and you should have an **Azure AI Developer** role assignment to create a Foundry project.
34
+
35
+
+ A project in Azure AI Foundry, with a deployment of a supported large language model and an Azure AI Agent in a basic setup. To meet this requirement, follow the steps in [Quickstart: Create a new agent (Preview)](/azure/ai-services/agents/quickstart?pivots=ai-foundry-portal). We recommend 100,000 token capacity for your model. You can find capacity and the rate limit in the model deployments list in the Azure AI Foundry portal.
36
+
37
+
### Supported large language models
38
+
39
+
Use Azure OpenAI or an equivalent open source model:
40
+
41
+
+`gpt-4o`
42
+
+`gpt-4o-mini`
43
+
+`gpt-4.1`
44
+
+`gpt-4.1-nano`
45
+
+`gpt-4.1-mini`
19
46
20
47
## Development tasks
21
48
22
49
Development tasks on the Azure AI Search side include:
23
50
24
-
+ Create an agent on Azure AI Search that maps to your deployed model in Azure AI Foundry Model.
51
+
+ Create a search agent on Azure AI Search that maps to your deployed model in Azure AI Foundry Model.
25
52
+ Call the retriever and provide a query, conversation, and override parameters.
26
-
+ Parse the response for the parts you want to include in your chat application. For many scenarios, just the content portion of the response is sufficient.
53
+
+ Parse the response for the parts you want to include in your chat application. For many scenarios, just the content portion of the response is sufficient.
27
54
28
55
## Components of the solution
29
56
30
-
Your custom application makes API calls to Azure AI Search and Azure SDK.
57
+
Your custom application makes API calls to Azure AI Search and an Azure SDK.
31
58
32
59
+ External data from anywhere
33
60
+ Azure AI Search, hosting indexed data and the agentic data retrieval engine
34
61
+ Azure AI Foundry Model, providing a chat model (an LLM) for user interaction
35
62
+ Azure SDK with a Foundry project, providing programmatic access to chat and chat history
63
+
+ Azure AI Agent, with an agent for handling the conversation, and a tool for orchestration
64
+
65
+
## How to customize grounding data
36
66
37
-
Agentic retrieval on Azure AI Search calls:
67
+
Search results are consolidating into a large unified string that you can pass to a conversational language model for a grounded answer. The following indexing and relevance tuning features in Azure AI Search are available to help you generate high quality results:
38
68
39
-
+LLM on Azure AI Foundry Model for query planning
69
+
+Scoring profiles (added to your search index) provide built-in boosting criteria. Your index must specify a default scoring profile, and that's the one used by the retrieval engine when queries include fields associated with that profile.
40
70
41
-
<!-- ## Setting up Azure AI Agent service
71
+
+ Semantic configuration is required, but you determine which fields are prioritized and used for ranking.
42
72
43
-
This step includes the basics for setting up. Link to their docs.
73
+
+ For plain text content, you can use analyzers to control tokenization during indexing.
44
74
45
-
## Setting up an Azure AI agent
75
+
+ For multimodal or image content, you can use image verbalization for LLM-generated descriptions of your images, or classic OCR and image analysis via skillsets during indexing.
46
76
47
-
How to create a tool that connects to agent to agentic retrieval.
77
+
## Create the project
48
78
49
-
## Running your Azure AI agent
50
-
-->
51
-
<!--
52
-
### How to customize grounding data
79
+
The canonical use case for agentic retrieval is through the Azure AI Agent service. We recommend it because it's the easiest way to create a chatbot.
53
80
54
-
include reference data brings back retrievable index data. Similar to classic search. customizable.
81
+
An agent-to-agent solution combines Azure AI Search with Foundry projects that you use to build custom agents. An agent simplifies development by tracking conversation history and calling other tools.
55
82
56
-
response.content output is semantic fields and semantic config determines output.
83
+
You need endpoints for:
57
84
58
-
## Create the project
85
+
+ Azure AI Search
86
+
+ Azure OpenAI
87
+
+ Azure AI Foundry project
59
88
60
-
The canonical use case for agentic retrieval is through the Agent service. We recommend it because it's the easiest way to create a chatbot.
89
+
You can find endpoints for Azure AI Search and Azure OpenAI in the [Azure portal](https://portal.azure.com).
61
90
62
-
An agent-to-agent solution combines Azure AI Search with Foundry projects that you use to build custom agents. An agent service handles a lot of common problems,such as tracking conversation history and calling other tools.
91
+
You can find the project connection string in the Azure AI Foundry portal:
63
92
64
-
### Order of operations
93
+
1. Sign in to the [Azure AI Foundry portal](https://ai.azure.com) and open your project.
65
94
66
-
1. Call this.
67
-
1. Call that.
68
-
1. Pass the content string from the agent to the chat model. You shouldn't need to parse or serialize the string.
95
+
1. In the **Project details** tile, find and copy the **Project connection string**.
69
96
70
-
## Tips for improving performance
97
+
A hypothetical connection string might look like this: `eastus2.api.azureml.ms;00000000-0000-0000-0000-0000000000;rg-my-resource-group-name;my-foundry-project-name`
98
+
99
+
1. Check the authentication type for your Azure OpenAI resource and make sure it uses an API key shared to all projects. Still in **Project details**, expand the **Connected resources** tile to view the authentication type for your Azure OpenAI resource.
100
+
101
+
If you don't have an Azure OpenAI resource in your Foundry project, revisit the model deployment prerequisite. A connection to the resource is created when you deploy a model.
102
+
103
+
### Add an agentic retrieval tool to AI Agent
71
104
72
-
summarizing message threads
73
-
use gpt mini
105
+
An end-to-end pipeline needs an orchestration mechanism for coordinating calls to the retriever and agent. You can use a [tool](/azure/ai-services/agents/how-to/tools/function-calling) for this task. The tool calls the Azure AI Search knowledge retrieval client and the Azure AI agent, and it drives the conversations with the user.
74
106
75
107
## How to design a prompt
76
108
77
109
The prompt sent to the LLM includes instructions for working with the grounding data, which is passed as a large single string with no serialization or structure.
78
110
79
-
What does the prompt look like
111
+
The tool or function that you use to drive the pipeline provides the instructions to the LLM for the conversation.
112
+
113
+
```python
114
+
defagentic_retrieval() -> str:
115
+
"""
116
+
Searches a NASA e-book about images of Earth at night and other science related facts.
117
+
The returned string is in a JSON format that contains the reference id.
118
+
Be sure to use the same format in your agent's response
An Q&A agent that can answer questions about the Earth at night.
146
+
Sources have a JSON format with a ref_id that must be cited in the answer.
147
+
If you do not have the answer, respond with "I don't know".
148
+
"""
149
+
agent = project_client.agents.create_agent(
150
+
model=agent_model,
151
+
name=agent_name,
152
+
instructions=instructions
153
+
)
154
+
```
80
155
81
156
## Control the number of subqueries
82
157
83
-
The LLM will determine some quantity of subqueries based on the user query and chat history.
158
+
The LLM determines the quantity of subqueries based on these factors:
84
159
85
-
You as the developer can control by setting default max docs.
160
+
+ User query
161
+
+ Chat history
162
+
+ Semantic ranker input constraints
86
163
87
-
this is verbatim but it's only partially true because it's clear the LLM is creating subqueries based on other things
88
-
The best way to control the number of subqueries that are generated is by setting the `defaultMaxDocsForReranker` in either the agent definition or as an override on the retrieve action. The semantic ranker processes up to 50 documents as an input. If you only wanted two subqueries, you could set `defaultMaxDocsForReranker` to 100.
164
+
As the developer, the best way to control the number of subqueries is by setting the `defaultMaxDocsForReranker` in either the agent definition or as an override on the retrieve action.
165
+
166
+
The semantic ranker processes up to 50 documents as an input, and the system creates subqueries to accommodate all of the inputs to semantic ranker. For example, if you only wanted two subqueries, you could set `defaultMaxDocsForReranker` to 100 to accommodate all documents in two batches.
89
167
90
168
The [semantic configuration](semantic-how-to-configure.md) in the index determines whether the input is 50 or not. If the value is less, the query plan specifies however many subqueries are necessary to meet the `defaultMaxDocsForReranker` threshold.
91
169
92
170
## Control the number of threads in chat history
93
171
94
-
An agent object in Azure AI Search acquires chat history through API calls to the Azure Evaluations SDK, which maintains the thread history. You can filter this list to get a subset of the messages, for example the last 5 conversation turns.
172
+
An agent object in Azure AI Search acquires chat history through API calls to the Azure Evaluations SDK, which maintains the thread history. You can filter this list to get a subset of the messages, for example, the last five conversation turns.
95
173
96
174
## Control costs and limit operations
97
175
98
-
Look at output tokens in the [activity array](search-agentic-retrieval-how-to-retrieve.md#review-the-activity-array). -->
176
+
Look at output tokens in the [activity array](search-agentic-retrieval-how-to-retrieve.md#review-the-activity-array) for insights into the query plan.
177
+
178
+
## Tips for improving performance
179
+
180
+
+ Summarize message threads.
181
+
182
+
+ Use `gpt mini`.
183
+
184
+
+ Set `maxOutputSize` in the [search agent](search-agentic-retrieval-how-to-create.md) to govern the size of the response, or `maxRuntimeInSeconds` for time-bound processing.
99
185
100
186
## Related content
101
187
102
188
+[Agentic retrieval in Azure AI Search](search-agentic-retrieval-concept.md)
0 commit comments