You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/search/search-agentic-retrieval-how-to-pipeline.md
+67-18Lines changed: 67 additions & 18 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -1,11 +1,11 @@
1
1
---
2
2
title: Build an agentic retrieval solution
3
3
titleSuffix: Azure AI Search
4
-
description: Learn how to design and build a custom agentic retrieval solution where Azure AI Search handles data retrieval for your custom agents.
4
+
description: Learn how to design and build a custom agentic retrieval solution where Azure AI Search handles data retrieval for your custom agents in AI Foundry.
5
5
author: HeidiSteen
6
6
ms.author: heidist
7
7
manager: nitinme
8
-
ms.date: 08/29/2025
8
+
ms.date: 09/10/2025
9
9
ms.service: azure-ai-search
10
10
ms.topic: how-to
11
11
ms.custom:
@@ -53,9 +53,9 @@ Use one of the following chat completion models with your AI agent:
53
53
Use a package version that provides preview functionality. See the [`requirements.txt`](https://github.com/Azure-Samples/azure-search-python-samples/blob/main/agentic-retrieval-pipeline-example/requirements.txt) file for more packages used in the example solution.
54
54
55
55
```
56
-
azure-ai-projects==1.0.0b11
57
-
azure-ai-agents==1.0.0
58
-
azure-search-documents==11.6.0b12
56
+
azure-ai-projects==1.1.0b3
57
+
azure-ai-agents==1.2.0b3
58
+
azure-search-documents==11.7.0b1
59
59
```
60
60
61
61
### Configure access
@@ -100,9 +100,16 @@ Azure OpenAI hosts the models used by the agentic retrieval pipeline. Configure
100
100
101
101
Development tasks on the Azure AI Search side include:
102
102
103
-
+ Create a knowledge agent on Azure AI Search that maps to your deployed model in Azure AI Foundry Model.
104
-
+ Call the retriever and provide a query, conversation, and override parameters.
105
-
+ Parse the response for the parts you want to include in your chat application. For many scenarios, just the content portion of the response is sufficient.
103
+
+[Create a knowledge source](search-knowledge-source-overview.md) that maps to a [searchable index](search-agentic-retrieval-how-to-index.md).
104
+
+[Create a knowledge agent](search-agentic-retrieval-how-to-create.md) on Azure AI Search that maps to your deployed model in Azure AI Foundry Model.
105
+
+[Call the retriever](search-agentic-retrieval-how-to-retrieve.md) and provide a query, conversation, and override parameters.
106
+
+ Parse the response for the parts you want to include in your chat application. For many scenarios, just the content portion of the response is sufficient. You can also try [answer synthesis](search-agentic-retrieval-how-to-synthesize.md) for a simpler workflow.
107
+
108
+
Developments in Azure AI Agent side include:
109
+
110
+
+ Set up the AI project client and an AI agent.
111
+
+ Add a tool to coordinate calls from the AI agent to the retriever and knowledge agent.
112
+
+ Query processing is driven by the tool, where the tool calls both the AI agent and the retriever on Azure AI Search.
106
113
107
114
## Components of the solution
108
115
@@ -169,20 +176,20 @@ print(f"AI agent '{agent_name}' created or updated successfully")
169
176
170
177
### Add an agentic retrieval tool to AI Agent
171
178
172
-
An end-to-end pipeline needs an orchestration mechanism for coordinating calls to the retriever and knowledge agent. You can use a [tool](/azure/ai-services/agents/how-to/tools/function-calling) for this task. The tool calls the Azure AI Search knowledge retrieval client and the Azure AI agent, and it drives the conversations with the user.
179
+
An end-to-end pipeline needs an orchestration mechanism for coordinating calls to the retriever and knowledge agent on Azure AI Search. You can use a [tool](/azure/ai-services/agents/how-to/tools/function-calling) for this task. The tool calls the Azure AI Search knowledge retrieval client and the Azure AI agent, and it drives the conversations with the user.
173
180
174
181
```python
175
182
from azure.ai.agents.models import FunctionTool, ToolSet, ListSortOrder
176
183
177
184
from azure.search.documents.agent import KnowledgeAgentRetrievalClient
178
-
from azure.search.documents.agent.models import KnowledgeAgentRetrievalRequest, KnowledgeAgentMessage, KnowledgeAgentMessageTextContent, KnowledgeAgentIndexParams
185
+
from azure.search.documents.agent.models import KnowledgeAgentRetrievalRequest, KnowledgeAgentMessage, KnowledgeAgentMessageTextContent
The messages sent to the agent tool include instructions for chat history and using the results obtained from [knowledge retrieval](/rest/api/searchservice/knowledge-retrieval/retrieve?view=rest-searchservice-2025-08-01-preview&preserve-view=true) on Azure AI Search. The response is passed as a large single string with no serialization or structure.
196
203
204
+
This code snippet is the agentic retrieval definition mentioned in the previous code snippet.
205
+
197
206
```python
198
207
defagentic_retrieval() -> str:
199
208
"""
200
209
Searches a NASA e-book about images of Earth at night and other science related facts.
201
210
The returned string is in a JSON format that contains the reference id.
202
211
Be sure to use the same format in your agent's response
To start the chat, use the standard Azure AI agent tool calling APIs. Send the message with questions, and the agent decides when to retrieve knowledge from your search index using agentic retrieval.
241
+
242
+
```python
243
+
from azure.ai.agents.models import AgentsNamedToolChoice, AgentsNamedToolChoiceType, FunctionName
244
+
245
+
message = project_client.agents.messages.create(
246
+
thread_id=thread.id,
247
+
role="user",
248
+
content="""
249
+
Why do suburban belts display larger December brightening than urban cores even though absolute light levels are higher downtown?
250
+
Why is the Phoenix nighttime street grid is so sharply visible from space, whereas large stretches of the interstate between midwestern cities remain comparatively dim?
251
+
"""
252
+
)
253
+
254
+
run = project_client.agents.runs.create_and_process(
Search results are consolidated into a large unified string that you can pass to a chat completion model for a grounded answer. The following indexing and relevance tuning features in Azure AI Search are available to help you generate high quality results. You can implement these features in the search index, and the improvements in search relevance are evident in the quality of the response returned during retrieval.
@@ -240,11 +283,17 @@ The LLM determines the quantity of subqueries based on these factors:
240
283
+ Chat history
241
284
+ Semantic ranker input constraints
242
285
243
-
As the developer, the best way to control the number of subqueries is by setting the `defaultMaxDocsForReranker` in either the knowledge agent definition or as an override on the retrieve action.
286
+
As the developer, the best way to control the number of subqueries is by setting the [maxSubQueries](/rest/api/searchservice/knowledge-agents/create-or-update?view=rest-searchservice-2025-08-01-preview#knowledgesourcereference&preserve-view=true) property in a knowledge agent.
287
+
288
+
The semantic ranker processes up to 50 documents as an input, and the system creates subqueries to accommodate all of the inputs to semantic ranker. For example, if you only wanted two subqueries, you could set `maxSubQueries` to 100 to accommodate all documents in two batches.
289
+
290
+
The [semantic configuration](semantic-how-to-configure.md) in the index determines whether the input is 50 or not. If the value is less, the query plan specifies however many subqueries are necessary to meet the smaller input size.
291
+
292
+
<!-- As the developer, the best way to control the number of subqueries is by setting the `defaultMaxDocsForReranker` in either the knowledge agent definition or as an override on the retrieve action.
244
293
245
294
The semantic ranker processes up to 50 documents as an input, and the system creates subqueries to accommodate all of the inputs to semantic ranker. For example, if you only wanted two subqueries, you could set `defaultMaxDocsForReranker` to 100 to accommodate all documents in two batches.
246
295
247
-
The [semantic configuration](semantic-how-to-configure.md) in the index determines whether the input is 50 or not. If the value is less, the query plan specifies however many subqueries are necessary to meet the `defaultMaxDocsForReranker` threshold.
296
+
The [semantic configuration](semantic-how-to-configure.md) in the index determines whether the input is 50 or not. If the value is less, the query plan specifies however many subqueries are necessary to meet the `defaultMaxDocsForReranker` threshold.-->
0 commit comments