articles/search/agentic-retrieval-how-to-create-pipeline.md (+18 −5 lines)
@@ -1,29 +1,30 @@
 ---
-title: Build an agentic retrieval solution
+title: 'Tutorial: Build an agentic retrieval solution'
 titleSuffix: Azure AI Search
 description: Learn how to design and build a custom agentic retrieval solution where Azure AI Search handles data retrieval for your custom agents in AI Foundry.
 author: HeidiSteen
 ms.author: heidist
 manager: nitinme
 ms.date: 09/10/2025
 ms.service: azure-ai-search
-ms.topic: how-to
+ms.topic: tutorial
 ms.custom:
   - build-2025
 ---

-# Build an agent-to-agent retrieval solution using Azure AI Search
+# Tutorial: Build an agent-to-agent retrieval solution using Azure AI Search

 This article describes a pattern for building a solution that uses Azure AI Search for knowledge retrieval, and shows how to integrate that retrieval into a custom solution that includes Azure AI Agent. The pattern uses an agent tool to invoke an agentic retrieval pipeline in Azure AI Search.

 :::image type="content" source="media/agentic-retrieval/agent-to-agent-pipeline.svg" alt-text="Diagram of Azure AI Search integration with Azure AI Agent service." lightbox="media/agentic-retrieval/agent-to-agent-pipeline.png":::

-This article supports the [agentic-retrieval-pipeline-example](https://github.com/Azure-Samples/azure-search-python-samples/tree/main/agentic-retrieval-pipeline-example) Python sample on GitHub.
-
 This exercise differs from the [Agentic Retrieval Quickstart](search-get-started-agentic-retrieval.md) in how it uses Azure AI Agent to retrieve data from the index, and how it uses an agent tool for orchestration. If you want to understand the retrieval pipeline in its simplest form, begin with the quickstart.

+> [!TIP]
+> To run the code for this tutorial, download the [agentic-retrieval-pipeline-example](https://github.com/Azure-Samples/azure-search-python-samples/tree/main/agentic-retrieval-pipeline-example) Python sample on GitHub.
+
 ## Prerequisites

 The following resources are required for this design pattern:
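The article above centers on an agent tool that invokes the agentic retrieval pipeline. As a rough sketch of what that call looks like, the following builds a retrieve request body and posts it to a knowledge agent's retrieve endpoint. The URL shape, API version, and message schema are assumptions based on the preview REST API rather than taken from this diff; check the REST reference for the version you target.

```python
import json
from urllib.request import Request, urlopen


def build_retrieve_body(query: str, history: list) -> dict:
    """Assemble a retrieve request body: prior turns plus the new user query."""
    messages = list(history)
    messages.append({"role": "user", "content": [{"type": "text", "text": query}]})
    return {"messages": messages}


def retrieve(endpoint: str, agent_name: str, api_key: str, body: dict) -> dict:
    """POST the retrieve request to a knowledge agent (assumed URL shape)."""
    url = f"{endpoint}/agents/{agent_name}/retrieve?api-version=2025-05-01-preview"
    req = Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )
    with urlopen(req) as resp:
        return json.loads(resp.read())


body = build_retrieve_body("What are the prerequisites?", history=[])
```

In the pattern described here, an agent tool would wrap a call like `retrieve(...)` and hand the merged content back to the orchestrating agent.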
@@ -315,6 +316,18 @@ Look at output tokens in the [activity array](agentic-retrieval-how-to-retrieve.
+Set `maxOutputSize` in the [knowledge agent](agentic-retrieval-how-to-create-knowledge-base.md) to govern the size of the response, or `maxRuntimeInSeconds` for time-bound processing.
+
+## Clean up resources
+
+When you're working in your own subscription, at the end of a project, it's a good idea to remove the resources that you no longer need. Resources left running can cost you money. You can delete resources individually or delete the resource group to delete the entire set of resources.
+
+You can also delete individual objects:
+
+[Delete a knowledge agent](agentic-retrieval-how-to-create-knowledge-base.md#delete-an-agent)
+
+[Delete a knowledge source](agentic-knowledge-source-how-to-search-index.md#delete-a-knowledge-source)
+
+[Delete an index](search-how-to-manage-index.md#delete-an-index)

 ## Related content

+[Agentic retrieval in Azure AI Search](agentic-retrieval-overview.md)
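The `maxOutputSize` and `maxRuntimeInSeconds` controls mentioned above belong in the knowledge agent definition. The fragment below is a sketch: the property names come from the text, but their placement in the schema and the agent name are assumptions to verify against the REST reference.

```python
# Sketch of a knowledge agent definition fragment (placement of the output
# controls is an assumption; verify against the knowledge agent REST schema).
agent_definition = {
    "name": "my-knowledge-agent",    # hypothetical agent name
    "outputConfiguration": {
        "maxOutputSize": 5000,       # cap the size of the retrieval response
        "maxRuntimeInSeconds": 60,   # time-bound the pipeline's processing
    },
}
```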
articles/search/agentic-retrieval-how-to-retrieve.md (+1 −1 lines)
@@ -27,7 +27,7 @@ This article also explains the three components of the retrieval response:
 The retrieve request can include instructions for query processing that override the defaults set on the knowledge agent.

 > [!NOTE]
-> By default, there's no model-generated "answer" in the response. You should pass the extracted response to an LLM so that it can ground its answer in the search results. For an end-to-end example that includes this step, see [Build an agent-to-agent retrieval solution](agentic-retrieval-how-to-create-pipeline.md) or the [Azure OpenAI Demo](https://github.com/Azure-Samples/azure-search-openai-demo).
+> By default, there's no model-generated "answer" in the response. You should pass the extracted response to an LLM so that it can ground its answer in the search results. For an end-to-end example that includes this step, see [Tutorial: Build an agent-to-agent retrieval solution](agentic-retrieval-how-to-create-pipeline.md) or the [Azure OpenAI Demo](https://github.com/Azure-Samples/azure-search-openai-demo).
 >
 > Alternatively, you can use [answer synthesis](agentic-retrieval-how-to-answer-synthesis.md) to bring answer formulation into the agentic pipeline. In this workflow, the retriever response consists of LLM-formulated answers instead of the raw search results.
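Passing the extracted response to an LLM, as the note describes, amounts to putting the grounding text into the prompt. A minimal sketch follows; the system-prompt wording and message shape are illustrative assumptions, not from the article.

```python
def build_grounded_prompt(grounding_text: str, question: str) -> list:
    """Assemble chat messages that ground an LLM answer in retrieved content."""
    system = (
        "Answer only using the sources below. If the sources don't contain "
        "the answer, say you don't know.\n\nSources:\n" + grounding_text
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]


# The grounding text would normally be the extracted retrieve response.
messages = build_grounded_prompt(
    "[doc1] Oceans cover about 71% of Earth's surface.",
    "How much of Earth is ocean?",
)
```

You would then send `messages` to your chat-completions model of choice; any model can be used for this answer-generation step.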
-In Azure AI Search, *agentic retrieval* is a new multi-query pipeline designed for complex questions posed by users or agents in chat and copilot apps. It's intended for [Retrieval Augmented Generation (RAG)](retrieval-augmented-generation-overview.md) patterns. Here's how it works:
+What is agentic retrieval? In Azure AI Search, *agentic retrieval* is a new multi-query pipeline designed for complex questions posed by users or agents in chat and copilot apps. It's intended for [Retrieval Augmented Generation (RAG)](retrieval-augmented-generation-overview.md) patterns and agent-to-agent workflows.
+
+Here's what it does:

 + Uses a large language model (LLM) to break down a complex query into smaller, focused subqueries for better coverage over your indexed content. Subqueries can include chat history for extra context.

 + Runs subqueries in parallel. Each subquery is semantically reranked to promote the most relevant matches.

 + Combines the best results into a unified response that an LLM can use to generate answers with your proprietary content.

-This high-performance pipeline helps you generate high quality grounding data for your chat application, with the ability to answer complex questions quickly.
++ The response is modular yet comprehensive: it also includes a query plan and source documents. You can choose to use just the search results as grounding data, or invoke the LLM to formulate an answer.
+
+This high-performance pipeline helps you generate high quality grounding data (or an answer) for your chat application, with the ability to answer complex questions quickly.

 Programmatically, agentic retrieval is supported through a new [Knowledge Agents object](/rest/api/searchservice/knowledge-agents?view=rest-searchservice-2025-08-01-preview&preserve-view=true) in the 2025-08-01-preview and 2025-05-01-preview data plane REST APIs and in Azure SDK preview packages that provide the feature. A knowledge agent's retrieval response is designed for downstream consumption by other agents and chat apps.
@@ -63,29 +67,31 @@ Agentic retrieval is designed for conversational search experiences that use an
 ### How it works

-The agentic retrieval process follows three main phases:
+The agentic retrieval process works as follows:
+
+1. **Workflow initiation**: Your application calls a knowledge agent with a retrieve action that provides a query and conversation history.

-1. **Query planning**: A knowledge agent sends your query and conversation history to an LLM (gpt-4o, gpt-4.1, and gpt-5 series), which analyzes the context and breaks down complex questions into focused subqueries. This step is automated and not customizable. The number of subqueries depends on what the LLM decides and whether the `maxDocsForReranker` parameter is higher than 50. A new subquery is defined for each 50-document batch sent to the semantic ranker.
+1. **Query planning**: A knowledge agent sends your query and conversation history to an LLM, which analyzes the context and breaks down complex questions into focused subqueries. This step is automated and not customizable.

-2. **Query execution**: All subqueries run simultaneously against your knowledge sources, using keyword, vector, and hybrid search. Each subquery undergoes semantic reranking to find the most relevant matches. References are extracted and retained for citation purposes.
+1. **Query execution**: The knowledge agent sends the subqueries to your knowledge sources. All subqueries run simultaneously and can be keyword, vector, or hybrid searches. Each subquery undergoes semantic reranking to find the most relevant matches. References are extracted and retained for citation purposes.

-3. **Result synthesis**: The system merges and ranks all results, then returns a unified response containing grounding data, source references, and execution metadata.
+1. **Result synthesis**: The system combines all results into a unified response with three parts: merged content, source references, and execution details.

-Your search index determines query execution and any optimizations that occur during query execution. Specifically, if your index includes searchable text and vector fields, a hybrid query executes. The index semantic configuration, plus optional scoring profiles, synonym maps, analyzers, and normalizers (if you add filters) are all used during query execution. You must have named defaults for a semantic configuration and a scoring profile.
+Your search index determines query execution and any optimizations that occur during query execution. Specifically, if your index includes searchable text and vector fields, a hybrid query executes. If the only searchable field is a vector field, then only pure vector search is used. The index semantic configuration, plus optional scoring profiles, synonym maps, analyzers, and normalizers (if you add filters) are all used during query execution. You must have named defaults for a semantic configuration and a scoring profile.

 ### Required components

 | Component | Service | Role |
 |-----------|---------|------|
 | **LLM** | Azure OpenAI | Creates subqueries from conversation context and later uses grounding data for answer generation |
 | **Knowledge agent** | Azure AI Search | Orchestrates the pipeline, connecting to your LLM and managing query parameters |
+| **Knowledge source** | Azure AI Search | Wraps the search index with properties pertaining to knowledge agent usage |
 | **Search index** | Azure AI Search | Stores your searchable content (text and vectors) with semantic configuration |
 | **Semantic ranker** | Azure AI Search | Required component that reranks results for relevance (L2 reranking) |
-| **Knowledge source** | Azure AI Search | Wraps the search index with properties pertaining to knowledge agent usage |

 ### Integration requirements

-Your application drives the pipeline by calling the knowledge agent and handling the response. The pipeline returns grounding data that you pass to an LLM for answer generation in your conversation interface. For implementation details, see [Build an agent-to-agent retrieval solution](agentic-retrieval-how-to-create-pipeline.md).
+Your application drives the pipeline by calling the knowledge agent and handling the response. The pipeline returns grounding data that you pass to an LLM for answer generation in your conversation interface. For implementation details, see [Tutorial: Build an agent-to-agent retrieval solution](agentic-retrieval-how-to-create-pipeline.md).

 > [!NOTE]
 > Only gpt-4o, gpt-4.1, and gpt-5 series models are supported for query planning. You can use any model for final answer generation.
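The result-synthesis step above returns a unified response with three parts. The following sketch pulls them apart using a hypothetical payload; the field names follow the preview response shape described in the retrieve documentation, but treat them as assumptions to verify against your API version.

```python
import json

# Hypothetical retrieve response illustrating the three parts:
# merged content, source references, and execution details (activity).
raw = json.dumps({
    "response": [{"content": [{"type": "text", "text": "merged grounding data..."}]}],
    "references": [{"type": "AzureSearchDoc", "id": "0", "docKey": "doc-123"}],
    "activity": [{"type": "ModelQueryPlanning", "inputTokens": 1300, "outputTokens": 80}],
})

result = json.loads(raw)
grounding = result["response"][0]["content"][0]["text"]      # text to pass to the LLM
citations = [ref["docKey"] for ref in result["references"]]  # keys for citation links
plan_tokens = sum(a.get("outputTokens", 0) for a in result["activity"])
```

Your application decides what to do with each part: the merged content becomes grounding data, references back citations, and activity supports debugging and cost tracking.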
@@ -109,8 +115,9 @@ Choose any of these options for your next step.
+[Create a search index knowledge source](agentic-knowledge-source-how-to-search-index.md) or a [blob knowledge source](agentic-knowledge-source-how-to-blob.md)

 [Create a knowledge agent](agentic-retrieval-how-to-create-knowledge-base.md)

+[Use answer synthesis for citation-backed responses](agentic-retrieval-how-to-answer-synthesis.md)

 [Use a knowledge agent to retrieve data](agentic-retrieval-how-to-retrieve.md)

-[Build an agent-to-agent retrieval solution](agentic-retrieval-how-to-create-pipeline.md)
+[Tutorial: Build an agent-to-agent retrieval solution](agentic-retrieval-how-to-create-pipeline.md)

 REST API reference:
@@ -146,7 +153,7 @@ Semantic ranking is performed for every subquery in the plan. Semantic ranking c
 Agentic retrieval has two billing models: billing from Azure OpenAI (query planning and, if enabled, answer synthesis) and billing from Azure AI Search for semantic ranking (query execution).

-This example omits answer synthesis and uses hypothetical prices to illustrate the estimation process. Your costs could be lower. For the actual price of transactions, see [Azure OpenAI pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/#pricing). For query execution, there's no charge for semantic ranking for agentic retrieval in the initial public preview.
+This pricing example omits answer synthesis, but helps illustrate the estimation process. Your costs could be lower. For the actual price of transactions, see [Azure OpenAI pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/#pricing). For query execution, there's no charge for semantic ranking for agentic retrieval in the initial public preview.
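The estimation process the paragraph above describes is simple token arithmetic. Here's a back-of-the-envelope sketch using entirely hypothetical token counts and prices (not Azure's actual rates; see the pricing page linked above):

```python
# Hypothetical query-planning cost estimate. All numbers are made up for
# illustration; substitute real token counts (from the activity array) and
# the current Azure OpenAI prices.
input_tokens = 2000        # query + chat history sent to the planning model
output_tokens = 350        # subqueries produced by the planner
price_in_per_1k = 0.005    # hypothetical $ per 1K input tokens
price_out_per_1k = 0.015   # hypothetical $ per 1K output tokens

cost = (input_tokens / 1000) * price_in_per_1k + (output_tokens / 1000) * price_out_per_1k
print(round(cost, 6))
```

In the initial public preview, this Azure OpenAI charge is the only one to estimate, since semantic ranking for agentic retrieval isn't billed during query execution.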