# What is agentic retrieval?

In Azure AI Search, *agentic retrieval* is a new multi-query pipeline designed for complex questions posed by users or agents in chat and copilot apps. It's intended for [Retrieval Augmented Generation (RAG)](retrieval-augmented-generation-overview.md) patterns and agent-to-agent workflows.

Here's what it does:
+ Uses a large language model (LLM) to break down a complex query into smaller, focused subqueries for better coverage over your indexed content. Subqueries can include chat history for extra context.
+ Runs subqueries in parallel. Each subquery is semantically reranked to promote the most relevant matches.
+ Combines the best results into a unified response that an LLM can use to generate answers with your proprietary content.

The response is modular yet comprehensive: in addition to search results, it includes a query plan and source documents. You can use just the search results as grounding data, or invoke the LLM to formulate an answer.

This high-performance pipeline helps you generate high-quality grounding data (or an answer) for your chat application, with the ability to answer complex questions quickly.
Programmatically, agentic retrieval is supported through a new [Knowledge Agents object](/rest/api/searchservice/knowledge-agents?view=rest-searchservice-2025-08-01-preview&preserve-view=true) in the 2025-08-01-preview and 2025-05-01-preview data plane REST APIs and in Azure SDK preview packages that provide the feature. A knowledge agent's retrieval response is designed for downstream consumption by other agents and chat apps.
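
The how-to articles linked at the end of this article show the authoritative request shape. As a rough orientation, the following Python sketch posts a retrieve request to a knowledge agent over REST. The agent name, endpoint path, and message payload shape are assumptions for illustration; verify them against the Knowledge Agents preview reference before use.

```python
# Minimal sketch of a knowledge agent retrieve call. The path and payload shape are
# assumptions for illustration; confirm them against the preview REST reference.
import requests

SEARCH_ENDPOINT = "https://<your-search-service>.search.windows.net"
AGENT_NAME = "earth-knowledge-agent"       # hypothetical, previously created knowledge agent
API_VERSION = "2025-08-01-preview"

payload = {
    # Conversation history plus the new question, oldest first.
    "messages": [
        {
            "role": "user",
            "content": [{"type": "text", "text": "Why do suburbs show brighter night lights in December?"}],
        }
    ]
}

response = requests.post(
    f"{SEARCH_ENDPOINT}/agents/{AGENT_NAME}/retrieve",
    params={"api-version": API_VERSION},
    headers={"api-key": "<your-query-key>", "Content-Type": "application/json"},
    json=payload,
    timeout=60,
)
response.raise_for_status()
retrieval = response.json()   # grounding content, references, and execution details
```
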
### How it works
The agentic retrieval process works as follows:
1. **Workflow initiation**: Your application calls the retrieve action on a knowledge agent, providing a query and conversation history.
1. **Query planning**: A knowledge agent sends your query and conversation history to an LLM, which analyzes the context and breaks down complex questions into focused subqueries. This step is automated and not customizable.
1. **Query execution**: The knowledge agent sends the subqueries to your knowledge sources. All subqueries run simultaneously and can use keyword, vector, or hybrid search. Each subquery undergoes semantic reranking to find the most relevant matches. References are extracted and retained for citation purposes.
1. **Result synthesis**: The system combines all results into a unified response with three parts: merged content, source references, and execution details (see the sketch after these steps).
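
To make those three parts concrete, here's a hedged sketch of reading such a response. The property names (`response`, `references`, `activity`, and their nested fields) are assumptions modeled on the preview API; treat the retrieve how-to article as the authoritative shape.

```python
# Property names below are assumptions for illustration; confirm the exact response
# shape in the retrieve how-to before depending on any of them.
retrieval = {
    "response": [{"content": [{"type": "text", "text": "<merged grounding content>"}]}],
    "references": [{"docKey": "earth_at_night_page_104", "activitySource": 1}],
    "activity": [{"type": "modelQueryPlanning", "inputTokens": 1200, "outputTokens": 80}],
}

# 1. Merged content: grounding text an LLM can use to answer the question.
grounding_text = retrieval["response"][0]["content"][0]["text"]

# 2. Source references: document keys to surface as citations.
citations = [ref["docKey"] for ref in retrieval["references"]]

# 3. Execution details: the query plan and per-step activity, useful for debugging and cost tracking.
for step in retrieval["activity"]:
    print(step["type"], step.get("inputTokens"), step.get("outputTokens"))

print(grounding_text, citations)
```
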
Your search index determines how queries execute and which optimizations occur during execution. Specifically, if your index includes searchable text and vector fields, a hybrid query executes. If the only searchable field is a vector field, only pure vector search is used. The index's semantic configuration, plus optional scoring profiles, synonym maps, analyzers, and normalizers (if you add filters), are all used during query execution. You must have named defaults for a semantic configuration and a scoring profile.
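
For example, an index definition along the following lines would result in hybrid execution, because it has both a searchable text field and a vector field, plus a named default semantic configuration. The field names, dimensions, and vector settings are illustrative; the dict mirrors the Create Index REST body.

```python
# Illustrative Create Index body (expressed as a Python dict). Field names, dimensions,
# and vector settings are examples only.
index_definition = {
    "name": "earth-at-night",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True, "filterable": True},
        # Searchable text field: enables the keyword leg of a hybrid query.
        {"name": "content", "type": "Edm.String", "searchable": True},
        # Vector field: enables the vector leg of a hybrid query.
        {
            "name": "content_vector",
            "type": "Collection(Edm.Single)",
            "searchable": True,
            "dimensions": 1536,
            "vectorSearchProfile": "default-profile",
        },
    ],
    "vectorSearch": {
        "algorithms": [{"name": "hnsw-1", "kind": "hnsw"}],
        "profiles": [{"name": "default-profile", "algorithm": "hnsw-1"}],
    },
    "semantic": {
        # Agentic retrieval relies on a named default semantic configuration.
        "defaultConfiguration": "default-semantic",
        "configurations": [
            {
                "name": "default-semantic",
                "prioritizedFields": {"prioritizedContentFields": [{"fieldName": "content"}]},
            }
        ],
    },
}
```
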
### Required components
| Component | Service | Role |
|-----------|---------|------|
| **LLM** | Azure OpenAI | Creates subqueries from conversation context and later uses grounding data for answer generation |
| **Knowledge agent** | Azure AI Search | Orchestrates the pipeline, connecting to your LLM and managing query parameters |
| **Knowledge source** | Azure AI Search | Wraps the search index with properties pertaining to knowledge agent usage |
| **Search index** | Azure AI Search | Stores your searchable content (text and vectors) with semantic configuration |
| **Semantic ranker** | Azure AI Search | Required component that reranks results for relevance (L2 reranking) |
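
The how-to articles in the next steps cover creating each of these objects. Purely as an orientation to how they reference one another, here's a hedged sketch: a knowledge source points at a search index, and a knowledge agent points at the knowledge source and an Azure OpenAI deployment. Every payload property name and URL path below is an assumption for illustration; use the preview REST reference and the how-to articles for the authoritative request bodies.

```python
# Illustrative wiring of the components. All payload property names and paths are
# assumptions for illustration; see the preview REST reference for the real shapes.
import requests

search_endpoint = "https://<your-search-service>.search.windows.net"
headers = {"api-key": "<admin-api-key>", "Content-Type": "application/json"}
api_version = {"api-version": "2025-08-01-preview"}

# A knowledge source wraps an existing search index for knowledge agent usage.
knowledge_source = {
    "name": "earth-knowledge-source",
    "kind": "searchIndex",                                  # assumption: index-backed knowledge source
    "searchIndexParameters": {"searchIndexName": "earth-at-night"},
}

# A knowledge agent connects the LLM (for query planning) to one or more knowledge sources.
knowledge_agent = {
    "name": "earth-knowledge-agent",
    "models": [{
        "kind": "azureOpenAI",
        "azureOpenAIParameters": {
            "resourceUri": "https://<your-aoai-resource>.openai.azure.com",
            "deploymentId": "<gpt-deployment-name>",
            "modelName": "<model-name>",
        },
    }],
    "knowledgeSources": [{"name": "earth-knowledge-source"}],
}

requests.put(f"{search_endpoint}/knowledgeSources/earth-knowledge-source",
             params=api_version, headers=headers, json=knowledge_source).raise_for_status()
requests.put(f"{search_endpoint}/agents/earth-knowledge-agent",
             params=api_version, headers=headers, json=knowledge_agent).raise_for_status()
```
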
### Integration requirements
Choose any of these options for your next step.
+ [Create a search index knowledge source](agentic-knowledge-source-how-to-search-index.md) or a [blob knowledge source](agentic-knowledge-source-how-to-blob.md)
+ [Create a knowledge agent](agentic-retrieval-how-to-create-knowledge-base.md)
+ [Use answer synthesis for citation-backed responses](agentic-retrieval-how-to-answer-synthesis.md)
+ [Use a knowledge agent to retrieve data](agentic-retrieval-how-to-retrieve.md)
+ [Build an agent-to-agent retrieval solution](agentic-retrieval-how-to-create-pipeline.md)

Semantic ranking is performed for every subquery in the plan.

Agentic retrieval has two billing models: billing from Azure OpenAI for query planning (and, if enabled, answer synthesis), and billing from Azure AI Search for semantic ranking during query execution.

This pricing example omits answer synthesis but helps illustrate the estimation process. Your costs could be lower. For the actual price of transactions, see [Azure OpenAI pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/#pricing). For query execution, there's no charge for semantic ranking for agentic retrieval in the initial public preview.
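
As a back-of-the-envelope sketch of that estimation, with purely hypothetical token counts and per-token prices (substitute real numbers from the execution details of your retrieve responses and the current price list):

```python
# Hypothetical numbers to illustrate the estimation process only; substitute real
# token counts from your retrieve responses and current Azure OpenAI prices.
input_tokens_per_plan = 2_000      # query planning input (query + chat history + instructions)
output_tokens_per_plan = 350       # subqueries produced by the LLM
price_per_1k_input = 0.0025        # hypothetical $ per 1K input tokens
price_per_1k_output = 0.0100       # hypothetical $ per 1K output tokens
queries_per_month = 100_000

cost_per_query = (
    (input_tokens_per_plan / 1000) * price_per_1k_input
    + (output_tokens_per_plan / 1000) * price_per_1k_output
)
monthly_query_planning_cost = cost_per_query * queries_per_month

print(f"~${cost_per_query:.4f} per query, ~${monthly_query_planning_cost:,.2f} per month")
# Semantic ranking (query execution) adds no charge during the initial public preview.
```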