In Azure AI Search, a *knowledge agent* is a top-level resource representing a connection to a conversational language model for use in agentic retrieval workloads. It specifies a model that provides reasoning capabilities, and it identifies the search index used at query time.
18
+
In Azure AI Search, a *knowledge agent* is a top-level resource representing a connection to a chat completion model for use in agentic retrieval workloads. A knowledge agent specifies:
19
+
20
+
+ A model that provides reasoning capabilities
21
+
+ A search index used at query time
22
+
+ Parameters on the index for setting default response behavior
19
23
20
24
After you create a knowledge agent, you can update its properties at any time. If the knowledge agent is in use, updates take effect on the next job.
21
25
22
26
## Prerequisites
23
27
24
28
+ Familiarity with [agentic retrieval concepts and use cases](search-agentic-retrieval-concept.md).
25
29
26
-
+ A conversational language model on Azure OpenAI, either gpt-4o or gpt-4o-mini.
30
+
+ A chat completion model on Azure OpenAI.
27
31
28
-
+ Azure AI Search, in any [region that provides semantic ranker](search-region-support.md), on basic tier and above. Your search service must have a [managed identity](search-howto-managed-identities-data-sources.md) for role-based access to a chat model.
32
+
+ Azure AI Search, in any [region that provides semantic ranker](search-region-support.md), on the basic pricing tier or higher. Your search service must have a [managed identity](search-howto-managed-identities-data-sources.md) for role-based access to the model.
29
33
30
-
+Permission requirements on Azure AI Search. An **Owner/Contributor** or **Search Service Contributor** can create and manage a knowledge agent. **Search Index Data Contributor** uploads and indexes document. **Search Index Data Reader**runs queries. Instructions are provided in this article.
34
+
+ Permissions on Azure AI Search. **Search Service Contributor** can create and manage a knowledge agent. **Search Index Data Reader** can run queries. Instructions are provided in this article.
31
35
32
-
+ A search index containing plain text or vectors. The index must [meet requirements for agentic retrieval](search-agentic-retrieval-how-to-index.md), including a [semantic configuration](semantic-how-to-configure.md) with the `defaultConfiguration` specified.
36
+
+ A search index containing plain text or vectors. The index must [meet the requirements for agentic retrieval](search-agentic-retrieval-how-to-index.md), including a [semantic configuration](semantic-how-to-configure.md) with the `defaultConfiguration` specified.
33
37
34
-
+ API requirements. To create or use a knowledge agent, use 2025-05-01-preview data plane REST API or a prerelease package of an Azure SDK that provides knowledge agent APIs.
38
+
+ API requirements. To create or use a knowledge agent, use the [2025-05-01-preview](/rest/api/searchservice/operation-groups?view=rest-searchservice-2025-05-01-preview&preserve-view=true) data plane REST API. Or, use a prerelease package of an Azure SDK that provides knowledge agent APIs: [Azure SDK for Python](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/search/azure-search-documents/CHANGELOG.md), [Azure SDK for .NET](https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/search/Azure.Search.Documents/CHANGELOG.md#1170-beta3-2025-03-25), [Azure SDK for Java](https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/search/azure-search-documents/CHANGELOG.md).
35
39
36
-
To follow the steps in this guide, we recommend [Visual Studio Code](https://code.visualstudio.com/download) with a [REST client](https://marketplace.visualstudio.com/items?itemName=humao.rest-client) for sending REST API calls to Azure AI Search. There's no portal support at this time.
40
+
To follow the steps in this guide, we recommend [Visual Studio Code](https://code.visualstudio.com/download) with a [REST client](https://marketplace.visualstudio.com/items?itemName=humao.rest-client) for sending preview REST API calls to Azure AI Search. There's no portal support at this time.
37
41
38
42
## Deploy a model for agentic retrieval
39
43
@@ -43,7 +47,9 @@ Make sure you have a supported model that Azure AI Search can access. The follow
43
47
44
48
1. Deploy a supported model using [these instructions](/azure/ai-foundry/how-to/deploy-models-openai).
45
49
46
-
1. Verify the search service managed identity has **Cognitive Services User** permissions on the Azure OpenAI resource. If you're testing locally, you also need **Cognitive Services User** permissions.
50
+
1. Verify the search service managed identity has **Cognitive Services User** permissions on the Azure OpenAI resource.
51
+
52
+
If you're testing locally, you also need **Cognitive Services User** permissions.
47
53
48
54
### Supported models
49
55
@@ -97,34 +103,37 @@ You can use API keys if you don't have permission to create role assignments.
97
103
98
104
# List Indexes
99
105
GET https://{{search-url}}/indexes?api-version=2025-05-01-preview
100
-
api-key: {{search-api-key}}
106
+
Content-Type: application/json
107
+
@api-key = <YOUR-SEARCH-SERVICE-API-KEY>
101
108
```
102
109
103
110
## Check for existing knowledge agents
104
111
105
-
The following request lists knowledge agents by name. Within the knowledge agents collection, all knowledge agents must be uniquely named. It's helpful for knowing about existing knowledge agents for reuse or naming purposes.
112
+
The following request lists knowledge agents by name on your search service. Within the knowledge agents collection, all knowledge agents are uniquely named. Listing them is helpful when you want to reuse an existing knowledge agent or choose a name for a new one.
106
113
107
114
<!-- ### [**REST APIs**](#tab/rest-get) -->
108
115
109
116
```http
110
117
# List knowledge agents
111
118
GET https://{{search-url}}/agents?api-version=2025-05-01-preview
112
-
api-key: {{search-api-key}}
119
+
Content-Type: application/json
120
+
@token = <a long GUID>
113
121
```
114
122
115
-
You can also return a single agent by name.
123
+
You can also return a single agent by name to review its JSON definition.
116
124
117
125
```http
118
126
# Get knowledge agent
119
127
GET https://{{search-url}}/agents/{{agent-name}}?api-version=2025-05-01-preview
120
-
api-key: {{search-api-key}}
128
+
Content-Type: application/json
129
+
@token = <a long GUID>
121
130
```
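If you script these calls instead of using the REST client, the list and get requests above could be assembled roughly as follows in Python. This is a minimal sketch rather than part of the original article: the helper name `agent_request` is hypothetical, the bearer-token `Authorization` header assumes role-based access, and the request is only built, not sent.

```python
import urllib.parse
import urllib.request

API_VERSION = "2025-05-01-preview"


def agent_request(search_url, token, agent_name=None):
    """Build (but don't send) a GET request for all agents or a single agent."""
    path = "agents"
    if agent_name is not None:
        # Percent-encode the name so it is safe to place in the URL path.
        path += "/" + urllib.parse.quote(agent_name, safe="")
    url = f"https://{search_url}/{path}?api-version={API_VERSION}"
    headers = {
        "Content-Type": "application/json",
        # Assumption: role-based access with a bearer token.
        "Authorization": f"Bearer {token}",
    }
    return urllib.request.Request(url, headers=headers, method="GET")
```

Passing the returned object to `urllib.request.urlopen` would send the request. Keeping construction separate from sending makes the URL and headers easy to inspect first.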
122
131
123
132
<!-- --- -->
124
133
125
134
## Create a knowledge agent
126
135
127
-
A knowledge agent represents a connection to a model that you've deployed. Parameters on the model establish the connection.
136
+
A knowledge agent represents a connection between a model that you've deployed in Azure OpenAI and a target index on Azure AI Search. Parameters on the model establish the connection. Parameters on the index establish defaults that inform query execution and the response.
128
137
129
138
<!-- ### [**REST APIs**](#tab/rest-create) -->
130
139
@@ -136,12 +145,12 @@ To create an agent, use the 2025-05-01-preview data plane REST API or an Azure S
+`name` must be unique within the knowledge agents collection it must adhere to [naming rules](/rest/api/searchservice/naming-rules) for objects on Azure AI Search.
186
+
+`name` must be unique within the knowledge agents collection and follow the [naming guidelines](/rest/api/searchservice/naming-rules) for objects on Azure AI Search.
178
187
179
188
+`targetIndexes` is required for knowledge agent creation. It lists the search indexes that can use the knowledge agent. Currently in this preview release, the `targetIndexes` array can contain only one index. *It must have a default semantic configuration* (`defaultConfiguration`). For more information, see [Design an index for agentic retrieval](search-agentic-retrieval-how-to-index.md).
180
189
@@ -213,8 +222,8 @@ Replace "What are my vision benefits?" with a query string that's valid for your
213
222
```http
214
223
# Send Grounding Request
215
224
POST https://{{search-url}}/agents/{{agent-name}}/retrieve?api-version=2025-05-01-preview
216
-
api-key: {{search-api-key}}
217
-
Content-Type: application/json
225
+
Content-Type: application/json
226
+
@token = <a long GUID>
218
227
219
228
{
220
229
"messages" : [
@@ -247,14 +256,18 @@ For more information about the **retrieve** API and the shape of the response, s
247
256
248
257
## Delete an agent
249
258
259
+
If you no longer need the agent, or if you need to rebuild it on the search service, use this request to delete the current object.
In Azure AI Search, *agentic retrieval* is a new parallel query architecture that uses a conversational large language model (LLM) for query planning, generating subqueries that broaden the scope of what's searchable and relevant.
18
+
In Azure AI Search, *agentic retrieval* is a new parallel query architecture that uses a chat completion model for query planning. It generates subqueries that broaden the scope of what's searchable and relevant.
19
19
20
20
This article explains how to use the [**retrieve** method](/rest/api/searchservice/knowledge-retrieval/retrieve?view=rest-searchservice-2025-05-01-preview&preserve-view=true) that invokes a knowledge agent and parallel query processing. This article also explains the three components of the retrieval response:
21
21
22
22
+*extracted response for the LLM*
23
23
+*referenced results*
24
24
+*query activity*
25
25
26
+
The retrieve request can include instructions for query processing that override the defaults set on the knowledge agent.
27
+
26
28
> [!NOTE]
27
-
> Currently, there's no model-generated "answer" in the response. Instead, the response provides grounding data that you can use to generate an answer from an LLM. For an end-to-end example, see [Build an agent-to-agent retrieval solution ](search-agentic-retrieval-how-to-pipeline.md) or [Azure OpenAI Demo](https://github.com/Azure-Samples/azure-search-openai-demo).
29
+
> There's no model-generated "answer" in the response. Instead, the response provides grounding data used to generate an answer from an LLM. For an end-to-end example, see [Build an agent-to-agent retrieval solution](search-agentic-retrieval-how-to-pipeline.md) or [Azure OpenAI Demo](https://github.com/Azure-Samples/azure-search-openai-demo).
28
30
29
31
## Prerequisites
30
32
31
-
+ A [knowledge agent definition](search-agentic-retrieval-how-to-create.md) that represents a conversational language model.
33
+
+ A [knowledge agent](search-agentic-retrieval-how-to-create.md) that represents the chat completion model and a valid target index.
34
+
35
+
+ Azure AI Search, in any [region that provides semantic ranker](search-region-support.md), on basic tier and higher. Your search service must have a [managed identity](search-howto-managed-identities-data-sources.md) for role-based access to a chat completion model.
32
36
33
-
+ Azure AI Search, in any [region that provides semantic ranker](search-region-support.md), on basic tier and above. Your search service must have a [managed identity](search-howto-managed-identities-data-sources.md) for role-based access to a chat model.
37
+
+ Permissions on Azure AI Search. **Search Index Data Reader** can run queries on Azure AI Search, but the search service managed identity must have **Cognitive Services User** permissions on the Azure OpenAI resource. For more information about local testing and obtaining access tokens, see [Quickstart: Connect without keys](search-get-started-rbac.md).
34
38
35
-
+ API requirements. Use 2025-05-01-preview data plane REST API or a prerelease package of an Azure SDK that provides knowledge agent APIs.
39
+
+ API requirements. To create or use a knowledge agent, use the [2025-05-01-preview](/rest/api/searchservice/operation-groups?view=rest-searchservice-2025-05-01-preview&preserve-view=true) data plane REST API. Or, use a prerelease package of an Azure SDK that provides knowledge agent APIs: [Azure SDK for Python](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/search/azure-search-documents/CHANGELOG.md), [Azure SDK for .NET](https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/search/Azure.Search.Documents/CHANGELOG.md#1170-beta3-2025-03-25), [Azure SDK for Java](https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/search/azure-search-documents/CHANGELOG.md).
36
40
37
41
To follow the steps in this guide, we recommend [Visual Studio Code](https://code.visualstudio.com/download) with a [REST client](https://marketplace.visualstudio.com/items?itemName=humao.rest-client) for sending REST API calls to Azure AI Search. There's no portal support at this time.
38
42
39
43
## Call the retrieve action
40
44
41
45
Call the **retrieve** action on the knowledge agent object to invoke retrieval and return a response. Use the [2025-05-01-preview](/rest/api/searchservice/operation-groups?view=rest-searchservice-2025-05-01-preview&preserve-view=true) data plane REST API or an Azure SDK prerelease package that provides equivalent functionality for this task.
42
46
47
+
All `searchable` fields in the search index are in scope for query execution. If the index includes vector fields, your index should have a valid vectorizer definition so that it can vectorize the query inputs. Otherwise, vector fields are ignored. The implied query type is `semantic`, and there's no search mode or selection of search fields.
48
+
43
49
The input for the retrieval route is chat conversation history in natural language, where the `messages` array contains the conversation.
44
50
45
51
```http
46
52
# Send Grounding Request
47
53
POST https://{{search-url}}/agents/{{agent-name}}/retrieve?api-version=2025-05-01-preview
48
-
api-key: {{search-api-key}}
49
-
Content-Type: application/json
54
+
@accessToken=<YOUR PERSONAL ID>
55
+
Content-Type: application/json
50
56
51
57
{
52
58
"messages" : [
53
59
{
54
-
"role" : "system",
60
+
"role" : "assistant",
55
61
"content" : [
56
62
{ "type" : "text", "text" : "You are a helpful assistant for Contoso Human Resources. You have access to a search index containing guidelines about health care coverage for Washington state. If you can't find the answer in the search, say you don't know." }
57
63
]
58
64
},
59
65
{
60
66
"role" : "user",
61
67
"content" : [
62
-
{ "type" : "text", "text" : "What are my vision benefits?" }
68
+
{ "type" : "text", "text" : "What are my options for health care coverage" }
69
+
]
70
+
},
71
+
{
72
+
"role" : "user",
73
+
"content" : [
74
+
{ "type" : "text", "text" : "Which one has vision benefits" }
63
75
]
64
76
}
65
77
],
@@ -68,7 +80,7 @@ Content-Type: application/json
68
80
"indexName" : "{{index-name}}",
69
81
"filterAddOn" : "State eq 'WA'",
70
82
"IncludeReferenceSourceData": true,
71
-
"rerankerThreshold" : 2.5,
83
+
"rerankerThreshold" : 2.5,
72
84
"maxDocsForReranker": 250
73
85
}
74
86
]
@@ -79,7 +91,7 @@ Content-Type: application/json
79
91
80
92
+`messages` articulates the messages sent to the model. The message format is similar to Azure OpenAI APIs.
81
93
82
-
+`role` defines where the message came from, for example either `system` or `user`. The model you use determines which roles are valid.
94
+
+ `role` defines where the message came from, for example, `assistant` or `user`. The model you use determines which roles are valid.
83
95
84
96
+`content` is the message sent to the LLM. It must be text in this preview.
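The message shape described above can be produced from a plain list of conversation turns. The following Python sketch is illustrative only: the helper names `to_message` and `build_messages` are hypothetical, and the body it serializes carries just the `messages` array, leaving out the index parameters shown in the request.

```python
import json


def to_message(role, text):
    """Wrap one conversation turn in the chat-message structure used by retrieve."""
    # In this preview, content must be text, so each turn becomes one text item.
    return {"role": role, "content": [{"type": "text", "text": text}]}


def build_messages(turns):
    """Convert (role, text) pairs into the `messages` array, preserving order."""
    return [to_message(role, text) for role, text in turns]


turns = [
    ("user", "What are my options for health care coverage"),
    ("user", "Which one has vision benefits"),
]
# Serialize only the messages portion of a retrieve request body.
body = json.dumps({"messages": build_messages(turns)}, indent=2)
```

Building the array from turns keeps the conversation history in one place, so each follow-up question can be appended before the next retrieve call.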
85
97
@@ -93,7 +105,9 @@ Content-Type: application/json
93
105
94
106
`rerankerThreshold` is the minimum semantic reranker score that's acceptable for inclusion in a response. [Reranker scores](semantic-search-overview.md#how-ranking-is-scored) range from 1 to 4. Plan on revising this value based on testing and what works for your content.
95
107
96
-
`maxDocsForReranker` dictates the maximum number of documents to consider for the final response string. Semantic reranker accepts 50 documents. If the maximum is 200, four more subqueries are added to the query plan to ensure all 200 documents are semantically ranked. for semantic ranking. If the number isn't evenly divisible by 50, the query plan rounds up to nearest whole number.
108
+
`maxDocsForReranker` dictates the maximum number of documents to consider for the final response string. Semantic reranker accepts 50 documents. If the maximum is 200, four more subqueries are added to the query plan to ensure all 200 documents are semantically ranked. If the number isn't evenly divisible by 50, the query plan rounds up to the nearest whole number.
109
+
110
+
The `content` portion of the response consists of 200 chunks or fewer, excluding any results that fail to meet the minimum reranker score threshold of 2.5.
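The arithmetic behind these two parameters can be sketched in a few lines of Python. This only mirrors the description above (batches of 50, rounded up, and a minimum score cutoff). It isn't the service's implementation, and the function names are hypothetical.

```python
import math

# From the article: semantic reranker accepts 50 documents at a time.
RERANKER_BATCH_SIZE = 50


def reranker_batches(max_docs_for_reranker):
    """Number of 50-document batches needed, rounding up partial batches."""
    return math.ceil(max_docs_for_reranker / RERANKER_BATCH_SIZE)


def meets_threshold(scores, reranker_threshold):
    """Keep only reranker scores at or above the minimum threshold."""
    return [score for score in scores if score >= reranker_threshold]
```

For example, `reranker_batches(200)` gives 4 and `reranker_batches(120)` gives 3, which matches the round-up rule. With a threshold of 2.5, a score of 1.9 would be dropped while 2.5 and above are kept.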
97
111
98
112
## Review the extracted response
99
113
@@ -104,7 +118,7 @@ The body of the response is also structured in the chat message style format. Cu
104
118
```http
105
119
"response": [
106
120
{
107
-
"role": "system",
121
+
"role": "assistant",
108
122
"content": [
109
123
{
110
124
"type": "text",
@@ -119,7 +133,7 @@ The body of the response is also structured in the chat message style format. Cu
119
133
120
134
The `maxOutputSize` property on the knowledge agent determines the length of the string. We recommend 5,000 tokens.
121
135
122
-
Fields in the content `text` response string include the ref_id and semantic configuration fields: `title`, `terms`, `terms`.
136
+
Fields in the content `text` response string include the ref_id and semantic configuration fields: `title`, `terms`, `content`.
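To work with the grounding data programmatically, you could parse the `text` string and index each chunk by its `ref_id`. The Python sketch below assumes that the string is a JSON-encoded array of objects carrying the fields listed above. That encoding is an assumption to verify against your own responses, and both the helper name and the sample data are hypothetical.

```python
import json


def index_grounding(response):
    """Map each ref_id to its (title, content) pair from the first text item."""
    text = response[0]["content"][0]["text"]
    # Assumption: `text` is a JSON-encoded array of grounding chunks.
    chunks = json.loads(text)
    return {chunk["ref_id"]: (chunk.get("title"), chunk.get("content")) for chunk in chunks}


# Hypothetical sample that mimics the shape of the `response` array.
sample_response = [
    {
        "role": "assistant",
        "content": [
            {
                "type": "text",
                "text": json.dumps(
                    [{"ref_id": 0, "title": "Plan A", "terms": "vision", "content": "Covers eye exams."}]
                ),
            }
        ],
    }
]
grounding = index_grounding(sample_response)
```

A lookup keyed by `ref_id` makes it easier to attach citations when you pass the grounding data to an LLM.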
123
137
124
138
## Review the activity array
125
139
@@ -175,6 +189,8 @@ Here's an example of an activity array.
175
189
176
190
The `references` array is a direct reference from the underlying grounding data and includes the `sourceData` used to generate the response. It consists of every single document that was found and semantically ranked by the search engine. Fields in the `sourceData` include an `id` and semantic fields: `title`, `terms`, `content`.
177
191
192
+
The `id` is a reference ID for an item within a specific response. It's not the document key in the search index. It's used for providing citations.
193
+
178
194
The purpose of this array is to provide a chat message style structure for easy integration. For example, you might want to serialize the results into a different structure, or you might need to manipulate the data programmatically before you return it to the user.
179
195
180
196
You can also get the structured data from the source data object in the references array to manipulate it however you see fit.
@@ -222,4 +238,6 @@ The `includeReferenceSourceData` parameter tells the search engine to provide gr
222
238
223
239
+[Agentic retrieval in Azure AI Search](search-agentic-retrieval-concept.md)
224
240
241
+
+[Agentic RAG: build a reasoning retrieval engine with Azure AI Search](https://www.youtube.com/watch?v=PeTmOidqHM8)