Upgrade to latest version of azure-search-documents and agentic retrieval API #2723
@pamelafox - where are you at with this? Do you need a hand with anything?
@taylorn-ai Tests just passed, and I just verified this is working with the multimodal feature, so this is ready for review! I'd love it if you want to review the code and/or check out the branch to see if it works for you.
Pull Request Overview
This pull request upgrades the Azure Search Documents SDK to version 11.7.0b1 and refactors the agentic retrieval integration to use the latest API. The new API includes reference source data directly, eliminating the need for explicit hydration, and removes runtime customization of max subqueries.
Key changes include:
- Upgraded azure-search-documents to 11.7.0b1 for latest agentic retrieval API support
- Replaced legacy agentic retrieval classes with new API types and simplified reference handling
- Removed max_subqueries parameter and hydration-related code as these are no longer supported
Reviewed Changes
Copilot reviewed 30 out of 30 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
app/backend/requirements.in | Updated azure-search-documents version to 11.7.0b1 |
app/backend/requirements.txt | Updated dependencies with new azure-search-documents version |
app/backend/approaches/approach.py | Replaced legacy agentic retrieval types with new API classes and removed hydration logic |
app/backend/approaches/chatreadretrieveread.py | Removed hydrate_references parameter and max_docs_for_reranker calculations |
app/backend/approaches/retrievethenread.py | Removed hydrate_references parameter and max_docs_for_reranker calculations |
app/backend/prepdocslib/searchmanager.py | Updated agent creation to use SearchIndexKnowledgeSource and new reference types |
app/backend/app.py | Removed ENABLE_AGENTIC_RETRIEVAL_SOURCE_DATA environment variable usage |
app/frontend/src/api/models.ts | Removed max_subqueries from ChatAppRequestOverrides type |
app/frontend/src/pages/chat/Chat.tsx | Removed max subqueries UI setting and state management |
app/frontend/src/pages/ask/Ask.tsx | Removed max subqueries UI setting and state management |
app/frontend/src/components/Settings/Settings.tsx | Removed max subqueries input field from developer settings |
app/frontend/src/components/AnalysisPanel/AgentPlan.tsx | Updated activity record type names and property access |
app/frontend/src/locales/*/translation.json | Removed max subqueries translations from all language files |
infra/main.bicep | Removed enableAgenticRetrievalSourceData parameter and updated agent name suffix |
infra/main.parameters.json | Removed ENABLE_AGENTIC_RETRIEVAL_SOURCE_DATA parameter mapping |
docs/*.md | Updated documentation to remove compatibility warnings and max subqueries references |
evals/*.json | Removed max_subqueries from evaluation configuration files |
tests/ | Updated test mocks and removed hydration-related test cases |
Comments suppressed due to low confidence (1)
app/frontend/src/components/AnalysisPanel/AgentPlan.tsx:1
- The AzureSearchQueryStep type includes a query_time field that doesn't appear to be used anywhere in the component. Consider removing this unused field or documenting why it's included if it's intended for future use.
import React from "react";
@pamelafox - looks good to me, nice work :) I did notice however, that many of the translation files are missing some keys. I noticed this only because you removed the
|
@taylorn-ai Oo thanks! I did not know about i18n-check, that sounds like a new CI check that we need.
The one issue I have with it is that it uses Also, I used i18n-auto-translation to translate, and you can even use Azure AI Translation with it, just thought I would mention that too :) |
@taylorn-ai Yep, you're right, it does have a bunch of dependency warnings. I've added it to the CI using npx so that it doesn't have to go in the package.json at all. I generated the translations with GPT-5 in Copilot, which does a decent job usually, but I'll ping some human i18n reviewers too.
Actually, something I just noticed: instead of hard-coding the field names, maybe they should be fetched dynamically? e.g.

    client = SearchIndexClient(endpoint=endpoint, credential=DefaultAzureCredential())
    index = client.get_index(index_name)
    field_names = [f.name for f in index.fields if f.searchable]
    ...
    source_data_select = ",".join(field_names)
    ...

Or, perhaps better...

    from dataclasses import fields
    from approaches.approach import Document
    skip_fields = {"score", "reranker_score", "search_agent_query"}
    search_fields = [f.name for f in fields(Document) if f.name not in skip_fields]
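The second suggestion above can be sketched as a small, self-contained example. The `Document` class here is a hypothetical, trimmed-down stand-in for the one in `approaches/approach.py` (its real field list is longer and may differ), and `selectable_field_names` is an illustrative helper name, not something from the repo.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Document:
    """Hypothetical stand-in for approaches.approach.Document (fields are illustrative)."""

    id: Optional[str] = None
    content: Optional[str] = None
    sourcepage: Optional[str] = None
    score: Optional[float] = None
    reranker_score: Optional[float] = None
    search_agent_query: Optional[str] = None


# Fields filled in from search metadata rather than stored in the index,
# so they should not be requested as source data.
SKIP_FIELDS = frozenset({"score", "reranker_score", "search_agent_query"})


def selectable_field_names(doc_type: type, skip: frozenset = SKIP_FIELDS) -> list:
    """Return the dataclass field names in declaration order, minus the skipped ones."""
    return [f.name for f in fields(doc_type) if f.name not in skip]


# Comma-separated list, matching the shape used in the snippet above.
source_data_select = ",".join(selectable_field_names(Document))
```

Deriving the list from the dataclass keeps the selected fields in step with whatever the app actually deserializes, at the cost of assuming every remaining dataclass field exists in the index.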
@pamelafox - sorry for the spam, but I did actually notice an issue, not specifically related to this PR, but it made me remember. It seems at some point, |
Co-authored-by: Gwen Peña-Siguenza <[email protected]> Co-authored-by: Wassim Chegham <[email protected]> Co-authored-by: Anthony Shaw <[email protected]>
@taylorn-ai Hm, I just printed out the values in approach.py from AI Search (non-agentic), and it shows the score for @search.reranker_score, but a null value from @search.rerankerScore |
Co-authored-by: Copilot <[email protected]>
@taylorn-ai The PR is merged, but do follow-up on rerankerScore if you still see an issue (here or with new issue) |
The documentation says that the field is called
However, after looking a bit further, it seems just the SDK returns it as |
@taylorn-ai I asked @mattgotteiner and he says that's due to the Python SDK explicitly snake_casing the API return values.
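To illustrate the naming difference discussed above, here is a minimal sketch of a camelCase to snake_case conversion. This is not the SDK's actual implementation, only an assumption-laden illustration of why a REST field named `@search.rerankerScore` would surface in Python as `@search.reranker_score`; `camel_to_snake` is a hypothetical helper.

```python
import re

# Zero-width boundary between a lowercase letter or digit and an uppercase letter.
_CAMEL_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def camel_to_snake(name: str) -> str:
    """Insert an underscore at each camelCase boundary, then lowercase the result."""
    return _CAMEL_BOUNDARY.sub("_", name).lower()


rest_name = "@search.rerankerScore"
python_name = camel_to_snake(rest_name)
```

Code that reads the score from SDK results should therefore look up the snake_case key, which matches the observation above that the camelCase key comes back null.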
Purpose
This pull request introduces a major refactor to the agentic retrieval integration, updating the codebase to use the latest Azure AI Search agentic retrieval API.
The new API can optionally include the reference source data (all the fields from each chunk), so we no longer need explicit hydration.
The new API does not support passing in max subqueries at query time, so I've removed that as a Developer Setting. That can only be customized in the search manager, at agent creation time.
This is the changelog for the package upgrade:
https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/search/azure-search-documents/CHANGELOG.md
And these are the API specs:
https://github.com/Azure/azure-rest-api-specs/blob/a71c94fb88b21af5c99442fd138b2570fc29622b/specification/search/data-plane/Azure.Search/preview/2025-08-01-preview/searchservice.json#L2701
Agentic retrieval API and data model updates:
- Replaced legacy agentic retrieval types (KnowledgeAgentAzureSearchDocReference, KnowledgeAgentIndexParams, and hydration logic) with new types (KnowledgeAgentSearchIndexReference, SearchIndexKnowledgeSourceParams, etc.) and simplified reference handling in approach.py. Removed unused hydration and reranker-related code.
- Updated searchmanager.py to use SearchIndexKnowledgeSource and KnowledgeSourceReference instead of KnowledgeAgentTargetIndex; it now explicitly selects source fields and reference options.
Parameter and code cleanup:
- Removed the hydrate_references, minimum_reranker_score, and max_docs_for_reranker parameters from constructors and method calls in approach.py, chatreadretrieveread.py, and retrievethenread.py.
Dependency updates:
- Upgraded the azure-search-documents package to version 11.7.0b1 in both requirements.in and requirements.txt to support the new agent API features.
Does this introduce a breaking change?
When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
If you're not sure, try it out on an old environment.
Does this require changes to learn.microsoft.com docs?
This repository is referenced by this tutorial
which includes deployment, settings and usage instructions. If text or screenshot need to change in the tutorial,
check the box below and notify the tutorial author. A Microsoft employee can do this for you if you're an external contributor.
Type of change
Code quality checklist
See CONTRIBUTING.md for more details.
- The current tests all pass (python -m pytest).
- I ran python -m pytest --cov to verify 100% coverage of added lines
- I ran python -m mypy to check for type errors
- I ran ruff and black manually on my code.