Updates to integrated vectorization #2045
if self.embeddings is None:
    raise ValueError("Expecting Azure Open AI instance")

await search_manager.create_index(
Now we always make a vectorizer, even when not using integrated vectorization.
else SearchField(
    name="id",

if self.search_info.index_name not in [name async for name in search_index_client.list_index_names()]:
This code change looks larger than it is, because I moved all the index creation logic inside the if statement. Before, we would create the index object even if we didn't use it, which seemed an odd thing to do!
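A minimal sketch of that restructuring, for readers following along without the full diff. `FakeIndexClient`, `ensure_index`, and the dictionary-shaped index definition are hypothetical stand-ins, not the project's or the SDK's classes; only the `list_index_names()` / `create_index()` call shape is taken from the snippets in this thread.

```python
import asyncio


class FakeIndexClient:
    """Hypothetical in-memory stand-in for an async search index client."""

    def __init__(self, existing):
        self.indexes = list(existing)
        self.created = []

    async def list_index_names(self):
        # Async generator, so callers can use `async for` as in the diff.
        for name in self.indexes:
            yield name

    async def create_index(self, index):
        self.indexes.append(index["name"])
        self.created.append(index)


async def ensure_index(client, index_name):
    """Create the index only if it is missing; return True when created."""
    existing = [name async for name in client.list_index_names()]
    if index_name not in existing:
        # The definition is built inside the branch, so nothing is
        # constructed when the index already exists.
        index = {"name": index_name, "fields": ["id", "content", "embedding"]}
        await client.create_index(index)
        return True
    return False
```

With this shape, a second call for the same index name returns `False` and builds nothing.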
sortable=False,
facetable=False,
vector_search_dimensions=1024,
vector_search_dimensions=self.embedding_dimensions,
Git is not diffing this field correctly; the imageEmbedding field is still 1024.
prioritized_fields=SemanticPrioritizedFields(
    title_field=None, content_fields=[SemanticField(field_name="content")]
vectorizers = []
if self.embeddings and isinstance(self.embeddings, AzureOpenAIEmbeddingService):
This is new code, adding the vectorizer.
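As a rough illustration of the conditional shown above: the sketch below uses hypothetical dataclasses in place of the project's embedding service classes and a plain dictionary in place of the SDK's vectorizer model. The field names and the `"<index>-vectorizer"` naming scheme are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass
class AzureOpenAIEmbeddingService:
    """Hypothetical stand-in for the project's Azure OpenAI embedding service."""
    endpoint: str
    deployment: str
    model_name: str


@dataclass
class OpenAIEmbeddingService:
    """Hypothetical stand-in for a non-Azure embedding service."""
    model_name: str


def build_vectorizers(embeddings, index_name):
    """Return a one-element list for Azure OpenAI embeddings, else an empty list."""
    vectorizers = []
    if embeddings and isinstance(embeddings, AzureOpenAIEmbeddingService):
        vectorizers.append(
            {
                # Naming scheme is an assumption for this sketch.
                "name": f"{index_name}-vectorizer",
                "resource_url": embeddings.endpoint,
                "deployment_name": embeddings.deployment,
                "model_name": embeddings.model_name,
            }
        )
    return vectorizers
```

Any other embedding service, or none at all, leaves the list empty, so the index definition can take the result unconditionally.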
    existing_index.vector_search.vectorizers is None
    or len(existing_index.vector_search.vectorizers) == 0
):
    if self.embeddings is not None and isinstance(self.embeddings, AzureOpenAIEmbeddingService):
This is new code, adding a vectorizer if one doesn't exist yet, to make it easy for folks to upgrade.
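The upgrade check can be sketched roughly as follows. `Index`, `VectorSearch`, and `add_vectorizer_if_missing` are hypothetical stand-ins written for this illustration; only the "vectorizers is None or empty" condition comes from the snippet above. Passing `None` as the vectorizer models the case where no Azure OpenAI embedding service is configured.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VectorSearch:
    """Hypothetical stand-in for the index's vector search settings."""
    vectorizers: Optional[list] = None


@dataclass
class Index:
    """Hypothetical stand-in for an existing index definition."""
    name: str
    vector_search: VectorSearch = field(default_factory=VectorSearch)


def add_vectorizer_if_missing(existing_index, vectorizer):
    """Attach the vectorizer only when the index has none; return True if changed."""
    if vectorizer is None:
        # Nothing to add, e.g. no Azure OpenAI embedding service is configured.
        return False
    vector_search = existing_index.vector_search
    if vector_search.vectorizers is None or len(vector_search.vectorizers) == 0:
        vector_search.vectorizers = [vectorizer]
        return True
    # An existing vectorizer is left untouched.
    return False
```

The return value tells the caller whether the updated definition needs to be sent back to the search service.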
azure-ai-documentintelligence
azure-cognitiveservices-speech
azure-search-documents==11.6.0b1
azure-search-documents==11.6.0b6
I didn't get any errors about code changes on the .search() side so I'm assuming that's largely unchanged in the latest versions?
Correct.
filterable=False,
sortable=False,
facetable=False,
vector_search_dimensions=1024,
It's down here.
Purpose
Fixes #1825
Updates to the latest Azure AI Search package and specifies the embedding model name and dimensions.
Does this introduce a breaking change?
When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
If you're not sure, try it out on an old environment.
Does this require changes to learn.microsoft.com docs?
This repository is referenced by this tutorial
which includes deployment, settings and usage instructions. If text or screenshots need to change in the tutorial,
check the box below and notify the tutorial author. A Microsoft employee can do this for you if you're an external contributor.
Type of change
Code quality checklist
See CONTRIBUTING.md for more details.
python -m pytest
python -m pytest --cov to verify 100% coverage of added lines
python -m mypy to check for type errors
ruff and black manually on my code.