pamelafox
Collaborator

Purpose

Fixes #1825
Updates to the latest Azure AI Search package and specifies the embedding model name and dimensions.

Does this introduce a breaking change?

When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
If you're not sure, try it out on an old environment.

[ ] Yes
[X] No

Does this require changes to learn.microsoft.com docs?

This repository is referenced by this tutorial
which includes deployment, settings and usage instructions. If text or screenshots need to change in the tutorial,
check the box below and notify the tutorial author. A Microsoft employee can do this for you if you're an external contributor.

[ ] Yes
[X] No

Type of change

[X] Bugfix
[ ] Feature
[ ] Code style update (formatting, local variables)
[ ] Refactoring (no functional changes, no api changes)
[ ] Documentation content changes
[ ] Other... Please describe:

Code quality checklist

See CONTRIBUTING.md for more details.

  • The current tests all pass (python -m pytest).
  • I added tests that prove my fix is effective or that my feature works
  • I ran python -m pytest --cov to verify 100% coverage of added lines
  • I ran python -m mypy to check for type errors
  • I either used the pre-commit hooks or ran ruff and black manually on my code.

if self.embeddings is None:
raise ValueError("Expecting Azure Open AI instance")

await search_manager.create_index(
Collaborator Author

Now we always make a vectorizer, even when not using integrated vectorization.

else SearchField(
name="id",

if self.search_info.index_name not in [name async for name in search_index_client.list_index_names()]:
Collaborator Author

This code change looks larger than it is, because I moved all the index creation logic inside the if statement. Before, we would create the index object even if we didn't use it, which seemed an odd thing to do!
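
The restructuring described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `FakeIndexClient`, `build_index_definition`, and `ensure_index` are hypothetical names, and a plain dictionary stands in for the real index object. The point is only that the definition is built inside the "index is missing" branch.

```python
import asyncio


class FakeIndexClient:
    """Hypothetical in-memory stand-in for a search index client."""

    def __init__(self, existing):
        self.indexes = {name: {} for name in existing}

    async def list_index_names(self):
        # Async generator, so callers can iterate with `async for`.
        for name in self.indexes:
            yield name

    async def create_index(self, name, definition):
        self.indexes[name] = definition


def build_index_definition(name, embedding_dimensions):
    # Hypothetical helper: the only place the definition is constructed.
    return {"name": name, "embedding_dimensions": embedding_dimensions}


async def ensure_index(client, name, embedding_dimensions):
    """Create the index only if it is missing; return True if it was created."""
    existing = [n async for n in client.list_index_names()]
    if name not in existing:
        # The definition is built here, so no unused object is created
        # when the index already exists.
        definition = build_index_definition(name, embedding_dimensions)
        await client.create_index(name, definition)
        return True
    return False
```

With this shape, calling `ensure_index` on an index that already exists does no construction work at all.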

sortable=False,
facetable=False,
-vector_search_dimensions=1024,
+vector_search_dimensions=self.embedding_dimensions,
Collaborator Author

Git is not diffing this field correctly; the imageEmbedding field is still 1024.

prioritized_fields=SemanticPrioritizedFields(
title_field=None, content_fields=[SemanticField(field_name="content")]
vectorizers = []
if self.embeddings and isinstance(self.embeddings, AzureOpenAIEmbeddingService):
Collaborator Author

This is new code that adds the vectorizer.
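
As a rough sketch of the conditional in the diff above: a vectorizer is appended only when the embedding service is an Azure OpenAI one. Everything here is a stand-in so the example runs without the SDK. `AzureOpenAIEmbeddings` and `OtherEmbeddings` are hypothetical classes, the vectorizer is a plain dictionary rather than the SDK's model type, and the field names and `-vectorizer` naming are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class AzureOpenAIEmbeddings:
    """Hypothetical stand-in for the PR's AzureOpenAIEmbeddingService."""

    endpoint: str
    deployment: str
    model_name: str


@dataclass
class OtherEmbeddings:
    """Hypothetical embedding service that is not Azure OpenAI."""

    model_name: str


def build_vectorizers(index_name, embeddings):
    """Return a list with one vectorizer for Azure OpenAI, else an empty list."""
    vectorizers = []
    # isinstance() is False for None, so a missing service yields no vectorizer.
    if isinstance(embeddings, AzureOpenAIEmbeddings):
        vectorizers.append(
            {
                "name": f"{index_name}-vectorizer",  # illustrative naming
                "resource_url": embeddings.endpoint,
                "deployment_name": embeddings.deployment,
                "model_name": embeddings.model_name,
            }
        )
    return vectorizers
```

Returning an empty list for other services keeps the index definition valid when no Azure OpenAI vectorizer applies.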

existing_index.vector_search.vectorizers is None
or len(existing_index.vector_search.vectorizers) == 0
):
if self.embeddings is not None and isinstance(self.embeddings, AzureOpenAIEmbeddingService):
Collaborator Author

This is new code that adds a vectorizer if one doesn't exist yet, to make it easy for folks to upgrade.
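
The upgrade path can be sketched along these lines, assuming simplified stand-in types. `Index`, `VectorSearch`, and `add_vectorizer_if_missing` are hypothetical names and are not the SDK's classes. The check mirrors the `is None` or `len(...) == 0` test in the diff, and the return value tells the caller whether the index needs to be saved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VectorSearch:
    """Hypothetical stand-in for an index's vector search configuration."""

    vectorizers: Optional[list] = None


@dataclass
class Index:
    """Hypothetical stand-in for an existing search index."""

    name: str
    vector_search: Optional[VectorSearch] = None


def add_vectorizer_if_missing(index, vectorizer):
    """Attach the vectorizer only when none exists; return True if changed."""
    if index.vector_search is None:
        # Assumption for this sketch: indexes without vector search are left alone.
        return False
    # `not` covers both None and an empty list.
    if not index.vector_search.vectorizers:
        index.vector_search.vectorizers = [vectorizer]
        return True
    return False
```

Because the function does nothing once a vectorizer is present, running it on every startup is safe for indexes that were already upgraded.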

azure-ai-documentintelligence
azure-cognitiveservices-speech
-azure-search-documents==11.6.0b1
+azure-search-documents==11.6.0b6
Collaborator Author

I didn't get any errors about code changes on the .search() side so I'm assuming that's largely unchanged in the latest versions?

Collaborator

correct

filterable=False,
sortable=False,
facetable=False,
vector_search_dimensions=1024,
Collaborator

it's down here

@pamelafox pamelafox merged commit 31f501a into Azure-Samples:main Oct 17, 2024
13 checks passed
@pamelafox pamelafox deleted the intvectfix branch October 17, 2024 23:28
Successfully merging this pull request may close these issues.

Embeddings vector dimensions mismatch indexer error
2 participants