Commit 842183d

Jill Grant authored
Merge pull request #266276 from HeidiSteen/heidist-docs
[azure search] Vector store doc updates
2 parents 1e0a7ea + f9f6ba4 commit 842183d

12 files changed: 186 additions, 41 deletions

articles/search/TOC.yml

Lines changed: 3 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -127,7 +127,7 @@
127127
href: samples-rest.md
128128
- name: Concepts
129129
items:
130-
- name: Storage (indexes)
130+
- name: Storage
131131
items:
132132
- name: Search index
133133
href: search-what-is-an-index.md
@@ -137,7 +137,7 @@
137137
href: knowledge-store-concept-intro.md
138138
- name: Data import strategies
139139
href: search-what-is-data-import.md
140-
- name: Enrichment (skills)
140+
- name: Enrichment
141141
items:
142142
- name: Enrichment overview
143143
href: cognitive-search-concept-intro.md
@@ -147,7 +147,7 @@
147147
href: cognitive-search-working-with-skillsets.md
148148
- name: Integrated vectorization (preview)
149149
href: vector-search-integrated-vectorization.md
150-
- name: Retrieval (queries)
150+
- name: Retrieval
151151
items:
152152
- name: Full-text search
153153
href: search-lucene-query-architecture.md

articles/search/cognitive-search-concept-intro.md

Lines changed: 7 additions & 2 deletions
@@ -12,20 +12,25 @@ ms.custom:
1212
ms.topic: conceptual
1313
ms.date: 01/30/2024
1414
---
15+
1516
# AI enrichment in Azure AI Search
1617

1718
In Azure AI Search, *AI enrichment* refers to integration with [Azure AI services](/azure/ai-services/what-are-ai-services) to process content that isn't searchable in its raw form. Through enrichment, analysis and inference are used to create searchable content and structure where none previously existed.
1819

19-
Because Azure AI Search is a text and vector search solution, the purpose of AI enrichment is to improve the utility of your content in search-related scenarios. Source content must be textual (you can't enrich vectors), but the content created by an enrichment pipeline can be vectorized and indexed in a vector store using skills like [Text Split skill](cognitive-search-skill-textsplit.md) for chunking and [AzureOpenAiEmbedding skill](cognitive-search-skill-azure-openai-embedding.md) for encoding.
20+
Because Azure AI Search is a text and vector search solution, the purpose of AI enrichment is to improve the utility of your content in search-related scenarios. Source content must be textual (you can't enrich vectors), but the content created by an enrichment pipeline can be vectorized and indexed in a vector store using skills like [Text Split skill](cognitive-search-skill-textsplit.md) for chunking and [AzureOpenAiEmbedding skill](cognitive-search-skill-azure-openai-embedding.md) for encoding.
21+
22+
AI enrichment is based on [*skills*](cognitive-search-working-with-skillsets.md).
2023

21-
Built-in skills apply the following transformation and processing to raw content:
24+
Built-in skills tap Azure AI services. They apply the following transformations and processing to raw content:
2225

2326
+ Translation and language detection for multi-lingual search
2427
+ Entity recognition to extract people names, places, and other entities from large chunks of text
2528
+ Key phrase extraction to identify and output important terms
2629
+ Optical Character Recognition (OCR) to recognize printed and handwritten text in binary files
2730
+ Image analysis to describe image content, and output the descriptions as searchable text fields
2831

32+
Custom skills run your external code. Custom skills can be used for any custom processing that you want to include in the pipeline.
33+
2934
AI enrichment is an extension of an [**indexer pipeline**](search-indexer-overview.md) that connects to Azure data sources. An enrichment pipeline has all of the components of an indexer pipeline (indexer, data source, index), plus a [**skillset**](cognitive-search-working-with-skillsets.md) that specifies atomic enrichment steps.
3035

3136
The following diagram shows the progression of AI enrichment:

[Binary files changed, previews not rendered: 18.2 KB, 101 KB, 33.5 KB, 19 KB, 53.7 KB]

articles/search/search-what-is-an-index.md

Lines changed: 9 additions & 2 deletions
@@ -1,5 +1,5 @@
11
---
2-
title: Index overview
2+
title: Search index overview
33
titleSuffix: Azure AI Search
44
description: Explains what is a search index in Azure AI Search and describes content, construction, physical expression, and the index schema.
55

@@ -14,7 +14,7 @@ ms.topic: conceptual
1414
ms.date: 01/19/2024
1515
---
1616

17-
# Indexes in Azure AI Search
17+
# Search indexes in Azure AI Search
1818

1919
In Azure AI Search, a *search index* is your searchable content, available to the search engine for indexing, full text search, vector search, hybrid search, and filtered queries. An index is defined by a schema and saved to the search service, with data import following as a second step. This content exists within your search service, apart from your primary data stores, which is necessary for the millisecond response times expected in modern search applications. Except for indexer-driven indexing scenarios, the search service never connects to or queries your source data.
2020

@@ -170,6 +170,13 @@ All indexing and query requests target an index. Endpoints are usually one of th
170170
| `<your-service>.search.windows.net/indexes` | Targets the indexes collection. Used when creating, listing, or deleting an index. Admin rights are required for these operations, available through admin [API keys](search-security-api-keys.md) or a [Search Contributor role](search-security-rbac.md#built-in-roles-used-in-search). |
171171
| `<your-service>.search.windows.net/indexes/<your-index>/docs` | Targets the documents collection of a single index. Used when querying an index or data refresh. For queries, read rights are sufficient, and available through query API keys or a data reader role. For data refresh, admin rights are required. |
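
As a rough illustration of the two endpoint shapes in the table above, the following sketch builds both URLs from a service name and an index name. The helper names are hypothetical and exist only for this example. Real requests also need an `api-version` query parameter and an API key or token, which are omitted here.

```python
def indexes_collection_url(service_name: str) -> str:
    """URL of the indexes collection (create, list, or delete an index)."""
    return f"https://{service_name}.search.windows.net/indexes"


def docs_collection_url(service_name: str, index_name: str) -> str:
    """URL of the documents collection of one index (queries and data refresh)."""
    return f"{indexes_collection_url(service_name)}/{index_name}/docs"


if __name__ == "__main__":
    # "my-service" and "my-index" are placeholder names.
    print(indexes_collection_url("my-service"))
    print(docs_collection_url("my-service", "my-index"))
```
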
172172

173+
Search subscribers, or the person who created the search service, can manage the search service in the Azure portal. Creating or deleting a service requires Contributor or higher permissions on the Azure subscription. You can [sign in to the Azure portal](https://portal.azure.com) for a direct connection to your search service.
174+
175+
For other clients, we recommend reviewing the quickstarts for connection steps:
176+
177+
+ [Quickstart: REST](search-get-started-rest.md)
178+
+ [Quickstart: Azure SDKs](search-get-started-text.md)
179+
173180
## Next steps
174181

175182
You can get hands-on experience creating an index using almost any sample or walkthrough for Azure AI Search. For starters, you could choose any of the quickstarts from the table of contents.

articles/search/vector-search-filters.md

Lines changed: 8 additions & 6 deletions
@@ -9,12 +9,14 @@ ms.service: cognitive-search
99
ms.custom:
1010
- ignite-2023
1111
ms.topic: conceptual
12-
ms.date: 11/01/2023
12+
ms.date: 02/14/2024
1313
---
1414

1515
# Filters in vector queries
1616

17-
You can set a [vector filter modes on a vector query](vector-search-how-to-query.md) to specify whether you want filtering before or after query execution. Filters are set on and iterate over string and numeric fields attributed as `filterable` in the index, but the effects of filtering determine *what* the vector query executes over: the searchable space, or the documents in the search results.
17+
You can set a [**vector filter mode on a vector query**](vector-search-how-to-query.md) to specify whether you want filtering before or after query execution.
18+
19+
Filters determine the scope of a vector query. Filters are set on and iterate over nonvector string and numeric fields attributed as `filterable` in the index, but the purpose of a filter determines *what* the vector query executes over: the entire searchable space, or the contents of a search result.
1820

1921
This article describes each filter mode and provides guidance on when to use each one.
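
To make the two modes concrete, here is a minimal sketch of a query request body that sets a filter mode. It assumes the `vectorFilterMode` parameter, with values `preFilter` and `postFilter`, and the `vectorQueries` shape from REST API versions later than the 2023-07-01-preview syntax used for the benchmark index. The `content_vector` and `score` field names are borrowed from that index. The helper name and the sample filter expression are hypothetical.

```python
def build_vector_query(vector, filter_expr, mode="preFilter", k=10):
    """Build a request body for a filtered vector query (illustrative only)."""
    if mode not in ("preFilter", "postFilter"):
        raise ValueError(f"unknown filter mode: {mode}")
    return {
        "filter": filter_expr,       # OData expression over a filterable field
        "vectorFilterMode": mode,    # filter before or after query execution
        "vectorQueries": [
            {
                "kind": "vector",
                "vector": vector,             # the query embedding
                "fields": "content_vector",   # assumed vector field name
                "k": k,                       # number of nearest neighbors
            }
        ],
    }


if __name__ == "__main__":
    # A toy two-dimensional vector; real embeddings are much longer.
    print(build_vector_query([0.1, 0.2], "score gt 0.5", mode="postFilter"))
```
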
2022

@@ -40,21 +42,21 @@ To understand the conditions under which one filter mode performs better than th
4042

4143
For the small and medium workloads, we used a Standard 2 (S2) service with one partition and one replica. For the large workload, we used a Standard 3 (S3) service with 12 partitions and one replica.
4244

43-
Indexes had an identical construction: one key field, one vector field, one text field, and one numeric filterable field.
45+
Indexes had an identical construction: one key field, one vector field, one text field, and one numeric filterable field. The following index is defined using the 2023-07-01-preview syntax.
4446

4547
```python
4648
def get_index_schema(self, index_name, dimensions):
4749
return {
4850
"name": index_name,
4951
"fields": [
5052
{"name": "id", "type": "Edm.String", "key": True, "searchable": True},
51-
{"name": "myvector", "type": "Collection(Edm.Single)", "dimensions": dimensions,
53+
{"name": "content_vector", "type": "Collection(Edm.Single)", "dimensions": dimensions,
5254
"searchable": True, "retrievable": True, "filterable": False, "facetable": False, "sortable": False,
5355
"vectorSearchConfiguration": "defaulthnsw"},
5456
{"name": "text", "type": "Edm.String", "searchable": True, "filterable": False, "retrievable": True,
55-
"sortable": False, "facetable": False, "key": False},
57+
"sortable": False, "facetable": False},
5658
{"name": "score", "type": "Edm.Double", "searchable": False, "filterable": True,
57-
"retrievable": True, "sortable": True, "facetable": True, "key": False}
59+
"retrievable": True, "sortable": True, "facetable": True}
5860
],
5961
"vectorSearch":
6062
{

articles/search/vector-search-how-to-create-index.md

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -696,6 +696,22 @@ api-key: {{admin-api-key}}
696696

697697
---
698698

699+
## Update a vector store
700+
701+
To update a vector store, modify the schema and, if necessary, reload documents to populate new fields. APIs for schema updates include [Create or Update Index (REST)](/rest/api/searchservice/indexes/create-or-update), [CreateOrUpdateIndex](/dotnet/api/azure.search.documents.indexes.searchindexclient.createorupdateindexasync) in the Azure SDK for .NET, [create_or_update_index](/python/api/azure-search-documents/azure.search.documents.indexes.searchindexclient?view=azure-python#azure-search-documents-indexes-searchindexclient-create-or-update-index&preserve-view=true) in the Azure SDK for Python, and similar methods in other Azure SDKs.
702+
703+
The standard guidance for updating an index is covered in [Drop and rebuild an index](search-howto-reindex.md).
704+
705+
Key points include:
706+
707+
+ Drop and rebuild is often required for updates to and deletion of existing fields.
708+
709+
+ However, you can update an existing schema with the following modifications, with no rebuild required:
710+
711+
+ Add new fields to a fields collection.
712+
+ Add new vector configurations, assigned to new fields but not existing fields that have already been vectorized.
713+
+ Change "retrievable" (values are true or false) on an existing field. Vector fields must be searchable and retrievable, but if you want to disable access to a vector field in situations where drop and rebuild isn't feasible, you can set retrievable to false.
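
The key points above can be sketched as a small pre-flight check that compares the current field list with a proposed one and reports whether a drop and rebuild is likely needed. This is a simplified, hypothetical helper: `needs_rebuild` is not part of any SDK, and it models only the rules listed here. New fields and changes to `retrievable` are treated as safe in-place updates. Deleting a field, or changing any other attribute of an existing field, is conservatively flagged, even though the guidance says a rebuild is "often" rather than always required.

```python
def needs_rebuild(current_fields, proposed_fields):
    """Return reasons a rebuild is likely needed; an empty list means none found."""
    proposed = {f["name"]: f for f in proposed_fields}
    reasons = []
    for field in current_fields:
        name = field["name"]
        if name not in proposed:
            reasons.append(f"field '{name}' was deleted")
            continue
        # Compare every attribute except 'retrievable', which can change in place.
        old = {k: v for k, v in field.items() if k != "retrievable"}
        new = {k: v for k, v in proposed[name].items() if k != "retrievable"}
        if old != new:
            reasons.append(f"field '{name}' was modified")
    return reasons


if __name__ == "__main__":
    current = [
        {"name": "id", "type": "Edm.String", "key": True},
        {"name": "text", "type": "Edm.String", "retrievable": True},
    ]
    # Adding a field does not touch existing ones, so no reasons are reported.
    print(needs_rebuild(current, current + [{"name": "extra", "type": "Edm.String"}]))
```
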
714+
699715
## Next steps
700716

701717
As a next step, we recommend [Query vector data in a search index](vector-search-how-to-query.md).
