articles/search/cognitive-search-concept-intro.md
Lines changed: 3 additions & 3 deletions
Original file line number
Diff line number
Diff line change
@@ -19,17 +19,17 @@ In Azure AI Search, *AI enrichment* refers to integration with [Azure AI service
19
19
20
20
Because Azure AI Search is a text and vector search solution, the purpose of AI enrichment is to improve the utility of your content in search-related scenarios. Source content must be textual (you can't enrich vectors), but the content created by an enrichment pipeline can be vectorized and indexed in a vector store using skills like [Text Split skill](cognitive-search-skill-textsplit.md) for chunking and [AzureOpenAiEmbedding skill](cognitive-search-skill-azure-openai-embedding.md) for encoding.
21
21
22
-
AI enrichment is based on *skills*.
22
+
AI enrichment is based on [*skills*](cognitive-search-working-with-skillsets.md).
23
23
24
-
Built-in skills that tap Azure AI services apply the following transformation and processing to raw content:
24
+
Built-in skills tap Azure AI services. They apply the following transformations and processing to raw content:
25
25
26
26
+ Translation and language detection for multi-lingual search
27
27
+ Entity recognition to extract people names, places, and other entities from large chunks of text
28
28
+ Key phrase extraction to identify and output important terms
29
29
+ Optical Character Recognition (OCR) to recognize printed and handwritten text in binary files
30
30
+ Image analysis to describe image content, and output the descriptions as searchable text fields
31
31
32
-
Custom skills running your external code can be used for transformations and processing that you want to include in the pipeline.
32
+
Custom skills run your external code. Custom skills can be used for any custom processing that you want to include in the pipeline.
33
33
34
34
AI enrichment is an extension of an [**indexer pipeline**](search-indexer-overview.md) that connects to Azure data sources. An enrichment pipeline has all of the components of an indexer pipeline (indexer, data source, index), plus a [**skillset**](cognitive-search-working-with-skillsets.md) that specifies atomic enrichment steps.
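As a rough illustration of the chunk-then-encode pattern described above, the following sketch builds a skillset definition as a Python dictionary that could be serialized to JSON. The skillset name, resource URI, deployment ID, page length, and field paths are all hypothetical placeholders, and the property names reflect one reading of the skillset REST schema, so check them against the skill reference articles before use.

```python
import json

# Hypothetical placeholders: the skillset name, resource URI, deployment ID,
# and the "pages"/"vector" target names are illustrative, not from the article.
skillset = {
    "name": "chunk-and-embed",
    "description": "Split text into chunks, then embed each chunk",
    "skills": [
        {
            # Text Split skill: breaks document content into pages (chunks).
            "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
            "context": "/document",
            "textSplitMode": "pages",
            "maximumPageLength": 2000,  # assumed chunk size in characters
            "inputs": [{"name": "text", "source": "/document/content"}],
            "outputs": [{"name": "textItems", "targetName": "pages"}],
        },
        {
            # Embedding skill: runs once per chunk produced by the split skill.
            "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
            "context": "/document/pages/*",
            "resourceUri": "https://<your-openai-resource>.openai.azure.com",
            "deploymentId": "<your-embedding-deployment>",
            "inputs": [{"name": "text", "source": "/document/pages/*"}],
            "outputs": [{"name": "embedding", "targetName": "vector"}],
        },
    ],
}

# Serialize to the JSON body you would send when creating the skillset.
print(json.dumps(skillset, indent=2))
```

The second skill sets its context to `/document/pages/*` so that it iterates over each chunk rather than the whole document.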
articles/search/search-what-is-an-index.md
Lines changed: 9 additions & 2 deletions
@@ -1,5 +1,5 @@
1
1
---
2
-
title: Index overview
2
+
title: Search index overview
3
3
titleSuffix: Azure AI Search
4
4
description: Explains what is a search index in Azure AI Search and describes content, construction, physical expression, and the index schema.
5
5
@@ -14,7 +14,7 @@ ms.topic: conceptual
14
14
ms.date: 01/19/2024
15
15
---
16
16
17
-
# Indexes in Azure AI Search
17
+
# Search indexes in Azure AI Search
18
18
19
19
In Azure AI Search, a *search index* is your searchable content, available to the search engine for indexing, full text search, vector search, hybrid search, and filtered queries. An index is defined by a schema and saved to the search service, with data import following as a second step. This content exists within your search service, apart from your primary data stores, which is necessary for the millisecond response times expected in modern search applications. Except for indexer-driven indexing scenarios, the search service never connects to or queries your source data.
20
20
@@ -170,6 +170,13 @@ All indexing and query requests target an index. Endpoints are usually one of th
170
170
|`<your-service>.search.windows.net/indexes`| Targets the indexes collection. Used when creating, listing, or deleting an index. Admin rights are required for these operations, available through admin [API keys](search-security-api-keys.md) or a [Search Contributor role](search-security-rbac.md#built-in-roles-used-in-search). |
171
171
|`<your-service>.search.windows.net/indexes/<your-index>/docs`| Targets the documents collection of a single index. Used when querying an index or data refresh. For queries, read rights are sufficient, and available through query API keys or a data reader role. For data refresh, admin rights are required. |
172
172
173
+
Search subscribers, or the person who created the search service, can manage the search service in the Azure portal. An Azure subscription requires Contributor or above permissions to create or delete services. You can [sign in to the Azure portal](https://portal.azure.com) for a direct connection to your search service.
174
+
175
+
For other clients, we recommend reviewing the quickstarts for connection steps:
You can get hands-on experience creating an index using almost any sample or walkthrough for Azure AI Search. For starters, you could choose any of the quickstarts from the table of contents.
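The two endpoint shapes listed in the table above can be sketched as small URL builders. This is a minimal sketch only: the `https://` scheme, the `api-version` query parameter value, and the function names are assumptions added for illustration and don't appear in the article.

```python
from urllib.parse import quote, urlencode

# Assumption: a REST API version string; substitute the version you target.
API_VERSION = "2023-11-01"


def indexes_url(service: str, api_version: str = API_VERSION) -> str:
    """Indexes collection: create, list, or delete indexes (admin rights)."""
    query = urlencode({"api-version": api_version})
    return f"https://{service}.search.windows.net/indexes?{query}"


def docs_url(service: str, index: str, api_version: str = API_VERSION) -> str:
    """Documents collection of one index: queries and data refresh."""
    query = urlencode({"api-version": api_version})
    # Percent-encode the index name so it is safe as a single path segment.
    return (
        f"https://{service}.search.windows.net"
        f"/indexes/{quote(index, safe='')}/docs?{query}"
    )


print(indexes_url("demo"))
print(docs_url("demo", "hotels"))
```

A query request would target the second URL with a query API key or a data reader role, while index management requests target the first with admin rights.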
articles/search/vector-store.md
Lines changed: 30 additions & 7 deletions
@@ -95,9 +95,9 @@ In the following example, for each search document, there's one chunk ID, chunk,
95
95
96
96
### Schema for RAG and chat-style apps
97
97
98
-
If you're designing storage for generative search, you can create separate indexes for the static content that you indexed and vectorized, and a second index for converations that can be used in prompt flows. The following indexes are created from the [**chat-with-your-data-solution-accelerator**](https://github.com/Azure-Samples/azure-search-openai-solution-accelerator) accelerator.
98
+
If you're designing storage for generative search, you can create separate indexes for the static content that you indexed and vectorized, and a second index for conversations that can be used in prompt flows. The following indexes are created from the [**chat-with-your-data-solution-accelerator**](https://github.com/Azure-Samples/azure-search-openai-solution-accelerator) accelerator.
99
99
100
-
:::image type="content" source="media/vector-search-overview/accelerator-indexes.png" alt-text="Screenshot of the indexes created by the acceletor.":::
100
+
:::image type="content" source="media/vector-search-overview/accelerator-indexes.png" alt-text="Screenshot of the indexes created by the accelerator.":::
101
101
102
102
Fields from the chat index that support generative search experience:
103
103
@@ -117,25 +117,48 @@ Fields from the chat index that support generative search experience:
117
117
118
118
Here's a screenshot showing [Search explorer](search-explorer.md) search results for the conversations index. The search score is 1.00 because the search was unqualified. Notice the fields that exist to support orchestration and prompt flows. A conversation ID identifies a specific chat. `"type"` indicates whether the content is from the user or the assistant. Dates are used to age out chats from the history.
119
119
120
-
:::image type="content" source="media/vetor-search-overview/vector-schema-search-results.png" alt-text="Screenshot of Search Explorer with results from an index designed for RAG apps.":::
120
+
:::image type="content" source="media/vector-search-overview/vector-schema-search-results.png" alt-text="Screenshot of Search Explorer with results from an index designed for RAG apps.":::
121
121
122
122
## Physical structure and size
123
123
124
+
In Azure AI Search, the physical structure of an index is largely an internal implementation. You can access its schema, load and query its content, monitor its size, and manage capacity, but the clusters themselves (indexes, [shards](search-capacity-planning.md#concepts-search-units-replicas-partitions-shards), and other files and folders) are managed internally by Microsoft.
125
+
126
+
The size and substance of an index is determined by:
127
+
128
+
+ Quantity and composition of your documents
129
+
+ Attributes on individual fields
130
+
+ Index configuration, including vector configuration that specifies how the internal navigation structures are created based on whether you choose HNSW or exhaustive KNN for similarity search.
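To make the last bullet concrete, here's a sketch of what a vector configuration offering both algorithms might look like, written as a Python dictionary mirroring the `vectorSearch` section of an index definition. The configuration and profile names are hypothetical, and the parameter values are illustrative assumptions rather than recommendations.

```python
# Hypothetical names ("hnsw-config", "eknn-config", and the profiles) and
# illustrative parameter values; verify property names against the index
# REST reference for the API version you use.
vector_search = {
    "algorithms": [
        {
            # HNSW builds a navigable graph, which adds to index size.
            "name": "hnsw-config",
            "kind": "hnsw",
            "hnswParameters": {
                "m": 4,
                "efConstruction": 400,
                "efSearch": 500,
                "metric": "cosine",
            },
        },
        {
            # Exhaustive KNN scans all vectors and needs no graph structure.
            "name": "eknn-config",
            "kind": "exhaustiveKnn",
            "exhaustiveKnnParameters": {"metric": "cosine"},
        },
    ],
    "profiles": [
        {"name": "hnsw-profile", "algorithm": "hnsw-config"},
        {"name": "eknn-profile", "algorithm": "eknn-config"},
    ],
}

# Sanity check: every profile must reference a defined algorithm.
defined = {a["name"] for a in vector_search["algorithms"]}
missing = [p["name"] for p in vector_search["profiles"] if p["algorithm"] not in defined]
print("profiles with undefined algorithms:", missing)
```

A vector field would then reference one of the profiles by name, which determines the internal structures created for that field.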
131
+
124
132
Vector store index limits and estimations are covered in [another article](vector-search-index-size.md), but it's highlighted here to emphasize that maximum storage varies by service tier, and also by when the search service was created. Newer same-tier services have significantly more capacity for vector indexes.
125
133
126
134
+[Check the deployment date of your search service](vector-search-index-size.md#how-to-determine-service-creation-date). If it was created before July 1, 2023, consider creating a new search service for greater capacity.
127
135
128
-
+[Choose a scaleable tier](search-sku-tier.md) if you anticipate fluctuations in vector storage requirements. The Basic tier is fixed at one partition. Consider Standard 1 (S1) and above for more flexibility and faster performance.
136
+
+[Choose a scalable tier](search-sku-tier.md) if you anticipate fluctuations in vector storage requirements. The Basic tier is fixed at one partition. Consider Standard 1 (S1) and above for more flexibility and faster performance.
129
137
130
-
In terms of usage metrics, a vector index is an internal data structure created for each vector field. As such, a vector sotrage is always a fraction of the overall index size. Other nonvector fields and data structures consume the remainder of the quota for index size and consumed storage at the service level.
138
+
In terms of usage metrics, a vector index is an internal data structure created for each vector field. As such, a vector storage is always a fraction of the overall index size. Other nonvector fields and data structures consume the remainder of the quota for index size and consumed storage at the service level.
131
139
132
140
## Basic operations and interaction
133
141
142
+
This section introduces vector run time operations, including connecting to and securing a single index.
143
+
144
+
> [!NOTE]
145
+
> When managing an index, be aware that there is no portal or API support for moving or copying an index. Instead, customers typically point their application deployment solution at a different search service (if using the same index name), or revise the name to create a copy on the current search service, and then build it.
146
+
147
+
### Continuously available
148
+
149
+
An index is immediately available for queries as soon as the first document is indexed, but won't be fully operational until all documents are indexed. Internally, an index is [distributed across partitions and executes on replicas](search-capacity-planning.md#concepts-search-units-replicas-partitions-shards). The physical index is managed internally. The logical index is managed by you.
150
+
151
+
An index is continuously available, with no ability to pause or take it offline. Because it's designed for continuous operation, any updates to its content, or additions to the index itself, happen in real time. As a result, queries might temporarily return incomplete results if a request coincides with a document update.
152
+
153
+
Notice that query continuity exists for document operations (refreshing or deleting) and for modifications that don't affect the existing structure and integrity of the current index (such as adding new fields). If you need to make structural updates (changing existing fields), those are typically managed using a drop-and-rebuild workflow in a development environment, or by creating a new version of the index on production service.
154
+
155
+
To avoid an [index rebuild](search-howto-reindex.md), some customers who are making small changes choose to "version" a field by creating a new one that coexists alongside a previous version. Over time, this leads to orphaned content in the form of obsolete fields or obsolete custom analyzer definitions, especially in a production index that is expensive to replicate. You can address these issues on planned updates to the index as part of index lifecycle management.
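The field "versioning" approach described above can be sketched as a helper that copies an existing field definition under a new name with a few changed attributes. The helper name, the sample schema, and the `_v2` suffix are hypothetical and exist only to illustrate the idea; this operates on a local dictionary and doesn't call the service.

```python
import copy


def add_field_version(index_schema: dict, old_name: str, new_name: str, **changes) -> dict:
    """Return a copy of the schema with a new field derived from an existing one.

    The original field is left in place, which is what eventually produces
    the orphaned content mentioned above.
    """
    fields = index_schema["fields"]
    existing = next(f for f in fields if f["name"] == old_name)
    if any(f["name"] == new_name for f in fields):
        raise ValueError(f"field {new_name!r} already exists")
    new_field = {**copy.deepcopy(existing), **changes, "name": new_name}
    updated = copy.deepcopy(index_schema)
    updated["fields"].append(new_field)
    return updated


# Hypothetical sample schema used only for this illustration.
schema = {
    "name": "hotels",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        {"name": "description", "type": "Edm.String",
         "searchable": True, "analyzer": "standard.lucene"},
    ],
}

v2 = add_field_version(schema, "description", "description_v2", analyzer="en.microsoft")
print([f["name"] for f in v2["fields"]])
```

Because the old `description` field remains in the schema, a later planned rebuild is still needed to remove it.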
156
+
134
157
### Secure access to vector data
135
158
136
159
<!-- Azure AI Search supports comprehensive security. Authentication and authorization -->
137
160
138
-
## Manage vector stores
161
+
### Manage vector stores
139
162
140
163
Azure provides a monitoring platform that includes diagnostic logging and alerting.
141
164
@@ -146,6 +169,6 @@ Azure provides a monitoring platform that includes diagnostic logging and alerti
146
169
147
170
## See also
148
171
149
-
+[Create a vector store using REST APIs](search-get-started-vector.md)
172
+
+[Create a vector store using REST APIs (Quickstart)](search-get-started-vector.md)
150
173
+[Create a vector store](vector-search-how-to-create-index.md)
151
174
+[Query a vector store](vector-search-how-to-query.md)