Commit 4f41bdc

committed
checkpoint 3
1 parent 2d1e006 commit 4f41bdc

3 files changed

+42
-12
lines changed

articles/search/cognitive-search-concept-intro.md

Lines changed: 3 additions & 3 deletions
@@ -19,17 +19,17 @@ In Azure AI Search, *AI enrichment* refers to integration with [Azure AI service
1919

2020
Because Azure AI Search is a text and vector search solution, the purpose of AI enrichment is to improve the utility of your content in search-related scenarios. Source content must be textual (you can't enrich vectors), but the content created by an enrichment pipeline can be vectorized and indexed in a vector store using skills like [Text Split skill](cognitive-search-skill-textsplit.md) for chunking and [AzureOpenAiEmbedding skill](cognitive-search-skill-azure-openai-embedding.md) for encoding.
2121

22-
AI enrichment is based on *skills*.
22+
AI enrichment is based on [*skills*](cognitive-search-working-with-skillsets.md).
2323

24-
Built-in skills that tap Azure AI services apply the following transformation and processing to raw content:
24+
Built-in skills tap Azure AI services. They apply the following transformations and processing to raw content:
2525

2626
+ Translation and language detection for multi-lingual search
2727
+ Entity recognition to extract people names, places, and other entities from large chunks of text
2828
+ Key phrase extraction to identify and output important terms
2929
+ Optical Character Recognition (OCR) to recognize printed and handwritten text in binary files
3030
+ Image analysis to describe image content, and output the descriptions as searchable text fields
3131

32-
Custom skills running your external code can be used for transformations and processing that you want to include in the pipeline.
32+
Custom skills run your external code. Custom skills can be used for any custom processing that you want to include in the pipeline.
3333

3434
AI enrichment is an extension of an [**indexer pipeline**](search-indexer-overview.md) that connects to Azure data sources. An enrichment pipeline has all of the components of an indexer pipeline (indexer, data source, index), plus a [**skillset**](cognitive-search-working-with-skillsets.md) that specifies atomic enrichment steps.
3535

articles/search/search-what-is-an-index.md

Lines changed: 9 additions & 2 deletions
@@ -1,5 +1,5 @@
11
---
2-
title: Index overview
2+
title: Search index overview
33
titleSuffix: Azure AI Search
44
description: Explains what is a search index in Azure AI Search and describes content, construction, physical expression, and the index schema.
55

@@ -14,7 +14,7 @@ ms.topic: conceptual
1414
ms.date: 01/19/2024
1515
---
1616

17-
# Indexes in Azure AI Search
17+
# Search indexes in Azure AI Search
1818

1919
In Azure AI Search, a *search index* is your searchable content, available to the search engine for indexing, full text search, vector search, hybrid search, and filtered queries. An index is defined by a schema and saved to the search service, with data import following as a second step. This content exists within your search service, apart from your primary data stores, which is necessary for the millisecond response times expected in modern search applications. Except for indexer-driven indexing scenarios, the search service never connects to or queries your source data.
2020

@@ -170,6 +170,13 @@ All indexing and query requests target an index. Endpoints are usually one of th
170170
| `<your-service>.search.windows.net/indexes` | Targets the indexes collection. Used when creating, listing, or deleting an index. Admin rights are required for these operations, available through admin [API keys](search-security-api-keys.md) or a [Search Contributor role](search-security-rbac.md#built-in-roles-used-in-search). |
171171
| `<your-service>.search.windows.net/indexes/<your-index>/docs` | Targets the documents collection of a single index. Used when querying an index or data refresh. For queries, read rights are sufficient, and available through query API keys or a data reader role. For data refresh, admin rights are required. |
172172

173+
Search subscribers, or the person who created the search service, can manage the search service in the Azure portal. An Azure subscription requires Contributor or above permissions to create or delete services. You can [sign in to the Azure portal](https://portal.azure.com) for a direct connection to your search service.
174+
175+
For other clients, we recommend reviewing the quickstarts for connection steps:
176+
177+
+ [Quickstart: REST](search-get-started-rest.md)
178+
+ [Quickstart: Azure SDKs](search-get-started-text.md)
179+
173180
## Next steps
174181

175182
You can get hands-on experience creating an index using almost any sample or walkthrough for Azure AI Search. For starters, you could choose any of the quickstarts from the table of contents.
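
Note on the endpoint table in the hunk above: the two URL shapes it lists can be sketched in Python as follows. This is a minimal illustration, not part of the commit. The helper names `indexes_url` and `docs_url` are hypothetical, and the `api-version` value is an assumption (REST calls to the service carry an `api-version` query parameter; substitute the version your service supports). No request is sent; the functions only build the URLs.

```python
from urllib.parse import quote, urlencode

# Assumption: replace with whichever REST API version your service supports.
API_VERSION = "2023-11-01"


def indexes_url(service: str, api_version: str = API_VERSION) -> str:
    """Build the URL of the indexes collection.

    Per the table, this endpoint is used to create, list, or delete
    indexes, and those operations require admin rights.
    """
    query = urlencode({"api-version": api_version})
    return f"https://{service}.search.windows.net/indexes?{query}"


def docs_url(service: str, index: str, api_version: str = API_VERSION) -> str:
    """Build the URL of the documents collection of a single index.

    Per the table, this endpoint is used for queries (read rights are
    sufficient) and for data refresh (admin rights are required).
    """
    query = urlencode({"api-version": api_version})
    # Percent-encode the index name so it is safe as a path segment.
    return (
        f"https://{service}.search.windows.net"
        f"/indexes/{quote(index, safe='')}/docs?{query}"
    )


if __name__ == "__main__":
    # "contoso" and "hotels" are placeholder names for illustration only.
    print(indexes_url("contoso"))
    print(docs_url("contoso", "hotels"))
```

A client would attach credentials separately (an API key or a role-based token, as the table describes) before sending a request to either URL.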

articles/search/vector-store.md

Lines changed: 30 additions & 7 deletions
@@ -95,9 +95,9 @@ In the following example, for each search document, there's one chunk ID, chunk,
9595

9696
### Schema for RAG and chat-style apps
9797

98-
If you're designing storage for generative search, you can create separate indexes for the static content that you indexed and vectorized, and a second index for converations that can be used in prompt flows. The following indexes are created from the [**chat-with-your-data-solution-accelerator**](https://github.com/Azure-Samples/azure-search-openai-solution-accelerator) accelerator.
98+
If you're designing storage for generative search, you can create separate indexes for the static content that you indexed and vectorized, and a second index for conversations that can be used in prompt flows. The following indexes are created from the [**chat-with-your-data-solution-accelerator**](https://github.com/Azure-Samples/azure-search-openai-solution-accelerator) accelerator.
9999

100-
:::image type="content" source="media/vector-search-overview/accelerator-indexes.png" alt-text="Screenshot of the indexes created by the acceletor.":::
100+
:::image type="content" source="media/vector-search-overview/accelerator-indexes.png" alt-text="Screenshot of the indexes created by the accelerator.":::
101101

102102
Fields from the chat index that support generative search experience:
103103

@@ -117,25 +117,48 @@ Fields from the chat index that support generative search experience:
117117

118118
Here's a screenshot showing [Search explorer](search-explorer.md) search results for the conversations index. The search score is 1.00 because the search was unqualified. Notice the fields that exist to support orchestration and prompt flows. A conversation ID identifies a specific chat. `"type"` indicates whether the content is from the user or the assistant. Dates are used to age out chats from the history.
119119

120-
:::image type="content" source="media/vetor-search-overview/vector-schema-search-results.png" alt-text="Screenshot of Search Explorer with results from an index designed for RAG apps.":::
120+
:::image type="content" source="media/vector-search-overview/vector-schema-search-results.png" alt-text="Screenshot of Search Explorer with results from an index designed for RAG apps.":::
121121

122122
## Physical structure and size
123123

124+
In Azure AI Search, the physical structure of an index is largely an internal implementation. You can access its schema, load and query its content, monitor its size, and manage capacity, but the clusters themselves (indexes, [shards](search-capacity-planning.md#concepts-search-units-replicas-partitions-shards), and other files and folders) are managed internally by Microsoft.
125+
126+
The size and substance of an index is determined by:
127+
128+
+ Quantity and composition of your documents
129+
+ Attributes on individual fields
130+
+ Index configuration, including vector configuration that specifies how the internal navigation structures are created based on whether you choose HNSW or exhaustive KNN for similarity search.
131+
124132
Vector store index limits and estimations are covered in [another article](vector-search-index-size.md), but it's highlighted here to emphasize that maximum storage varies by service tier, and also by when the search service was created. Newer same-tier services have significantly more capacity for vector indexes.
125133

126134
+ [Check the deployment date of your search service](vector-search-index-size.md#how-to-determine-service-creation-date). If it was created before July 1, 2023, consider creating a new search service for greater capacity.
127135

128-
+ [Choose a scaleable tier](search-sku-tier.md) if you anticipate fluctuations in vector storage requirements. The Basic tier is fixed at one partition. Consider Standard 1 (S1) and above for more flexibility and faster performance.
136+
+ [Choose a scalable tier](search-sku-tier.md) if you anticipate fluctuations in vector storage requirements. The Basic tier is fixed at one partition. Consider Standard 1 (S1) and above for more flexibility and faster performance.
129137

130-
In terms of usage metrics, a vector index is an internal data structure created for each vector field. As such, a vector sotrage is always a fraction of the overall index size. Other nonvector fields and data structures consume the remainder of the quota for index size and consumed storage at the service level.
138+
In terms of usage metrics, a vector index is an internal data structure created for each vector field. As such, a vector storage is always a fraction of the overall index size. Other nonvector fields and data structures consume the remainder of the quota for index size and consumed storage at the service level.
131139

132140
## Basic operations and interaction
133141

142+
This section introduces vector run time operations, including connecting to and securing a single index.
143+
144+
> [!NOTE]
145+
> When managing an index, be aware that there is no portal or API support for moving or copying an index. Instead, customers typically point their application deployment solution at a different search service (if using the same index name), or revise the name to create a copy on the current search service, and then build it.
146+
147+
### Continuously available
148+
149+
An index is immediately available for queries as soon as the first document is indexed, but won't be fully operational until all documents are indexed. Internally, an index is [distributed across partitions and executes on replicas](search-capacity-planning.md#concepts-search-units-replicas-partitions-shards). The physical index is managed internally. The logical index is managed by you.
150+
151+
An index is continuously available, with no ability to pause or take it offline. Because it's designed for continuous operation, any updates to its content, or additions to the index itself, happen in real time. As a result, queries might temporarily return incomplete results if a request coincides with a document update.
152+
153+
Notice that query continuity exists for document operations (refreshing or deleting) and for modifications that don't affect the existing structure and integrity of the current index (such as adding new fields). If you need to make structural updates (changing existing fields), those are typically managed using a drop-and-rebuild workflow in a development environment, or by creating a new version of the index on production service.
154+
155+
To avoid an [index rebuild](search-howto-reindex.md), some customers who are making small changes choose to "version" a field by creating a new one that coexists alongside a previous version. Over time, this leads to orphaned content in the form of obsolete fields or obsolete custom analyzer definitions, especially in a production index that is expensive to replicate. You can address these issues on planned updates to the index as part of index lifecycle management.
156+
134157
### Secure access to vector data
135158

136159
<!-- Azure AI Search supports comprehensive security. Authentication and authorization -->
137160

138-
## Manage vector stores
161+
### Manage vector stores
139162

140163
Azure provides a monitoring platform that includes diagnostic logging and alerting.
141164

@@ -146,6 +169,6 @@ Azure provides a monitoring platform that includes diagnostic logging and alerti
146169

147170
## See also
148171

149-
+ [Create a vector store using REST APIs](search-get-started-vector.md)
172+
+ [Create a vector store using REST APIs (Quickstart)](search-get-started-vector.md)
150173
+ [Create a vector store](vector-search-how-to-create-index.md)
151174
+ [Query a vector store](vector-search-how-to-query.md)
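
Note on the "Continuously available" text added in this file: the field "versioning" workaround it describes (adding a new field that coexists with the old one, rather than rebuilding the index) might look roughly like the sketch below. This is an illustration under assumptions, not the documented procedure. The helper `add_versioned_field` is hypothetical, the schema is a plain dictionary shaped like an index definition (`name` plus a `fields` list), and the field and analyzer values are placeholders. Sending the updated definition to the service is out of scope here.

```python
import copy


def add_versioned_field(schema: dict, old_name: str, new_name: str, **overrides) -> dict:
    """Return a copy of `schema` with a new field derived from an existing one.

    The original field is left in place, so existing queries keep working.
    `overrides` replaces selected attributes (for example, the analyzer)
    on the new field only. The input schema is not modified.
    """
    fields = schema["fields"]
    if any(f["name"] == new_name for f in fields):
        raise ValueError(f"field {new_name!r} already exists")

    original = next((f for f in fields if f["name"] == old_name), None)
    if original is None:
        raise KeyError(old_name)

    # Copy the old definition, apply the changes, then set the new name.
    new_field = {**copy.deepcopy(original), **overrides, "name": new_name}

    updated = copy.deepcopy(schema)
    updated["fields"].append(new_field)
    return updated


if __name__ == "__main__":
    # Placeholder schema for illustration only.
    schema = {
        "name": "hotels",
        "fields": [
            {"name": "id", "type": "Edm.String", "key": True},
            {"name": "description", "type": "Edm.String",
             "searchable": True, "analyzer": "standard.lucene"},
        ],
    }
    updated = add_versioned_field(
        schema, "description", "description_v2", analyzer="en.microsoft"
    )
    print([f["name"] for f in updated["fields"]])
```

As the added text warns, the old field stays in the index as orphaned content until a planned rebuild removes it, so this approach trades a rebuild now for cleanup later.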
