articles/search/cognitive-search-concept-intro.md (7 additions, 2 deletions)
ms.topic: conceptual
ms.date: 01/30/2024
---
# AI enrichment in Azure AI Search
In Azure AI Search, *AI enrichment* refers to integration with [Azure AI services](/azure/ai-services/what-are-ai-services) to process content that isn't searchable in its raw form. Through enrichment, analysis and inference are used to create searchable content and structure where none previously existed.
Because Azure AI Search is a text and vector search solution, the purpose of AI enrichment is to improve the utility of your content in search-related scenarios. Source content must be textual (you can't enrich vectors), but the content created by an enrichment pipeline can be vectorized and indexed in a vector store using skills like [Text Split skill](cognitive-search-skill-textsplit.md) for chunking and [AzureOpenAiEmbedding skill](cognitive-search-skill-azure-openai-embedding.md) for encoding.

AI enrichment is based on *skills*.

Built-in skills that tap Azure AI services apply the following transformations and processing to raw content:
+ Translation and language detection for multi-lingual search
+ Entity recognition to extract the names of people, places, and other entities from large chunks of text
+ Key phrase extraction to identify and output important terms
+ Optical Character Recognition (OCR) to recognize printed and handwritten text in binary files
+ Image analysis to describe image content and output the descriptions as searchable text fields
Custom skills running your external code can be used for transformations and processing that you want to include in the pipeline.
AI enrichment is an extension of an [**indexer pipeline**](search-indexer-overview.md) that connects to Azure data sources. An enrichment pipeline has all of the components of an indexer pipeline (indexer, data source, index), plus a [**skillset**](cognitive-search-working-with-skillsets.md) that specifies atomic enrichment steps.
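The indexer ties these pieces together, but the skillset is where enrichment steps are declared. As a minimal, illustrative sketch (the skillset name and field paths are assumptions, not taken from this article), a skillset that detects language and then extracts key phrases might look like this:

```json
{
  "name": "demo-enrichment-skillset",
  "description": "Detect language, then extract key phrases from the content field.",
  "skills": [
    {
      "@odata.type": "#Microsoft.Skills.Text.LanguageDetectionSkill",
      "context": "/document",
      "inputs": [
        { "name": "text", "source": "/document/content" }
      ],
      "outputs": [
        { "name": "languageCode", "targetName": "language" }
      ]
    },
    {
      "@odata.type": "#Microsoft.Skills.Text.KeyPhraseExtractionSkill",
      "context": "/document",
      "inputs": [
        { "name": "text", "source": "/document/content" },
        { "name": "languageCode", "source": "/document/language" }
      ],
      "outputs": [
        { "name": "keyPhrases", "targetName": "keyphrases" }
      ]
    }
  ]
}
```

Each skill reads inputs from the enriched document tree and writes outputs back into it; the indexer's output field mappings then project enriched nodes into index fields.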
The following diagram shows the progression of AI enrichment:
articles/search/vector-search-filters.md (1 addition, 1 deletion; ms.date: 02/14/2024)
You can set [**vector filter modes on a vector query**](vector-search-how-to-query.md) to specify whether you want filtering before or after query execution.
Filters determine the scope of a vector query. Filters are set on and iterate over nonvector string and numeric fields attributed as `filterable` in the index, but the purpose of a filter determines *what* the vector query executes over: the entire searchable space, or the contents of a search result.
This article describes each filter mode and provides guidance on when to use each one.
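As a concrete sketch of the two modes (the field names and filter expression here are hypothetical, and the vector is truncated to three values for readability), a prefiltered vector query request body might look like this:

```json
{
  "vectorFilterMode": "preFilter",
  "filter": "category eq 'reference'",
  "vectorQueries": [
    {
      "kind": "vector",
      "vector": [ 0.012, -0.051, 0.44 ],
      "fields": "content_vector",
      "k": 5
    }
  ],
  "select": "id, content"
}
```

Changing `vectorFilterMode` to `postFilter` applies the same `filter` expression after the nearest-neighbor search, trimming the result set rather than the searchable space.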
articles/search/vector-store.md (39 additions, 20 deletions)
In Azure AI Search, there are two patterns for working with search results.
Your index schema should reflect your primary use case.
## Schema of a vector store
The following examples highlight the differences in field composition for solutions built for generative AI or classic search.
An index schema for a vector store requires a name, a key field (string), one or more vector fields, and a vector configuration. Nonvector fields are recommended for hybrid queries, or for returning verbatim human readable content that doesn't have to go through a language model. For instructions about vector configuration, see [Create a vector store](vector-search-how-to-create-index.md).
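For orientation (the linked article has the full instructions), a vector configuration pairs an algorithm definition with a named profile that vector fields reference. Here's a minimal sketch; the algorithm and profile names are hypothetical:

```json
"vectorSearch": {
    "algorithms": [
        { "name": "hnsw-algorithm", "kind": "hnsw" }
    ],
    "profiles": [
        { "name": "vector-profile-1", "algorithm": "hnsw-algorithm" }
    ]
}
```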
### Basic vector field configuration
A vector field, such as `"content_vector"` in the following example, is of type `Collection(Edm.Single)`. It must be searchable and retrievable. It can't be filterable, facetable, or sortable, and it can't have analyzers, normalizers, or synonym map assignments. It must have dimensions set to the number of embeddings generated by the embedding model. For instance, if you're using text-embedding-ada-002, it generates 1,536 embeddings. A vector search profile is specified in a separate vector search configuration and assigned to a vector field using a profile name.
```json
{
    "name": "content_vector",
    "type": "Collection(Edm.Single)",
    "searchable": true,
    "retrievable": true,
    "dimensions": 1536,
    "vectorSearchProfile": "vector-profile-1"
}
```
### Fields collection for basic vector workloads
Here's an example showing a vector field in context, with other fields in a collection.
The key field (required) is `"id"` in this example. The `"content"` field is the human readable equivalent of the `"content_vector"` field, although if you're using language models exclusively for response formulation, you can skip nonvector content fields. Metadata fields are useful for filters, especially if metadata includes origin information about the source document. You can't filter on a vector field directly, but you can set prefilter or postfilter modes to filter before or after vector query execution.
```json
"name": "example-basic-vector-idx",
"fields": [
    { "name": "id", "type": "Edm.String", "key": true, "filterable": true },
    { "name": "content", "type": "Edm.String", "searchable": true, "retrievable": true },
    { "name": "metadata", "type": "Edm.String", "filterable": true, "retrievable": true },
    {
        "name": "content_vector",
        "type": "Collection(Edm.Single)",
        "searchable": true,
        "retrievable": true,
        "dimensions": 1536,
        "vectorSearchProfile": "vector-profile-1"
    }
]
```
### Schema for RAG and chat-style apps
If you're designing storage for generative search, you can create one index for the static content that you indexed and vectorized, and a second index for conversations that can be used in prompt flows. The following indexes are created from the [**chat-with-your-data-solution-accelerator**](https://github.com/Azure-Samples/azure-search-openai-solution-accelerator).
:::image type="content" source="media/vector-search-overview/accelerator-indexes.png" alt-text="Screenshot of the indexes created by the accelerator.":::
Fields from the chat index that support generative search experience:
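The accelerator defines the exact schema; purely as an illustrative sketch (every field name here is an assumption inferred from the surrounding description, not copied from the accelerator), a conversations index might include fields like these:

```json
"name": "conversations",
"fields": [
    { "name": "id", "type": "Edm.String", "key": true },
    { "name": "conversation_id", "type": "Edm.String", "filterable": true },
    { "name": "content", "type": "Edm.String", "searchable": true },
    { "name": "type", "type": "Edm.String", "filterable": true },
    { "name": "created_at", "type": "Edm.DateTimeOffset", "filterable": true, "sortable": true }
]
```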
Here's a screenshot showing [Search explorer](search-explorer.md) search results for the conversations index. The search score is 1.00 because the search was unqualified. Notice the fields that exist to support orchestration and prompt flows. A conversation ID identifies a specific chat. `"type"` indicates whether the content is from the user or the assistant. Dates are used to age out chats from the history.
:::image type="content" source="media/vector-search-overview/vector-schema-search-results.png" alt-text="Screenshot of Search Explorer with results from an index designed for RAG apps.":::
## Physical structure and size
Vector store index limits and estimations are covered in [another article](vector-search-index-size.md), but two points are worth emphasizing here: maximum storage varies by service tier, and also by when the search service was created. Newer same-tier services have significantly more capacity for vector indexes.
+ [Check the deployment date of your search service](vector-search-index-size.md#how-to-determine-service-creation-date). If it was created before July 1, 2023, consider creating a new search service for greater capacity.
+ [Choose a scalable tier](search-sku-tier.md) if you anticipate fluctuations in vector storage requirements. The Basic tier is fixed at one partition. Consider Standard 1 (S1) and above for more flexibility and faster performance.
In terms of usage metrics, a vector index is an internal data structure created for each vector field. As such, vector storage is always a fraction of the overall index size. Other nonvector fields and data structures consume the remainder of the quota for index size and consumed storage at the service level.
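As a rough illustration of scale (an estimate under stated assumptions, not a quota calculation: `Collection(Edm.Single)` implies four bytes per dimension, and the index's navigation structures add overhead on top of the raw data):

```
1,000,000 vectors × 1,536 dimensions × 4 bytes/dimension ≈ 6.1 GB of raw vector data
```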
## Basic operations and interaction
### Secure access to vector data
<!-- Azure AI Search supports comprehensive security. Authentication and authorization -->
## Manage vector stores
Azure provides a monitoring platform that includes diagnostic logging and alerting.
+ Enable logging
+ Set up alerts
## See also
+ [Azure Cognitive Search and LangChain: A Seamless Integration for Enhanced Vector Search Capabilities](https://techcommunity.microsoft.com/t5/azure-ai-services-blog/azure-cognitive-search-and-langchain-a-seamless-integration-for/ba-p/3901448)
+ [Create a vector store using REST APIs](search-get-started-vector.md)
+ [Create a vector store](vector-search-how-to-create-index.md)
+ [Query a vector store](vector-search-how-to-query.md)