articles/search/tutorial-rag-build-solution-index-schema.md (34 additions & 15 deletions)
@@ -27,47 +27,59 @@ In this tutorial, you:
## Prerequisites
- [Visual Studio Code](https://code.visualstudio.com/download) with the [Python extension](https://marketplace.visualstudio.com/items?itemName=ms-python.python) and the [Jupyter package](https://pypi.org/project/jupyter/). For more information, see [Python in Visual Studio Code](https://code.visualstudio.com/docs/languages/python).
The output of this exercise is an index definition in JSON. At this point, it's not uploaded to Azure AI Search, so there are no requirements for cloud services or permissions in this exercise.
## Review schema considerations for RAG
In conversational search, LLMs compose the response that the user sees, not the search engine, so you don't need to think about what fields to show in your search results, or whether the representations of individual search documents are coherent to the user. Depending on the question, the LLM might return verbatim content from your index or, more likely, repackage the content for a better answer.
### Focus on chunks
To generate a response, LLMs operate on chunks of content, and while they need to know where each chunk came from for citation purposes, what matters most is the quality of the message inputs and their relevance to the user's question. Whether the chunks come from one document or a thousand, the LLM ingests the information, or *grounding data*, and formulates the response using instructions provided in a system prompt.
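To make that flow concrete, here's a minimal sketch of passing grounding data to a chat model through a system prompt, using the `openai` Python package. The endpoint, key, deployment name, and chunk strings are placeholders, not values from this tutorial.

```python
# A minimal grounding sketch, assuming an Azure OpenAI chat deployment.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-openai-resource>.openai.azure.com",
    api_key="<your-api-key>",
    api_version="2024-02-01",
)

# Chunks returned by the search engine become the grounding data.
chunks = ["<chunk 1 text>", "<chunk 2 text>"]
grounding_data = "\n\n".join(chunks)

response = client.chat.completions.create(
    model="<your-chat-deployment>",  # deployment name, not the model family name
    messages=[
        # The system prompt carries the instructions and the grounding data.
        {
            "role": "system",
            "content": f"Answer the question using only these sources:\n{grounding_data}",
        },
        {"role": "user", "content": "Which regions are brightest at night?"},
    ],
)
print(response.choices[0].message.content)
```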
Chunks are the focus of the schema, and each chunk is the definitive element of a search document in a RAG pattern. You can think of your index as a large collection of chunks, as opposed to traditional search documents, which have more structure and contain uniform content in distinct fields such as a name, a description, and a category.
A minimal index for an LLM is designed to store chunks of content. It includes vector fields if you want similarity search for highly relevant results, and nonvector fields for human-readable inputs to the LLM for conversational search. Nonvector chunked content in the search results becomes the grounding data sent to the LLM.
### Checklist of schema considerations
An index that works best for RAG workloads has these qualities:
- Returns chunks that are relevant to the query and readable to the LLM. LLMs can handle a certain level of dirty data in chunks, such as markup, redundancy, and incomplete strings. While chunks need to be readable and relevant to the query, they don't need to be pristine.
- Maintains a parent-child relationship between chunks of a document and the properties of the parent document, such as the file name, file type, title, and author. To answer a query, chunks could be pulled from anywhere in the index. Association with the parent document that provides the chunk is useful for context, citations, and follow-up queries.
- Accommodates the queries you want to create. You should have fields for vector and hybrid content, and those fields should be attributed to support specific query behaviors. You can only query one index at a time (no joins), so your fields collection should define all of your searchable content.
- Is flat (no complex types or structures). This requirement is specific to the RAG pattern in Azure AI Search.
Although Azure AI Search can't join indexes, you can create indexes that preserve parent-child relationships, and then use sequential or parallel queries in your search logic to pull from both. This exercise includes templates for parent-child elements in the same index and in separate indexes, where information from the parent index is retrieved using a lookup query.
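As a hedged sketch of the sequential-query approach, the following Python uses two `SearchClient` instances: one queries a hypothetical chunks index, and the other looks up the parent document by ID. The endpoint, index names, and field names are illustrative assumptions, not this exercise's templates.

```python
# Sequential queries across a chunk index and a parent index (illustrative).
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

endpoint = "https://<your-search-service>.search.windows.net"
credential = AzureKeyCredential("<your-api-key>")

chunks_client = SearchClient(endpoint, index_name="chunks", credential=credential)
parents_client = SearchClient(endpoint, index_name="parents", credential=credential)

# First query: retrieve the most relevant chunks.
results = chunks_client.search(search_text="urban lighting at night", top=5)

for chunk in results:
    # Second query: look up the parent document for context and citations.
    parent = parents_client.get_document(key=chunk["parent_id"])
    print(chunk["chunk"], "->", parent["title"])
```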
Schema design affects storage and costs. This exercise is focused on schema fundamentals. In the [Minimize storage and costs](tutorial-rag-build-solution-optimize.md) tutorial, we revisit schema design to consider narrow data types, attribution, and vector configurations that are more efficient.
### Sample content for this tutorial
The content you're indexing informs what fields are in the index.
In this tutorial, we use PDFs and content from the NASA Earth at Night ebook. The original ebook is large, over 100 pages and 35 MB in size. We broke it up into smaller PDFs, one per page of text, to stay under the REST API payload limit of 16 MB per API call.
We omit image vectorization for this exercise.
The sample content is descriptive and informative. It also mentions places, regions, and countries across the world. We can include skills in our indexing pipeline that extract this information and load it into a queryable and filterable `locations` field.
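For example, a skill along these lines could populate that field. This is a sketch, assuming chunked text lives at `/document/chunk` in the enrichment tree; the paths and field names are assumptions, not the tutorial's exact skillset.

```python
# An entity recognition skill that extracts place names into a "locations"
# output (illustrative paths and names).
from azure.search.documents.indexes.models import (
    EntityRecognitionSkill,
    EntityRecognitionSkillVersion,
    InputFieldMappingEntry,
    OutputFieldMappingEntry,
)

locations_skill = EntityRecognitionSkill(
    skill_version=EntityRecognitionSkillVersion.V3,
    categories=["Location"],  # extract only locations
    context="/document/chunk",
    inputs=[InputFieldMappingEntry(name="text", source="/document/chunk")],
    outputs=[OutputFieldMappingEntry(name="locations", target_name="locations")],
)
```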
Because all of the chunks of text originate from the same parent (the Earth at Night ebook), we don't need a separate index dedicated to parent fields. If we were indexing from multiple parent PDFs, we would want a parent-child index pair to capture PDF-specific fields (path, title, authors, publication date, summary) and then send lookup queries to the parent index to retrieve the fields relevant to each chunk. We include an example of that parent-child index template in this exercise for comparison.
## Create a basic index
1. Open Visual Studio Code and create a new file. It doesn't have to be a Python file type for this exercise.
1. Here's a minimal index definition for RAG solutions that support vector and hybrid search. Review it for an introduction to required elements: name, fields, and a `vectorSearch` configuration for the vector fields.
@@ -85,6 +97,13 @@ Here's a minimal index definition for RAG solutions that supports vector and hyb
}
```
Fields must include a key field (`"id"`) and should include vector chunks for similarity search and nonvector chunks for the LLM. Metadata about the source file might include the file path, creation date, or content type.
Vector fields have [specific types](/rest/api/searchservice/supported-data-types#edm-data-types-for-vector-fields) and extra attributes for embedding model dimensions and configuration. `Edm.Single` is a data type that works for the more commonly used LLMs. For more information about vector fields, see [Create a vector index](vector-search-how-to-create-index.md).
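If you prefer to define the same kind of schema in Python, here's a rough equivalent using the `azure-search-documents` SDK. The index name, field names, and 1536 dimensions (typical for text-embedding-ada-002) are assumptions for illustration.

```python
# A minimal RAG index sketch: key, nonvector chunk, vector chunk, HNSW config.
from azure.search.documents.indexes.models import (
    HnswAlgorithmConfiguration,
    SearchField,
    SearchFieldDataType,
    SearchIndex,
    VectorSearch,
    VectorSearchProfile,
)

index = SearchIndex(
    name="example-minimal-index",
    fields=[
        # Key field that uniquely identifies each chunk.
        SearchField(name="id", type=SearchFieldDataType.String, key=True),
        # Nonvector chunk: human-readable grounding data for the LLM.
        SearchField(name="chunk", type=SearchFieldDataType.String, searchable=True),
        # Vector chunk: embedding used for similarity search.
        SearchField(
            name="chunk_vector",
            type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
            searchable=True,
            vector_search_dimensions=1536,
            vector_search_profile_name="vector-profile",
        ),
    ],
    vector_search=VectorSearch(
        algorithms=[HnswAlgorithmConfiguration(name="hnsw-config")],
        profiles=[
            VectorSearchProfile(
                name="vector-profile", algorithm_configuration_name="hnsw-config"
            )
        ],
    ),
)
```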
1. Here's the index schema for the tutorial and the NASA ebook content. It's similar to the basic schema, but adds a parent ID and metadata. It also includes fields for content that's generated in the indexing pipeline.
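As a rough sketch of the additions described here, the parent and generated-content fields might look like the following; the field names are illustrative assumptions, not the tutorial's exact schema.

```python
# Additional fields for parent metadata and pipeline-generated content
# (illustrative names, not the tutorial's exact schema).
from azure.search.documents.indexes.models import SearchField, SearchFieldDataType

extra_fields = [
    # Ties each chunk back to its parent document.
    SearchField(name="parent_id", type=SearchFieldDataType.String, filterable=True),
    # Metadata carried over from the source file.
    SearchField(name="title", type=SearchFieldDataType.String, searchable=True),
    # Generated in the indexing pipeline by the entity recognition skill.
    SearchField(
        name="locations",
        type=SearchFieldDataType.Collection(SearchFieldDataType.String),
        searchable=True,
        filterable=True,
    ),
]
```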
<!-- Objective:
- Design an index schema that generates results in a format that works for LLMs.
articles/search/tutorial-rag-build-solution-models.md (11 additions & 7 deletions)
@@ -30,23 +30,25 @@ If you don't have an Azure subscription, create a [free account](https://azure.m
## Prerequisites
- The Azure portal, used to deploy models and configure role assignments in the Azure cloud.
- An **Owner** role on your Azure subscription is necessary for creating role assignments. Your model provider has more role requirements for deploying and accessing models. Those are noted in the following steps.
- A model provider, such as [Azure OpenAI](/azure/ai-services/openai/how-to/create-resource), Azure AI Vision created using [a multi-service account](/azure/ai-services/multi-service-resource), or [Azure AI Studio](https://ai.azure.com/).
We use Azure OpenAI in this tutorial, but list the other Azure resources so that you know your options for integrated embedding.
- Azure AI Search, Basic tier or higher, which provides a [managed identity](search-howto-managed-identities-data-sources.md) used in role assignments.
To complete all of the tutorials in this series, your region must support both Azure AI Search and the model provider. See supported regions for:
- [Azure AI Vision regions](/azure/ai-services/computer-vision/overview-image-analysis?tabs=4-0#region-availability)
- [Azure AI Studio](/azure/ai-studio/reference/region-support) regions.
Azure AI Search currently has limited availability in some regions, such as West Europe and West US 2/3. Check the [Azure AI Search region list](search-region-support.md) to confirm availability.
> [!TIP]
> Currently, the following regions provide the most overlap and have the most capacity: **East US**, **East US2**, and **South Central** in the Americas; **France Central** or **Switzerland North** in Europe; **Australia East** in Asia Pacific.
@@ -97,7 +99,7 @@ This tutorial series uses the following models and model providers:
- Text-embedding-ada-002 on Azure OpenAI for embeddings
- GPT-35-Turbo on Azure OpenAI for chat completion
You must have [**Cognitive Services OpenAI Contributor**](/azure/ai-services/openai/how-to/role-based-access-control#cognitive-services-openai-contributor) or higher to deploy models in Azure OpenAI.
1. Go to [Azure OpenAI Studio](https://oai.azure.com/).
1. Select **Deployments** on the left menu.
@@ -120,11 +122,13 @@ Assign yourself and the search service identity permissions on Azure OpenAI. The
1. Find your Azure OpenAI resource.
1. Select **Access control (IAM)** on the left menu.
1. Select **Managed identity** and then select **Members**. Find the system-managed identity for your search service in the dropdown list.
1. Next, select **User, group, or service principal** and then select **Members**. Search for your user account and then select it from the dropdown list.
1. Select **Review and Assign** to create the role assignments.
For access to models on Azure AI Vision, assign **Cognitive Services OpenAI User**. For Azure AI Studio, assign **Azure AI Developer**.
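If you'd rather script the assignment than use the portal, here's a hedged sketch with the `azure-mgmt-authorization` package. The subscription, resource group, account name, principal ID, and role-definition GUID are all placeholders you'd supply.

```python
# Create a role assignment on an Azure OpenAI account (placeholders throughout).
import uuid

from azure.identity import DefaultAzureCredential
from azure.mgmt.authorization import AuthorizationManagementClient
from azure.mgmt.authorization.models import RoleAssignmentCreateParameters

subscription_id = "<subscription-id>"
client = AuthorizationManagementClient(DefaultAzureCredential(), subscription_id)

# Scope: the resource ID of the Azure OpenAI account.
scope = (
    f"/subscriptions/{subscription_id}/resourceGroups/<resource-group>"
    "/providers/Microsoft.CognitiveServices/accounts/<openai-account>"
)

client.role_assignments.create(
    scope,
    str(uuid.uuid4()),  # role assignment names are GUIDs
    RoleAssignmentCreateParameters(
        role_definition_id=(
            f"/subscriptions/{subscription_id}/providers"
            "/Microsoft.Authorization/roleDefinitions/<role-definition-guid>"
        ),
        principal_id="<search-service-managed-identity-object-id>",
        principal_type="ServicePrincipal",
    ),
)
```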
## Use non-Azure models for embeddings
The pattern for integrating any embedding model is to wrap it in a custom skill and custom vectorizer. This section provides links to reference articles. For a code example that calls a non-Azure model, see [custom-embeddings demo](https://github.com/Azure/azure-search-vector-samples/blob/main/demo-python/code/custom-vectorizer/readme.md).
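As a sketch of the custom-skill half of that pattern, a `WebApiSkill` can call an endpoint you host that wraps the external embedding model. The URI, paths, and output names are hypothetical, and the custom vectorizer is configured separately on the index.

```python
# Wrap an external embedding model behind a Web API custom skill (illustrative).
from azure.search.documents.indexes.models import (
    InputFieldMappingEntry,
    OutputFieldMappingEntry,
    WebApiSkill,
)

custom_embedding_skill = WebApiSkill(
    uri="https://<your-function-app>.azurewebsites.net/api/embed",  # hypothetical
    http_method="POST",
    context="/document/chunk",
    inputs=[InputFieldMappingEntry(name="text", source="/document/chunk")],
    outputs=[OutputFieldMappingEntry(name="vector", target_name="chunk_vector")],
)
```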