articles/search/tutorial-rag-build-solution-index-schema.md (+18 −48)
@@ -21,9 +21,9 @@ In this tutorial, you:
> [!div class="checklist"]
> - Learn the characteristics of an index schema built for RAG
> - Create an index that accommodates vectors and hybrid queries
-> - Add semantic ranking and filters
> - Add vector profiles and configurations
> - Add structured data
+> - Add filters
## Prerequisites
@@ -37,25 +37,31 @@ In conversational search, LLMs compose the response that the user sees, not the
To generate a response, LLMs operate on chunks of content, and while they need to know where the chunk came from for citation purposes, what matters most is the quality of message inputs and their relevance to the user's question. Whether the chunks come from one document or a thousand, the LLM ingests the information, or *grounding data*, and formulates the response using instructions provided in a system prompt.
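To make that flow concrete, here's a minimal sketch of the generation step, assuming the `openai` Python package (v1.x), a hypothetical `gpt-4o` chat deployment, and a `search_results` list of chunk strings already retrieved from the index:

```python
from openai import AzureOpenAI

# Placeholder endpoint, key, and deployment name for illustration.
client = AzureOpenAI(
    azure_endpoint="https://<your-openai-resource>.openai.azure.com",
    api_key="<your-api-key>",
    api_version="2024-06-01",
)

# Chunks returned by the search engine become the grounding data.
search_results = ["<chunk 1>", "<chunk 2>"]
grounding_data = "\n".join(search_results)

response = client.chat.completions.create(
    model="gpt-4o",  # your chat model deployment name
    messages=[
        # The system prompt carries both the instructions and the grounding data.
        {"role": "system", "content": "Answer using only these sources:\n" + grounding_data},
        {"role": "user", "content": "<the user's question>"},
    ],
)
print(response.choices[0].message.content)
```

Whether two chunks or twenty land in `grounding_data`, the pattern is the same: retrieval supplies the sources, and the system prompt constrains the model to them.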
-A minimal index for LLM is designed to store chunks of content. It includes vector fields if you want similarity search, and nonvector fields for human-readable results. Nonvector "chunked" content in the search results becomes the grounding data sent to the LLM.
+A minimal index for an LLM is designed to store chunks of content. It includes vector fields if you want similarity search for highly relevant results, and nonvector fields for human-readable inputs to the LLM for conversational search. Nonvector "chunked" content in the search results becomes the grounding data sent to the LLM.
-A schema that works best for RAG workloads:
+An index that works best for RAG workloads has these qualities:
-- Returns chunks of content in search results.
-- Search results go to an LLM, which can handle a certain level of redundancy and incomplete strings.
-- Accommodates the queries you want create. You should have fields for vector and hybrid content, and those fields should be attributed to support specific query behaviors.
-- Your schema should be flat (no complex types or structures).
-- You can only query one index at a time (no joins), but you can create indexes that preserve parent-child relationship, and then use nested queries or parallel queries in your search logic to pull from both. This exercise includes templates for both.
+- Returns chunks that are relevant to the query and readable to the LLM.
-Schema design has an impact on storage and costs. This exercise is focused on schema fundamentals. In the [Minimize storage and costs](tutorial-rag-build-solution-optimize.md) tutorial, we revisit schema design to consider narrow data types, attribution, and vector configurations that are more efficient.
+LLMs can handle a certain level of dirty data in chunks, such as markup, redundancy, and incomplete strings. While chunks need to be readable, they don't need to be pristine.
+
+- Maintains a relationship between chunks of a document and properties of the parent document, such as file name, file type, title, and author. Chunks in search results could be pulled from anywhere in the index. Association with the parent document that provides the chunk is useful for context, citations, and follow-up queries.
+
+- Accommodates the queries you want to create. You should have fields for vector and hybrid content, and those fields should be attributed to support specific query behaviors. You can only query one index at a time (no joins), so your field collection should define all of your searchable content.
+
+- Uses a flat schema (no complex types or structures). This requirement is specific to the RAG pattern in Azure AI Search.
+
+Although Azure AI Search can't join indexes, you can create indexes that preserve a parent-child relationship, and then use nested queries or parallel queries in your search logic to pull from both. This exercise includes templates for both scenarios: parent-child elements in the same index, and parent-child elements in separate indexes connected at query time through sequential queries.
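As a rough illustration of the separate-indexes approach, the following sketch runs sequential queries with the `azure-search-documents` Python SDK. The index names and the `parent_id`, `title`, and `chunk` fields are hypothetical placeholders, not the tutorial's actual templates:

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

endpoint = "https://<your-search-service>.search.windows.net"
credential = AzureKeyCredential("<your-query-api-key>")

chunk_client = SearchClient(endpoint, index_name="chunk-index", credential=credential)
parent_client = SearchClient(endpoint, index_name="parent-index", credential=credential)

# First query: retrieve the most relevant chunks.
chunks = chunk_client.search(search_text="<the user's question>", top=5)

# Follow-up queries: fetch each chunk's parent document for context and citations.
for chunk in chunks:
    parent = parent_client.get_document(key=chunk["parent_id"])
    print(parent["title"], "->", chunk["chunk"][:80])
```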
+
+Schema design affects storage and costs. This exercise is focused on schema fundamentals. In the [Minimize storage and costs](tutorial-rag-build-solution-optimize.md) tutorial, we revisit schema design to consider narrow data types, attribution, and vector configurations that are more efficient.
## Create a basic index
Here's a minimal index definition for RAG solutions that supports vector and hybrid search. If you aren't using vectors, the index can be even simpler (see [Quickstart: Generative search (RAG)](search-get-started-rag.md) for an example).
1. Open Visual Studio Code and create a new file. It doesn't have to be a Python file type for this exercise.
-1. Review the following example for an introduction to required elements. These elements consist of a name, key field ("id"), non-vector chunks for the LLM, vector chunks for similarity search by the search engine, and a `vectorSearch` configuration for the vector fields.
+1. Review the following example for an introduction to required elements. In a RAG pattern, elements consist of a name, key field (`"id"`), nonvector chunks for the LLM, vector chunks for similarity search by the search engine, and a `vectorSearch` configuration for the vector fields.
Vector fields have [specific types](/rest/api/searchservice/supported-data-types#edm-data-types-for-vector-fields) and extra attributes for embedding model dimensions and configuration. `Edm.Single` is a data type that works for the more commonly used LLMs. For more information about vector fields, see [Create a vector index](vector-search-how-to-create-index.md).
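As a companion to the JSON definition, here's a rough sketch of the same minimal schema expressed with the `azure-search-documents` Python SDK (version 11.4 or later assumed). The field names, dimensions, and configuration names are illustrative, not prescriptive:

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    HnswAlgorithmConfiguration,
    SearchField,
    SearchFieldDataType,
    SearchIndex,
    VectorSearch,
    VectorSearchProfile,
)

index = SearchIndex(
    name="rag-tutorial-idx",  # hypothetical index name
    fields=[
        # Key field that uniquely identifies each chunk.
        SearchField(name="id", type=SearchFieldDataType.String, key=True, filterable=True),
        # Nonvector chunk: human-readable grounding data for the LLM.
        SearchField(name="chunk", type=SearchFieldDataType.String, searchable=True),
        # Vector chunk: used by the search engine for similarity search.
        SearchField(
            name="text_vector",
            type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
            searchable=True,
            vector_search_dimensions=1536,  # must match your embedding model
            vector_search_profile_name="rag-profile",
        ),
    ],
    vector_search=VectorSearch(
        algorithms=[HnswAlgorithmConfiguration(name="rag-hnsw")],
        profiles=[VectorSearchProfile(name="rag-profile", algorithm_configuration_name="rag-hnsw")],
    ),
)

client = SearchIndexClient(
    endpoint="https://<your-search-service>.search.windows.net",
    credential=AzureKeyCredential("<your-admin-api-key>"),
)
client.create_or_update_index(index)
```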
@@ -79,7 +85,6 @@ Here's a minimal index definition for RAG solutions that supports vector and hyb
}
```
-1.
<!-- Objective:
- Design an index schema that generates results in a format that works for LLMs.
@@ -105,44 +110,9 @@ Tasks:
<!--
-ps: We have another physical resource limit for our services: vector index size. HNSW requires vector indices to reside entirely in memory. "Vector index size" is our customer-facing resource limit that governs the memory consumed by their vector data. (and this is a big reason why the beefiest VMs have 512 GB of RAM). Increasing partitions also increases the amount of vector quota for customers as well.
-
-## Old introduction
-
-`sidenote: the following applies to the non-basic index, which might be out of scope`.
-*A richer index has more fields and configurations, and is often better because extra fields support richer queries and more opportunities for relevance tuning. Filters and scoring profiles for boosting apply to nonvector fields. If you have content that should be matched precisely and not similarly, such as a name or employee number, then create fields to contain that information.*
-
-## Create a basic index
-
-1. Create an index definition with required elements. The index requires a key field ("id"). It includes vector and nonvector chunks of text. Vector content is used for similarity search. Nonvector content is returned in results and will be passed in messages to the LLM. The vector search configuration defines the algorithms used for a vector query.
-
-```http
-### Create an index for RAG scenarios
-POST {{baseUrl}}/indexes?api-version=2024-05-01-preview HTTP/1.1
-Content-Type: application/json
-Authorization: Bearer {{token}}
+ps 1: We have another physical resource limit for our services: vector index size. HNSW requires vector indices to reside entirely in memory. "Vector index size" is our customer-facing resource limit that governs the memory consumed by their vector data. (and this is a big reason why the beefiest VMs have 512 GB of RAM). Increasing partitions also increases the amount of vector quota for customers as well.
+ps 2: A richer index has more fields and configurations, and is often better because extra fields support richer queries and more opportunities for relevance tuning. Filters and scoring profiles for boosting apply to nonvector fields. If you have content that should be matched precisely and not similarly, such as a name or employee number, then create fields to contain that information.
## BLOCKED: Index for hybrid queries and relevance tuning
articles/search/tutorial-rag-build-solution-models.md (+21 −8)
@@ -21,32 +21,45 @@ In this tutorial, you:
> [!div class="checklist"]
> - Learn which models in the Azure cloud work with built-in integration
+> - Learn about the Azure models used for chat
> - Deploy models and collect model information for your code
> - Configure search engine access to Azure models
> - Learn about custom skills and vectorizers for attaching non-Azure models
+If you don't have an Azure subscription, create a [free account](https://azure.microsoft.com/free/?WT.mc_id=A261C142F) before you begin.
+
## Prerequisites
- Use the Azure portal to deploy models and configure role assignments in the Azure cloud.
-- A model provider, such as [Azure OpenAI](/azure/ai-services/openai/how-to/create-resource) (used in this tutorial), Azure AI Vision created using [a multi-service account](/azure/ai-services/multi-service-resource), or [Azure AI Studio](https://ai.azure.com/).
+- An **Owner** role on your Azure subscription is necessary for creating role assignments. Your model provider might impose more permissions or requirements for deploying and accessing models.
+
+- A model provider, such as [Azure OpenAI](/azure/ai-services/openai/how-to/create-resource), Azure AI Vision created using [a multi-service account](/azure/ai-services/multi-service-resource), or [Azure AI Studio](https://ai.azure.com/). We use Azure OpenAI in this tutorial, but list the other Azure resources so that you know your options.
+
+- Azure AI Search, Basic tier or higher, which provides a [managed identity](search-howto-managed-identities-data-sources.md) used in role assignments.
+
+To complete all of the tutorials in this series, the region must also support both Azure AI Search and the model provider. See supported regions for:
+- [Azure OpenAI regions](/azure/ai-services/openai/concepts/models#model-summary-table-and-region-availability)
+- [Azure AI Vision regions](/azure/ai-services/computer-vision/overview-image-analysis?tabs=4-0#region-availability)
-Azure AI Search, Basic tier or higher provides a [managed identity](search-howto-managed-identities-data-sources.md) used in role assignments. To complete all of the tutorials in this series, the region must also support the model provider (see supported regions for [Azure OpenAI](/azure/ai-services/openai/concepts/models#model-summary-table-and-region-availability), [Azure AI Vision](/azure/ai-services/computer-vision/overview-image-analysis?tabs=4-0#region-availability), [Azure AI Studio](/azure/ai-studio/reference/region-support)). Azure AI Search is currently facing limited availability in some regions. Check the [Azure AI Search region list](search-region-support.md) to confirm availability.
+- [Azure AI Studio regions](/azure/ai-studio/reference/region-support)
-- An **Owner** role on your Azure subscription is necessary for creating role assignments. Your model provider might impose more permissions or requirements for deploying and accessing models.
+Azure AI Search currently has limited availability in some regions. Check the [Azure AI Search region list](search-region-support.md) to confirm availability.
> [!TIP]
-> If you're creating Azure resources for this tutorial, you just need Azure AI Search and Azure OpenAI. If you want to try other model providers, the following regions provide the most overlap and have the most capacity: East US, East US2, and South Central in the Americas; France Central or Switzerland North in Europe; Australia East in Asia Pacific.
+> Currently, the following regions provide the most overlap and have the most capacity: **East US**, **East US2**, and **South Central** in the Americas; **France Central** or **Switzerland North** in Europe; **Australia East** in Asia Pacific.
-Azure AI Search supports an embedding action in an indexing pipeline. It also supports an embedding action at query time, converting text or image inputs into vectors for a vector search. In this step, identify an embedding model that works for your content and queries. If you're providing raw vector data and vector queries, or if your RAG solution doesn't include vector data, skip this step.
+Azure AI Search supports an embedding action in an indexing pipeline. It also supports an embedding action at query time, converting text or image inputs into vectors for a vector search. In this step, identify an embedding model that works for your content and queries. If you're providing raw vector data and raw vector queries, or if your RAG solution doesn't include vector data, skip this step.
Vector queries work best when you use the same embedding model for both indexing and query input conversions. Using different embedding models for each action typically results in poor query outcomes.
To meet the same-model requirement, choose embedding models that can be referenced through *skills* during indexing and through *vectorizers* during query execution. Review [Create an indexing pipeline](tutorial-rag-build-solution-pipeline.md) for code that calls an embedding skill and a matching vectorizer.
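For orientation, here's roughly how a matching skill and vectorizer pair might look in the `azure-search-documents` Python SDK. Parameter names assume a recent SDK version (11.5 or later); the endpoint, deployment, and field names are placeholders:

```python
from azure.search.documents.indexes.models import (
    AzureOpenAIEmbeddingSkill,
    AzureOpenAIVectorizer,
    AzureOpenAIVectorizerParameters,
    InputFieldMappingEntry,
    OutputFieldMappingEntry,
)

# Skill: embeds chunked content during indexing.
embedding_skill = AzureOpenAIEmbeddingSkill(
    context="/document/pages/*",
    resource_url="https://<your-openai-resource>.openai.azure.com",
    deployment_name="text-embedding-3-large",
    model_name="text-embedding-3-large",
    inputs=[InputFieldMappingEntry(name="text", source="/document/pages/*")],
    outputs=[OutputFieldMappingEntry(name="embedding", target_name="text_vector")],
)

# Vectorizer: embeds query strings at query time with the same model.
vectorizer = AzureOpenAIVectorizer(
    vectorizer_name="rag-vectorizer",
    parameters=AzureOpenAIVectorizerParameters(
        resource_url="https://<your-openai-resource>.openai.azure.com",
        deployment_name="text-embedding-3-large",
        model_name="text-embedding-3-large",
    ),
)
```

Because both objects point at the same deployment, text is embedded identically at indexing time and at query time, which satisfies the same-model requirement.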
-The following embedding models have skills and vectorizer support in Azure AI Search.
+Azure AI Search provides skill and vectorizer support for the following embedding models in the Azure cloud.
@@ -58,7 +71,7 @@ The following embedding models have skills and vectorizer support in Azure AI Se
<sup>2</sup> Deployed models in the model catalog are accessed over an AML endpoint. We use the existing AML skill for this connection.
-You can use other models besides those listed here. For more information, see [Use non-Azure models for embeddings and chat](#use-non-azure-models-for-embeddings-and-chat) in this article.
+You can use other models besides those listed here. For more information, see [Use non-Azure models for embeddings](#use-non-azure-models-for-embeddings-and-chat) in this article.
> [!NOTE]
> Inputs to an embedding model are typically chunked data. In an Azure AI Search RAG pattern, chunking is handled in the indexer pipeline, covered in [another tutorial](tutorial-rag-build-solution-pipeline.md) in this series.
@@ -98,7 +111,7 @@ You must have **Cognitive Services AI User** permissions to deploy models in Azu
## Configure search engine access to Azure models
-For pipeline and query execution, this tutorial uses Micrsoft Entra ID for authentication and roles for authorization.
+For pipeline and query execution, this tutorial uses Microsoft Entra ID for authentication and roles for authorization.
Assign yourself and the search service identity permissions on Azure OpenAI. The code for this tutorial runs locally. Requests to Azure OpenAI originate from your system. Also, search results from the search engine are passed to Azure OpenAI. For these reasons, both you and the search service need permissions on Azure OpenAI.
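For example, with the Azure CLI, the two assignments might look like the following sketch. The scope and object IDs are placeholders; **Cognitive Services OpenAI User** is sufficient for inference calls:

```azurecli
# Scope of your Azure OpenAI resource (placeholder IDs).
scope="/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.CognitiveServices/accounts/<openai-account>"

# Yourself, so code running locally can call Azure OpenAI.
az role assignment create --role "Cognitive Services OpenAI User" \
  --assignee "<your-user-object-id>" --scope "$scope"

# The search service's managed identity, so the indexing pipeline can call Azure OpenAI.
az role assignment create --role "Cognitive Services OpenAI User" \
  --assignee "<search-service-identity-object-id>" --scope "$scope"
```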