
Commit 91bca61

checkpoint
1 parent 886201f commit 91bca61

File tree

2 files changed: +39 −56 lines

articles/search/tutorial-rag-build-solution-index-schema.md

Lines changed: 18 additions & 48 deletions
@@ -21,9 +21,9 @@ In this tutorial, you:
  > [!div class="checklist"]
  > - Learn the characteristics of an index schema built for RAG
  > - Create an index that accommodates vectors and hybrid queries
- > - Add semantic ranking and filters
  > - Add vector profiles and configurations
  > - Add structured data
+ > - Add filters

  ## Prerequisites

@@ -37,25 +37,31 @@ In conversational search, LLMs compose the response that the user sees, not the

  To generate a response, LLMs operate on chunks of content, and while they need to know where the chunk came from for citation purposes, what matters most is the quality of message inputs and their relevance to the user's question. Whether the chunks come from one document or a thousand, the LLM ingests the information or *grounding data*, and formulates the response using instructions provided in a system prompt.

- A minimal index for LLM is designed to store chunks of content. It includes vector fields if you want similarity search, and nonvector fields for human-readable results. Nonvector "chunked" content in the search results becomes the grounding data sent to the LLM.
+ A minimal index for an LLM is designed to store chunks of content. It includes vector fields if you want similarity search for highly relevant results, and nonvector fields for human-readable inputs to the LLM for conversational search. Nonvector "chunked" content in the search results becomes the grounding data sent to the LLM.

- A schema that works best for RAG workloads:
+ An index that works best for RAG workloads has these qualities:

- - Returns chunks of content in search results.
- - Search results go to an LLM, which can handle a certain level of redundancy and incomplete strings.
- - Accommodates the queries you want create. You should have fields for vector and hybrid content, and those fields should be attributed to support specific query behaviors.
- - Your schema should be flat (no complex types or structures).
- - You can only query one index at a time (no joins), but you can create indexes that preserve parent-child relationship, and then use nested queries or parallel queries in your search logic to pull from both. This exercise includes templates for both.
+ - Returns chunks that are relevant to the query and readable to the LLM.

- Schema design has an impact on storage and costs. This exercise is focused on schema fundamentals. In the [Minimize storage and costs](tutorial-rag-build-solution-optimize.md) tutorial, we revisit schema design to consider narrow data types, attribution, and vector configurations that are more efficient.
+   LLMs can handle a certain level of dirty data in chunked content, such as markup, redundancy, and incomplete strings. While chunks need to be readable, they don't need to be pristine.
+
+ - Maintains a relationship between chunks of a document and properties of the parent document, such as the file name, file type, title, and author. Chunks in search results could be pulled from anywhere in the index. Association with the parent document that provides the chunk is useful for context, citations, and follow-up queries.
+
+ - Accommodates the queries you want to create. You should have fields for vector and hybrid content, and those fields should be attributed to support specific query behaviors. You can only query one index at a time (no joins), so your field collection should define all of your searchable content.
+
+ - Uses a flat schema (no complex types or structures). This requirement is specific to the RAG pattern in Azure AI Search.
+
+ Although Azure AI Search can't join indexes, you can create indexes that preserve parent-child relationships, and then use nested queries or parallel queries in your search logic to pull from both. This exercise includes templates for both patterns: parent-child elements in the same index, and parent-child elements in separate indexes connected at query time through sequential queries. A sketch of a parent-child query appears below.
+
+ Schema design affects storage and costs. This exercise is focused on schema fundamentals. In the [Minimize storage and costs](tutorial-rag-build-solution-optimize.md) tutorial, we revisit schema design to consider narrow data types, attribution, and vector configurations that are more efficient.
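
For illustration, here's a sketch of how chunks scoped to one parent document can be retrieved from a flat index. The `parent_id` and `title` fields, the index name, and the query text are hypothetical values, not the tutorial's exact schema:

```http
### Sketch: retrieve chunks plus parent document properties (hypothetical names)
POST {{baseUrl}}/indexes/demo-rag-index/docs/search?api-version=2024-05-01-preview HTTP/1.1
Content-Type: application/json
Authorization: Bearer {{token}}

{
    "search": "vector index size limit",
    "select": "chunked_content, title, parent_id",
    "filter": "parent_id eq 'doc-001'"
}
```

For a filter like this to work, the `parent_id` field must be attributed as `filterable` in the schema.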

  ## Create a basic index

  Here's a minimal index definition for RAG solutions that supports vector and hybrid search. If you aren't using vectors, the index can be even simpler (see [Quickstart: Generative search (RAG)](search-get-started-rag.md) for an example).

  1. Open Visual Studio Code and create a new file. It doesn't have to be a Python file type for this exercise.

- 1. Review the following example for an introduction to required elements. These elements consist of a name, key field ("id"), non-vector chunks for the LLM, vector chunks for similarity search by the search engine, and a `vectorSearch` configuration for the vector fields.
+ 1. Review the following example for an introduction to required elements. In a RAG pattern, elements consist of a name, key field (`"id"`), non-vector chunks for the LLM, vector chunks for similarity search by the search engine, and a `vectorSearch` configuration for the vector fields.

      Vector fields have [specific types](/rest/api/searchservice/supported-data-types#edm-data-types-for-vector-fields) and extra attributes for embedding model dimensions and configuration. `Edm.Single` is a data type that works for the more commonly used LLMs. For more information about vector fields, see [Create a vector index](vector-search-how-to-create-index.md).

@@ -79,7 +85,6 @@ Here's a minimal index definition for RAG solutions that supports vector and hyb
  }
  ```
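
For context, the complete definition that this tail closes looks roughly like the following sketch. The field names, the index name, and the 1,536-dimension setting (which assumes an Azure OpenAI text-embedding-ada-002 embedding model) are illustrative:

```http
### Sketch: minimal RAG index with nonvector and vector chunk fields
POST {{baseUrl}}/indexes?api-version=2024-05-01-preview HTTP/1.1
Content-Type: application/json
Authorization: Bearer {{token}}

{
    "name": "demo-rag-index",
    "fields": [
        { "name": "id", "type": "Edm.String", "key": true },
        { "name": "chunked_content", "type": "Edm.String", "searchable": true, "retrievable": true },
        { "name": "chunked_content_vector", "type": "Collection(Edm.Single)", "dimensions": 1536, "vectorSearchProfile": "my-vector-profile", "searchable": true, "retrievable": false }
    ],
    "vectorSearch": {
        "algorithms": [
            { "name": "my-hnsw-config", "kind": "hnsw" }
        ],
        "profiles": [
            { "name": "my-vector-profile", "algorithm": "my-hnsw-config" }
        ]
    }
}
```

Note that vector fields take the collection type `Collection(Edm.Single)` rather than scalar `Edm.Single`, and `dimensions` must match the output size of the embedding model you deploy.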

- 1.
  <!-- Objective:

  - Design an index schema that generates results in a format that works for LLMs.
@@ -105,44 +110,9 @@ Tasks:

  <!--

- ps: We have another physical resource limit for our services: vector index size. HNSW requires vector indices to reside entirely in memory. "Vector index size" is our customer-facing resource limit that governs the memory consumed by their vector data. (and this is a big reason why the beefiest VMs have 512 GB of RAM). Increasing partitions also increases the amount of vector quota for customers as well.
-
-
- ## Old introduction
-
-
- `sidenote: the following applies to the non-basic index, which might be out of scope`.
- *A richer index has more fields and configurations, and is often better because extra fields support richer queries and more opportunities for relevance tuning. Filters and scoring profiles for boosting apply to nonvector fields. If you have content that should be matched precisely and not similarly, such as a name or employee number, then create fields to contain that information.*
-
-
- ## Create a basic index
-
- 1. Create an index definition with required elements. The index requires a key field ("id"). It includes vector and nonvector chunks of text. Vector content is used for similarity search. Nonvector content is returned in results and will be passed in messages to the LLM. The vector search configuration defines the algorithms used for a vector query.
-
- ```http
- ### Create an index for RAG scenarios
- POST {{baseUrl}}/indexes?api-version=2024-05-01-preview HTTP/1.1
- Content-Type: application/json
- Authorization: Bearer {{token}}
+ ps 1: We have another physical resource limit for our services: vector index size. HNSW requires vector indices to reside entirely in memory. "Vector index size" is our customer-facing resource limit that governs the memory consumed by their vector data. (and this is a big reason why the beefiest VMs have 512 GB of RAM). Increasing partitions also increases the amount of vector quota for customers as well.

- {
-     "name": "demo-rag-index",
-     "fields": [
-         { "name": "id", "type": "Edm.String", "key": true },
-         { "name": "chunked_content", "type": "Edm.String", "searchable": true, "retrievable": true },
-         { "name": "chunked_content_vectorized", "type": "Edm.Single", "dimensions": 1536, "vectorSearchProfile": "my-vector-profile", "searchable": true, "retrievable": false, "stored": false },
-         { "name": "metadata", "type": "Edm.String", "retrievable": true, "searchable": true }
-     ],
-     "vectorSearch": {
-         "algorithms": [
-             { "name": "my-algo-config", "kind": "hnsw", "hnswParameters": { } }
-         ],
-         "profiles": [
-             { "name": "my-vector-profile", "algorithm": "my-algo-config" }
-         ]
-     }
- }
- ```
+ ps 2: A richer index has more fields and configurations, and is often better because extra fields support richer queries and more opportunities for relevance tuning. Filters and scoring profiles for boosting apply to nonvector fields. If you have content that should be matched precisely and not similarly, such as a name or employee number, then create fields to contain that information.
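
As an aside on the richer-index point in ps 2: precise matching and boosting live on attributed nonvector fields. The following fragment is a hypothetical sketch; the `employee_number` field, the `title` weight, and the index name aren't part of this tutorial:

```http
### Sketch: richer index with a filterable field and a scoring profile
POST {{baseUrl}}/indexes?api-version=2024-05-01-preview HTTP/1.1
Content-Type: application/json
Authorization: Bearer {{token}}

{
    "name": "demo-rag-index-richer",
    "fields": [
        { "name": "id", "type": "Edm.String", "key": true },
        { "name": "chunked_content", "type": "Edm.String", "searchable": true },
        { "name": "title", "type": "Edm.String", "searchable": true },
        { "name": "employee_number", "type": "Edm.String", "filterable": true, "searchable": false }
    ],
    "scoringProfiles": [
        { "name": "boost-title", "text": { "weights": { "title": 2.0 } } }
    ]
}
```

An exact match then goes through a filter (`employee_number eq 'E12345'`) instead of similarity search.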

  ## BLOCKED: Index for hybrid queries and relevance tuning

articles/search/tutorial-rag-build-solution-models.md

Lines changed: 21 additions & 8 deletions
@@ -21,32 +21,45 @@ In this tutorial, you:

  > [!div class="checklist"]
  > - Learn which models in the Azure cloud work with built-in integration
+ > - Learn about the Azure models used for chat
  > - Deploy models and collect model information for your code
  > - Configure search engine access to Azure models
  > - Learn about custom skills and vectorizers for attaching non-Azure models

+ If you don't have an Azure subscription, create a [free account](https://azure.microsoft.com/free/?WT.mc_id=A261C142F) before you begin.

  ## Prerequisites

  - Use the Azure portal to deploy models and configure role assignments in the Azure cloud.

- - A model provider, such as [Azure OpenAI](/azure/ai-services/openai/how-to/create-resource) (used in this tutorial), Azure AI Vision created using [a multi-service account](/azure/ai-services/multi-service-resource), or [Azure AI Studio](https://ai.azure.com/).
+ - An **Owner** role on your Azure subscription is necessary for creating role assignments. Your model provider might impose more permissions or requirements for deploying and accessing models.
+
+ - A model provider, such as [Azure OpenAI](/azure/ai-services/openai/how-to/create-resource), Azure AI Vision created using [a multi-service account](/azure/ai-services/multi-service-resource), or [Azure AI Studio](https://ai.azure.com/). We use Azure OpenAI in this tutorial, but list the other Azure resources so that you know your options.
+
+ - Azure AI Search, Basic tier or higher, which provides a [managed identity](search-howto-managed-identities-data-sources.md) used in role assignments.
+
+   To complete all of the tutorials in this series, the region must also support both Azure AI Search and the model provider. See supported regions for:
+
+   - [Azure OpenAI regions](/azure/ai-services/openai/concepts/models#model-summary-table-and-region-availability)
+
+   - [Azure AI Vision regions](/azure/ai-services/computer-vision/overview-image-analysis?tabs=4-0#region-availability)

- - Azure AI Search, Basic tier or higher provides a [managed identity](search-howto-managed-identities-data-sources.md) used in role assignments. To complete all of the tutorials in this series, the region must also support the model provider (see supported regions for [Azure OpenAI](/azure/ai-services/openai/concepts/models#model-summary-table-and-region-availability), [Azure AI Vision](/azure/ai-services/computer-vision/overview-image-analysis?tabs=4-0#region-availability), [Azure AI Studio](/azure/ai-studio/reference/region-support)). Azure AI Search is currently facing limited availability in some regions. Check the [Azure AI Search region list](search-region-support.md) to confirm availability.
+   - [Azure AI Studio regions](/azure/ai-studio/reference/region-support)

- - An **Owner** role on your Azure subscription is necessary for creating role assignments. Your model provider might impose more permissions or requirements for deploying and accessing models.
+   Azure AI Search currently has limited availability in some regions. Check the [Azure AI Search region list](search-region-support.md) to confirm availability.

  > [!TIP]
- > If you're creating Azure resources for this tutorial, you just need Azure AI Search and Azure OpenAI. If you want to try other model providers, the following regions provide the most overlap and have the most capacity: East US, East US2, and South Central in the Americas; France Central or Switzerland North in Europe; Australia East in Asia Pacific.
+ > Currently, the following regions provide the most overlap and have the most capacity: **East US**, **East US2**, and **South Central** in the Americas; **France Central** or **Switzerland North** in Europe; **Australia East** in Asia Pacific.

  ## Review models supporting built-in vectorization

- Azure AI Search supports an embedding action in an indexing pipeline. It also supports an embedding action at query time, converting text or image inputs into vectors for a vector search. In this step, identify an embedding model that works for your content and queries. If you're providing raw vector data and vector queries, or if your RAG solution doesn't include vector data, skip this step.
+ Azure AI Search supports an embedding action in an indexing pipeline. It also supports an embedding action at query time, converting text or image inputs into vectors for a vector search. In this step, identify an embedding model that works for your content and queries. If you're providing raw vector data and raw vector queries, or if your RAG solution doesn't include vector data, skip this step.

  Vector queries work best when you use the same embedding model for both indexing and query input conversions. Using different embedding models for each action typically results in poor query outcomes.

  To meet the same-model requirement, choose embedding models that can be referenced through *skills* during indexing and through *vectorizers* during query execution. Review [Create an indexing pipeline](tutorial-rag-build-solution-pipeline.md) for code that calls an embedding skill and a matching vectorizer.

- The following embedding models have skills and vectorizer support in Azure AI Search.
+ Azure AI Search provides skill and vectorizer support for the following embedding models in the Azure cloud.

  | Client | Embedding models | Skill | Vectorizer |
  |--------|------------------|-------|------------|
@@ -58,7 +71,7 @@ The following embedding models have skills and vectorizer support in Azure AI Se

  <sup>2</sup> Deployed models in the model catalog are accessed over an AML endpoint. We use the existing AML skill for this connection.

- You can use other models besides those listed here. For more information, see [Use non-Azure models for embeddings and chat](#use-non-azure-models-for-embeddings-and-chat) in this article.
+ You can use other models besides those listed here. For more information, see [Use non-Azure models for embeddings](#use-non-azure-models-for-embeddings-and-chat) in this article.

  > [!NOTE]
  > Inputs to an embedding model are typically chunked data. In an Azure AI Search RAG pattern, chunking is handled in the indexer pipeline, covered in [another tutorial](tutorial-rag-build-solution-pipeline.md) in this series.
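
As a sketch of the skill and vectorizer pairing, the index's `vectorSearch` section ties a profile to the same embedding model that the skillset calls during indexing. The shape below assumes the Azure OpenAI vectorizer in the 2024-05-01-preview REST API; the resource URI and deployment name are placeholders:

```json
"vectorSearch": {
    "algorithms": [
        { "name": "my-hnsw-config", "kind": "hnsw" }
    ],
    "vectorizers": [
        {
            "name": "my-openai-vectorizer",
            "kind": "azureOpenAI",
            "azureOpenAIParameters": {
                "resourceUri": "https://{{your-openai-resource}}.openai.azure.com",
                "deploymentId": "text-embedding-ada-002"
            }
        }
    ],
    "profiles": [
        { "name": "my-vector-profile", "algorithm": "my-hnsw-config", "vectorizer": "my-openai-vectorizer" }
    ]
}
```

A matching `#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill` in the skillset points at the same `resourceUri` and `deploymentId`, which is what satisfies the same-model requirement.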
@@ -98,7 +111,7 @@ You must have **Cognitive Services AI User** permissions to deploy models in Azu

  ## Configure search engine access to Azure models

- For pipeline and query execution, this tutorial uses Micrsoft Entra ID for authentication and roles for authorization.
+ For pipeline and query execution, this tutorial uses Microsoft Entra ID for authentication and roles for authorization.

  Assign yourself and the search service identity permissions on Azure OpenAI. The code for this tutorial runs locally. Requests to Azure OpenAI originate from your system. Also, search results from the search engine are passed to Azure OpenAI. For these reasons, both you and the search service need permissions on Azure OpenAI.
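
Once the role assignments are in place, one illustrative smoke test (the deployment name and API version are placeholders, not the tutorial's values) is to call the Azure OpenAI embeddings endpoint with a Microsoft Entra bearer token instead of an API key:

```http
### Sketch: verify Entra ID access to Azure OpenAI (placeholder values)
POST https://{{your-openai-resource}}.openai.azure.com/openai/deployments/text-embedding-ada-002/embeddings?api-version=2024-02-01 HTTP/1.1
Content-Type: application/json
Authorization: Bearer {{entra-token}}

{
    "input": "sample chunk of grounding data"
}
```

The token is requested for the `https://cognitiveservices.azure.com/.default` scope. A success response confirms your access; role assignments can take several minutes to propagate.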