Commit 1888160

checkpoint

1 parent 91bca61 commit 1888160

5 files changed: +48 −25 lines changed

articles/search/toc.yml

Lines changed: 3 additions & 3 deletions
@@ -103,9 +103,9 @@ items:
   - name: Search and generate answers
     href: tutorial-rag-build-solution-query.md
   - name: Maximize relevance
-    href: tutorial-rag-build-solution-relevance.md
-  - name: Optimize (reduce storage and costs)
-    href: tutorial-rag-build-solution-optimize.md
+    href: tutorial-rag-build-solution-maximize-relevance.md
+  - name: Minimize storage and costs
+    href: tutorial-rag-build-solution-minimize-storage.md
   - name: Deploy and secure an app
     href: tutorial-rag-build-solution-app.md
   - name: Skills tutorials

articles/search/tutorial-rag-build-solution-index-schema.md

Lines changed: 34 additions & 15 deletions
@@ -27,47 +27,59 @@ In this tutorial, you:

 ## Prerequisites

-- [Visual Studio Code](https://code.visualstudio.com/download) with the [Python extension](https://marketplace.visualstudio.com/items?itemName=ms-python.python) and the [Jupyter package](https://pypi.org/project/jupyter/). For more information, see [Python in Visual Studio Code](https://code.visualstudio.com/docs/languages/python).
+[Visual Studio Code](https://code.visualstudio.com/download) with the [Python extension](https://marketplace.visualstudio.com/items?itemName=ms-python.python) and the [Jupyter package](https://pypi.org/project/jupyter/). For more information, see [Python in Visual Studio Code](https://code.visualstudio.com/docs/languages/python).

 The output of this exercise is an index definition in JSON. At this point, it's not uploaded to Azure AI Search, so there are no requirements for cloud services or permissions in this exercise.

 ## Review schema considerations for RAG

 In conversational search, LLMs compose the response that the user sees, not the search engine, so you don't need to think about what fields to show in your search results, and whether the representations of individual search documents are coherent to the user. Depending on the question, the LLM might return verbatim content from your index, or more likely, repackage the content for a better answer.

-To generate a response, LLMs operate on chunks of content, and while they need to know where the chunk came from for citation purposes, what matters most is the quality of message inputs and its relevance to the user's question. Whether the chunks come from one document or a thousand, the LLM ingests the information or *grounding data*, and formulates the response using instructions provided in a system prompt.
+### Focus on chunks

-A minimal index for LLM is designed to store chunks of content. It includes vector fields if you want similarity search for highly relevant results, and nonvector fields for human-readable inputs to the LLM for conversational search. Nonvector "chunked" content in the search results becomes the grounding data sent to the LLM.
+To generate a response, LLMs operate on chunks of content, and while they need to know where a chunk came from for citation purposes, what matters most is the quality of the message inputs and their relevance to the user's question. Whether the chunks come from one document or a thousand, the LLM ingests the information, or *grounding data*, and formulates the response using instructions provided in a system prompt.

-An index that works best for RAG workloads has these qualities:
+Chunks are the focus of the schema, and each chunk is the definitive element of a search document in a RAG pattern. You can think of your index as a large collection of chunks, as opposed to traditional search documents that have more structure, with fields containing uniform content for a name field, a description field, and a category field.
+
+A minimal index for an LLM is designed to store chunks of content. It includes vector fields if you want similarity search for highly relevant results, and nonvector fields for human-readable inputs to the LLM for conversational search. Nonvector chunked content in the search results becomes the grounding data sent to the LLM.

-- Returns chunks that are relevant to the query and readable to the LLM.
+### Checklist of schema considerations

-LLMs can handle a certain level of dirty data in chunked data, such as mark up, redundancy, and incomplete strings. While chunks need to be readable, they don't need to be pristine.
+An index that works best for RAG workloads has these qualities:

-- Maintains a relationship between chunks of a document and properties of the parent document. For example, file name, file type, title, author, and so forth. Chunks in search results could be pulled from anywhere in the index. Association with the parent document that provides the chunk is useful for context, citations, and follow up queries.
+- Returns chunks that are relevant to the query and readable to the LLM. LLMs can handle a certain level of dirty data in chunks, such as markup, redundancy, and incomplete strings. While chunks need to be readable and relevant to the query, they don't need to be pristine.

-- Accommodates the queries you want create. You should have fields for vector and hybrid content, and those fields should be attributed to support specific query behaviors. You can only query one index at a time (no joins) so your field collection should define all of your searchable content.
+- Maintains a parent-child relationship between chunks of a document and the properties of the parent document, such as the file name, file type, title, author, and so forth. To answer a query, chunks could be pulled from anywhere in the index. Association with the parent document providing the chunk is useful for context, citations, and follow-up queries.
+
+- Accommodates the queries you want to create. You should have fields for vector and hybrid content, and those fields should be attributed to support specific query behaviors. You can only query one index at a time (no joins), so your fields collection should define all of your searchable content.

 - Your schema should be flat (no complex types or structures). This requirement is specific to the RAG pattern in Azure AI Search.

-Although Azure AI Search can't join indexes, you can create indexes that preserve parent-child relationship, and then use nested queries or parallel queries in your search logic to pull from both. This exercise includes templates for both parent-child elements in the same index, or parent-child elements in separate indexes that are connected at query time through sequential queries to different indexes.
+Although Azure AI Search can't join indexes, you can create indexes that preserve parent-child relationships, and then use sequential or parallel queries in your search logic to pull from both. This exercise includes templates for parent-child elements in the same index and in separate indexes, where information from the parent index is retrieved using a lookup query.

 Schema design affects storage and costs. This exercise is focused on schema fundamentals. In the [Minimize storage and costs](tutorial-rag-build-solution-optimize.md) tutorial, we revisit schema design to consider narrow data types, attribution, and vector configurations that are more efficient.

-## Create a basic index
+### Sample content for this tutorial

-Here's a minimal index definition for RAG solutions that supports vector and hybrid search. If you aren't using vectors, the index can be even simpler (see [Quickstart: Generative search (RAG)](search-get-started-rag.md) for an example).
+The content you're indexing informs what fields are in the index.

-1. Open Visual Studio Code and create a new file. It doesn't have to be a Python file type for this exercise.
+In this tutorial, we use PDFs and content from the NASA Earth at Night ebook. The original ebook is large, over 100 pages and 35 MB in size. We broke it up into smaller PDFs, one per page of text, to stay under the REST API payload limit of 16 MB per API call.

-1. Review the following example for an introduction to required elements. In a RAG pattern, elements consist of a name, key field (`"id"`), non-vector chunks for the LLM, vector chunks for similarity search by the search engine, and a `vectorSearch` configuration for the vector fields.
+We omit image vectorization for this exercise.

-Vector fields have [specific types](/rest/api/searchservice/supported-data-types#edm-data-types-for-vector-fields) and extra attributes for embedding model dimensions and configuration. `Edm.Single` is a data type that works for the more commonly used LLMs. For more information about vector fields, see [Create a vector index](vector-search-how-to-create-index.md).
+The sample content is descriptive and informative. It also mentions places, regions, and countries across the world. We can include skills in our indexing pipeline that extract this information and load it into a queryable and filterable `locations` field.
+
+Because all of the chunks of text originate from the same parent (the Earth at Night ebook), we don't need a separate index dedicated to parent fields. If we were indexing from multiple parent PDFs, we would want a parent-child index pair to capture PDF-specific fields (path, title, authors, publication date, summary) and then send lookup queries to the parent index to retrieve the fields relevant to each chunk. We include an example of that parent-child index template in this exercise for comparison.
+
+## Create a basic index
+
+1. Open Visual Studio Code and create a new file. It doesn't have to be a Python file type for this exercise.
+
+1. Here's a minimal index definition for RAG solutions that supports vector and hybrid search. Review it for an introduction to required elements: a name, fields, and a `vectorSearch` configuration for the vector fields.

 ```json
 {
-  "name": "my-demo-index",
+  "name": "example-minimal-index",
   "fields": [
     { "name": "id", "type": "Edm.String", "key": true },
     { "name": "chunked_content", "type": "Edm.String", "searchable": true, "retrievable": true },
@@ -85,6 +97,13 @@ Here's a minimal index definition for RAG solutions that supports vector and hyb
   }
 ```

+   Fields must include a key field (`"id"`) and should include vector chunks for similarity search and nonvector chunks for the LLM. Metadata about the source file might include the file path, creation date, or content type.
+
+   Vector fields have [specific types](/rest/api/searchservice/supported-data-types#edm-data-types-for-vector-fields) and extra attributes for embedding model dimensions and configuration. `Edm.Single` is a data type that works for the more commonly used LLMs. For more information about vector fields, see [Create a vector index](vector-search-how-to-create-index.md).
+
+1. Here's the index schema for the tutorial and the NASA ebook content. It's similar to the basic schema, but adds a parent ID and metadata. It also includes fields for storing generated content that's created in the indexing pipeline.
+
 <!-- Objective:

 - Design an index schema that generates results in a format that works for LLMs.
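
The diff view truncates the JSON above, so the remaining fields and the `vectorSearch` section aren't visible in this commit view. As a rough sketch only, a chunk-centric index along these lines might look like the following. The extra field names (`parent_id`, `chunked_content_vector`, `locations`, `metadata_storage_path`), the 1536 dimensions (matching text-embedding-ada-002), and the HNSW algorithm and profile names are illustrative assumptions, not the schema this commit ships.

```json
{
  "name": "example-minimal-index",
  "fields": [
    { "name": "id", "type": "Edm.String", "key": true },
    { "name": "parent_id", "type": "Edm.String", "filterable": true, "retrievable": true },
    { "name": "chunked_content", "type": "Edm.String", "searchable": true, "retrievable": true },
    { "name": "chunked_content_vector", "type": "Collection(Edm.Single)", "searchable": true, "retrievable": false, "dimensions": 1536, "vectorSearchProfile": "hnsw-profile" },
    { "name": "locations", "type": "Collection(Edm.String)", "searchable": true, "filterable": true, "retrievable": true },
    { "name": "metadata_storage_path", "type": "Edm.String", "retrievable": true }
  ],
  "vectorSearch": {
    "algorithms": [
      { "name": "hnsw-algorithm", "kind": "hnsw" }
    ],
    "profiles": [
      { "name": "hnsw-profile", "algorithm": "hnsw-algorithm" }
    ]
  }
}
```

In this sketch, each chunk is its own search document: the nonvector `chunked_content` field supplies the grounding data sent to the LLM, the vector field supports similarity search, and `parent_id` keeps the link back to the source document for citations and follow-up queries.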
File renamed without changes.
File renamed without changes.
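
The new text also describes a parent-child index pair in which chunk results are followed by a lookup query against a separate parent index. As a hedged illustration of that second query (the field names `parent_id`, `title`, and `metadata_storage_path` are assumptions for this sketch, not the tutorial's published schema), a request body sent to the parent index's `docs/search` endpoint could look like this:

```json
{
  "search": "*",
  "filter": "parent_id eq 'earth_at_night_page_42'",
  "select": "parent_id, title, metadata_storage_path",
  "top": 1
}
```

The filter value would come from the `parent_id` stored on the chunk returned by the first query, so parent metadata can be attached to the citation.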

articles/search/tutorial-rag-build-solution-models.md

Lines changed: 11 additions & 7 deletions
@@ -30,23 +30,25 @@ If you don't have an Azure subscription, create a [free account](https://azure.m

 ## Prerequisites

-- Use the Azure portal to deploy models and configure role assignments in the Azure cloud.
+- The Azure portal, used to deploy models and configure role assignments in the Azure cloud.

-- An **Owner** role on your Azure subscription is necessary for creating role assignments. Your model provider imposes more permissions or requirements for deploying and accessing models.
+- An **Owner** role on your Azure subscription is necessary for creating role assignments. Your model provider has more role requirements for deploying and accessing models. Those are noted in the following steps.

-- A model provider, such as [Azure OpenAI](/azure/ai-services/openai/how-to/create-resource) (used in this tutorial), Azure AI Vision created using [a multi-service account](/azure/ai-services/multi-service-resource), or [Azure AI Studio](https://ai.azure.com/). We use Azure OpenAI in this tutorial, but list the other Azure resources so that you know your options.
+- A model provider, such as [Azure OpenAI](/azure/ai-services/openai/how-to/create-resource), Azure AI Vision created using [a multi-service account](/azure/ai-services/multi-service-resource), or [Azure AI Studio](https://ai.azure.com/).
+
+  We use Azure OpenAI in this tutorial, but list the other Azure resources so that you know your options for integrated embedding.

 - Azure AI Search, Basic tier or higher provides a [managed identity](search-howto-managed-identities-data-sources.md) used in role assignments.

-  To complete all of the tutorials in this series, the region must also support both Azure AI Search and the model provider. See supported regions for:
+  To complete all of the tutorials in this series, the region must support both Azure AI Search and the model provider. See supported regions for:

   - [Azure OpenAI regions](/azure/ai-services/openai/concepts/models#model-summary-table-and-region-availability)

   - [Azure AI Vision regions](/azure/ai-services/computer-vision/overview-image-analysis?tabs=4-0#region-availability)

   - [Azure AI Studio](/azure/ai-studio/reference/region-support) regions.

-  Azure AI Search is currently facing limited availability in some regions. Check the [Azure AI Search region list](search-region-support.md) to confirm availability.
+  Azure AI Search currently has limited availability in some regions, such as West Europe and West US 2/3. Check the [Azure AI Search region list](search-region-support.md) to confirm availability.

 > [!TIP]
 > Currently, the following regions provide the most overlap and have the most capacity: **East US**, **East US2**, and **South Central** in the Americas; **France Central** or **Switzerland North** in Europe; **Australia East** in Asia Pacific.
@@ -97,7 +99,7 @@ This tutorial series uses the following models and model providers:
 - Text-embedding-ada-02 on Azure OpenAI for embeddings
 - GPT-35-Turbo on Azure OpenAI for chat completion

-You must have **Cognitive Services AI User** permissions to deploy models in Azure OpenAI.
+You must have [**Cognitive Services OpenAI Contributor**](/azure/ai-services/openai/how-to/role-based-access-control#cognitive-services-openai-contributor) or higher to deploy models in Azure OpenAI.

 1. Go to [Azure OpenAI Studio](https://oai.azure.com/).
 1. Select **Deployments** on the left menu.
@@ -120,11 +122,13 @@ Assign yourself and the search service identity permissions on Azure OpenAI. The
 1. Find your Azure OpenAI resource.
 1. Select **Access control (IAM)** on the left menu.
 1. Select **Add role assignment**.
-1. Select **Cognitive Services OpenAI User**.
+1. Select [**Cognitive Services OpenAI User**](/azure/ai-services/openai/how-to/role-based-access-control#cognitive-services-openai-user).
 1. Select **Managed identity** and then select **Members**. Find the system-managed identity for your search service in the dropdown list.
 1. Next, select **User, group, or service principal** and then select **Members**. Search for your user account and then select it from the dropdown list.
 1. Select **Review and Assign** to create the role assignments.

+For access to models on Azure AI Vision, assign **Cognitive Services OpenAI User**. For Azure AI Studio, assign **Azure AI Developer**.
+
 ## Use non-Azure models for embeddings

 The pattern for integrating any embedding model is to wrap it in a custom skill and custom vectorizer. This section provides links to reference articles. For a code example that calls a non-Azure model, see [custom-embeddings demo](https://github.com/Azure/azure-search-vector-samples/blob/main/demo-python/code/custom-vectorizer/readme.md).
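
For the custom skill pattern mentioned in that last paragraph, the indexing-side wrapper is a `WebApiSkill` that posts chunk text to your own embedding endpoint during indexing. Here's a minimal sketch, assuming a hypothetical endpoint at `https://example.com/embed` that returns a `vector` property for each record; the skill name, context path, and output mapping are also assumptions rather than the linked demo's actual configuration:

```json
{
  "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
  "name": "custom-embedding-skill",
  "description": "Calls a non-Azure embedding model; the endpoint and response shape are assumptions for this sketch",
  "uri": "https://example.com/embed",
  "httpMethod": "POST",
  "timeout": "PT30S",
  "context": "/document/pages/*",
  "inputs": [
    { "name": "text", "source": "/document/pages/*" }
  ],
  "outputs": [
    { "name": "vector", "targetName": "text_vector" }
  ]
}
```

On the query side, a custom Web API vectorizer pointed at the same endpoint keeps query embeddings consistent with the indexed embeddings; the custom-embeddings demo linked above shows a working end-to-end version.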
