
Commit dd6d3e1

Merge pull request #657 from HeidiSteen/heidist-rag

[azure search] RAG tutorial updates per MattG updates

2 parents 7603dd7 + 0b3b5bc commit dd6d3e1

6 files changed (+241 −101 lines)

articles/search/tutorial-rag-build-solution-index-schema.md

Lines changed: 8 additions & 8 deletions
@@ -8,7 +8,7 @@ author: HeidiSteen
 ms.author: heidist
 ms.service: cognitive-search
 ms.topic: tutorial
-ms.date: 10/01/2024
+ms.date: 10/04/2024
 
 ---

@@ -43,7 +43,7 @@ Chunks are the focus of the schema, and each chunk is the defining element of a
 
 ### Enhanced with generated data
 
-In this tutorial, sample data consists of PDFs and content from the [NASA Earth Book](https://www.nasa.gov/ebooks/earth/). This content is descriptive and informative, with numerous references to geographies, countries, and areas across the world. All of the textual content is captured in chunks, but these recurring instances of place names create an opportunity for adding structure to the index. Using skills, it's possible to recognize entities in the text and capture them in an index for use in queries and filters. In this tutorial, we include an [entity recognition skill](cognitive-search-skill-entity-recognition-v3.md) that recognizes and extracts location entities, loading it into a searchable and filterable `locations` field. Adding structured content to your index gives you more options for filtering, improved relevance, and more focused answers.
+In this tutorial, sample data consists of PDFs and content from the [NASA Earth Book](https://www.nasa.gov/ebooks/earth/). This content is descriptive and informative, with numerous references to geographies, countries, and areas across the world. All of the textual content is captured in chunks, but recurring instances of place names create an opportunity for adding structure to the index. Using skills, it's possible to recognize entities in the text and capture them in an index for use in queries and filters. In this tutorial, we include an [entity recognition skill](cognitive-search-skill-entity-recognition-v3.md) that recognizes and extracts location entities, loading them into a searchable and filterable `locations` field. Adding structured content to your index gives you more options for filtering, improved relevance, and more focused answers.
 
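As a sketch of how a `locations` field like this gets populated, a V3 entity recognition skill restricted to location entities might be defined as follows. The `context` and `source` paths (`/document/pages/*`) are assumptions based on a typical chunking layout, not taken from this commit:

```json
{
  "@odata.type": "#Microsoft.Skills.Text.V3.EntityRecognitionSkill",
  "description": "Extract location entities from each chunk",
  "categories": [ "Location" ],
  "context": "/document/pages/*",
  "inputs": [
    { "name": "text", "source": "/document/pages/*" }
  ],
  "outputs": [
    { "name": "locations", "targetName": "locations" }
  ]
}
```

Limiting `categories` to `Location` keeps the skill from emitting person and organization entities the index doesn't use.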
 ### Parent-child fields in one or two indexes?

@@ -61,11 +61,11 @@ In Azure AI Search, an index that works best for RAG workloads has these qualities
 
 - Maintains a parent-child relationship between chunks of a document and the properties of the parent document, such as the file name, file type, title, author, and so forth. To answer a query, chunks could be pulled from anywhere in the index. Association with the parent document providing the chunk is useful for context, citations, and follow-up queries.
 
-- Accommodates the queries you want create. You should have fields for vector and hybrid content, and those fields should be attributed to support specific query behaviors. You can only query one index at a time (no joins) so your fields collection should define all of your searchable content.
+- Accommodates the queries you want to create. You should have fields for vector and hybrid content, and those fields should be attributed to support specific query behaviors, such as searchable or filterable. You can query only one index at a time (no joins), so your fields collection should define all of your searchable content.
 
 - Your schema should be flat (no complex types or structures). This requirement is specific to the RAG pattern in Azure AI Search.
 
-Although Azure AI Search can't join indexes, you can create indexes that preserve parent-child relationship, and then use sequential or parallel queries in your search logic to pull from both. This exercise includes templates for parent-child elements in the same index and in separate indexes, where information from the parent index is retrieved using a lookup query.
+<!-- Although Azure AI Search can't join indexes, you can create indexes that preserve parent-child relationships, and then use sequential queries in your search logic to pull from both (a query on the chunked data index, a lookup on the parent index). This exercise includes templates for parent-child elements in the same index and in separate indexes, where information from the parent index is retrieved using a lookup query. -->
 
 <!-- > [!NOTE]
 > Schema design affects storage and costs. This exercise is focused on schema fundamentals. In the [Minimize storage and costs](tutorial-rag-build-solution-minimize-storage.md) tutorial, you revisit schema design to consider narrow data types, attribution, and vector configurations that are more efficient. -->
@@ -136,7 +136,7 @@ A minimal index for LLM is designed to store chunks of content. It typically includes
     SearchField(name="locations", type=SearchFieldDataType.Collection(SearchFieldDataType.String), filterable=True),
     SearchField(name="chunk_id", type=SearchFieldDataType.String, key=True, sortable=True, filterable=True, facetable=True, analyzer_name="keyword"),
     SearchField(name="chunk", type=SearchFieldDataType.String, sortable=False, filterable=False, facetable=False),
-    SearchField(name="text_vector", type=SearchFieldDataType.Collection(SearchFieldDataType.Single), vector_search_dimensions=1536, vector_search_profile_name="myHnswProfile")
+    SearchField(name="text_vector", type=SearchFieldDataType.Collection(SearchFieldDataType.Single), vector_search_dimensions=1024, vector_search_profile_name="myHnswProfile")
 ]
 
 # Configure the vector search configuration
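One detail behind the 1536 → 1024 change above: `vector_search_dimensions` must match the output size of the embedding model. The `text-embedding-3-large` model that this commit swaps in emits 3072 dimensions by default but accepts a `dimensions` parameter that truncates the embedding, so a value of 1024 works only if the embedding calls request it too. A minimal sanity check, with assumed variable names:

```python
# Hypothetical settings for illustration; the two values must agree or
# indexing and vector queries fail with a dimension-mismatch error.
vector_search_dimensions = 1024  # from the SearchField definition above
embedding_dimensions = 1024      # `dimensions` passed to the embeddings call

assert vector_search_dimensions == embedding_dimensions, (
    "vector_search_dimensions must match the embedding model's output size"
)
print("dimensions match:", vector_search_dimensions)
```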
@@ -157,8 +157,8 @@ A minimal index for LLM is designed to store chunks of content. It typically includes
             kind="azureOpenAI",
             parameters=AzureOpenAIVectorizerParameters(
                 resource_url=AZURE_OPENAI_ACCOUNT,
-                deployment_name="text-embedding-ada-002",
-                model_name="text-embedding-ada-002"
+                deployment_name="text-embedding-3-large",
+                model_name="text-embedding-3-large"
             ),
         ),
     ],
@@ -170,7 +170,7 @@ A minimal index for LLM is designed to store chunks of content. It typically includes
     print(f"{result.name} created")
 ```
 
-1. For an index schema that more closely mimics structured content, you would have separate indexes for parent and child (chunked) fields. You would need index projections to coordinate the indexing of the two indexes simultaneously. Queries execute against the child index. Query logic includes a lookup query, using the parent_id to retrieve content from the parent index.
+1. For an index schema that more closely mimics structured content, you would have separate indexes for parent and child (chunked) fields. You would need [index projections](index-projections-concept-intro.md) to coordinate the indexing of the two indexes simultaneously. Queries execute against the child index. Query logic includes a lookup query, using the parent_id to retrieve content from the parent index.
 
 Fields in the child index:
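To make the sequential-query pattern concrete, here's a minimal runnable sketch that simulates the two-index lookup with in-memory structures. In a real solution the stand-ins would be `SearchClient.search()` on the chunk index and `SearchClient.get_document()` (a key lookup) on the parent index; the field names mirror the tutorial schema but all sample data here is hypothetical:

```python
# In-memory stand-ins for the child (chunk) index and parent index.
chunk_index = [
    {"chunk_id": "1", "parent_id": "doc-a", "chunk": "Cloud streets form over the Bering Strait."},
    {"chunk_id": "2", "parent_id": "doc-b", "chunk": "Ship tracks appear off the coast of California."},
]
parent_index = {
    "doc-a": {"parent_id": "doc-a", "title": "page-21.pdf"},
    "doc-b": {"parent_id": "doc-b", "title": "page-31.pdf"},
}

def search_chunks(query: str):
    # Stand-in for a search against the child index.
    return [doc for doc in chunk_index if query.lower() in doc["chunk"].lower()]

def lookup_parent(parent_id: str):
    # Stand-in for a key lookup against the parent index.
    return parent_index[parent_id]

# Step 1: query the chunk index; step 2: look up each chunk's parent for citations.
results = search_chunks("cloud streets")
citations = [lookup_parent(doc["parent_id"])["title"] for doc in results]
print(citations)
```

The same two-step shape applies with real clients: the first call returns ranked chunks, the second resolves each `parent_id` to document-level metadata for context and citations.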

articles/search/tutorial-rag-build-solution-maximize-relevance.md

Lines changed: 93 additions & 6 deletions
@@ -27,7 +27,7 @@ In this tutorial, you modify the existing search index and queries to use:
 This tutorial updates the search index created by the [indexing pipeline](tutorial-rag-build-solution-pipeline.md). Updates don't affect the existing content, so no rebuild is necessary and you don't need to rerun the indexer.
 
 > [!NOTE]
-> There are more relevance features in preview, including vector query weighting and setting minimum thresholds, but we omit them from this tutorial becaues they aren't yet available in the Azure SDK for Python.
+> There are more relevance features in preview, including vector query weighting and setting minimum thresholds, but we omit them from this tutorial because they're in preview.
 
 ## Prerequisites

@@ -41,9 +41,78 @@ This tutorial updates the search index created by the [indexing pipeline](tutorial-rag-build-solution-pipeline.md).
 
 The [sample notebook](https://github.com/Azure-Samples/azure-search-python-samples/blob/main/Tutorial-RAG/Tutorial-rag.ipynb) includes an updated index and query request.
 
+## Run a baseline query for comparison
+
+Let's start with a new query: "Are there any cloud formations specific to oceans and large bodies of water?"
+
+To compare outcomes after adding relevance features, run the query against the existing index schema, before you add semantic ranking or a scoring profile.
+
+```python
+from azure.identity import get_bearer_token_provider
+from azure.search.documents import SearchClient
+from azure.search.documents.models import VectorizableTextQuery
+from openai import AzureOpenAI
+
+token_provider = get_bearer_token_provider(credential, "https://cognitiveservices.azure.com/.default")
+openai_client = AzureOpenAI(
+    api_version="2024-06-01",
+    azure_endpoint=AZURE_OPENAI_ACCOUNT,
+    azure_ad_token_provider=token_provider
+)
+
+deployment_name = "gpt-4o"
+
+search_client = SearchClient(
+    endpoint=AZURE_SEARCH_SERVICE,
+    index_name=index_name,
+    credential=credential
+)
+
+GROUNDED_PROMPT="""
+You are an AI assistant that helps users learn from the information found in the source material.
+Answer the query using only the sources provided below.
+Use bullets if the answer has multiple points.
+If the answer is longer than 3 sentences, provide a summary.
+Answer ONLY with the facts listed in the list of sources below. Cite your source when you answer the question.
+If there isn't enough information below, say you don't know.
+Do not generate answers that don't use the sources below.
+Query: {query}
+Sources:\n{sources}
+"""
+
+# Focused query on cloud formations and bodies of water
+query="Are there any cloud formations specific to oceans and large bodies of water?"
+vector_query = VectorizableTextQuery(text=query, k_nearest_neighbors=50, fields="text_vector")
+
+search_results = search_client.search(
+    search_text=query,
+    vector_queries=[vector_query],
+    select=["title", "chunk", "locations"],
+    top=5,
+)
+
+sources_formatted = "=================\n".join([f'TITLE: {document["title"]}, CONTENT: {document["chunk"]}, LOCATIONS: {document["locations"]}' for document in search_results])
+
+response = openai_client.chat.completions.create(
+    messages=[
+        {
+            "role": "user",
+            "content": GROUNDED_PROMPT.format(query=query, sources=sources_formatted)
+        }
+    ],
+    model=deployment_name
+)
+
+print(response.choices[0].message.content)
+```
+
+Output from this request might look like the following example.
+
+```
+Yes, there are cloud formations specific to oceans and large bodies of water. A notable example is "cloud streets," which are parallel rows of clouds that form over the Bering Strait in the Arctic Ocean. These cloud streets occur when wind blows from a cold surface like sea ice over warmer, moister air near the open ocean, leading to the formation of spinning air cylinders. Clouds form along the upward cycle of these cylinders, while skies remain clear along the downward cycle (Source: page-21.pdf).
+```
+
 ## Update the index for semantic ranking and scoring profiles
 
-In a previous tutorial, you [designed an index schema](tutorial-rag-build-solution-index-schema.md) for RAG workloads. We purposely omitted relevance enhancements from that schema so that you could focus on the fundamentals. Deferring relevance to a separate exercise also gives you a before-and-after comparison of the quality of search results after the updates are made.
+In a previous tutorial, you [designed an index schema](tutorial-rag-build-solution-index-schema.md) for RAG workloads. We purposely omitted relevance enhancements from that schema so that you could focus on the fundamentals. Deferring relevance to a separate exercise gives you a before-and-after comparison of the quality of search results after the updates are made.
 
 1. Update the import statements to include classes for semantic ranking and scoring profiles.

@@ -138,7 +207,7 @@ openai_client = AzureOpenAI(
     azure_ad_token_provider=token_provider
 )
 
-deployment_name = "gpt-35-turbo"
+deployment_name = "gpt-4o"
 
 search_client = SearchClient(
     endpoint=AZURE_SEARCH_SERVICE,
@@ -160,8 +229,8 @@ Sources:\n{sources}
 """
 
 # Queries are unchanged in this update
-query="how much of earth is covered by water"
-vector_query = VectorizableTextQuery(text=query, k_nearest_neighbors=1, fields="text_vector", exhaustive=True)
+query="Are there any cloud formations specific to oceans and large bodies of water?"
+vector_query = VectorizableTextQuery(text=query, k_nearest_neighbors=50, fields="text_vector")
 
 # Add query_type semantic and semantic_configuration_name
 # Add scoring_profile and scoring_parameters
@@ -175,7 +244,7 @@ search_results = search_client.search(
     select="title, chunk, locations",
     top=5,
 )
-sources_formatted = "\n".join([f'{document["title"]}:{document["chunk"]}:{document["locations"]}' for document in search_results])
+sources_formatted = "=================\n".join([f'TITLE: {document["title"]}, CONTENT: {document["chunk"]}, LOCATIONS: {document["locations"]}' for document in search_results])
 
 response = openai_client.chat.completions.create(
     messages=[
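The reformatted grounding string in the change above labels each field and delimits each source, which makes the boundaries between documents unambiguous to the LLM. A runnable sketch with made-up results shows what the new format produces:

```python
# Hypothetical search results standing in for documents returned by the query.
search_results = [
    {"title": "page-21.pdf", "chunk": "Cloud streets are parallel rows of clouds.", "locations": ["Bering Strait"]},
    {"title": "page-31.pdf", "chunk": "Ship tracks are narrow clouds from ship exhaust.", "locations": ["California"]},
]

# Same formatting expression as the updated tutorial code: labeled fields,
# joined with a visible separator line between sources.
sources_formatted = "=================\n".join(
    [f'TITLE: {document["title"]}, CONTENT: {document["chunk"]}, LOCATIONS: {document["locations"]}'
     for document in search_results]
)
print(sources_formatted)
```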
@@ -190,6 +259,24 @@ response = openai_client.chat.completions.create(
 print(response.choices[0].message.content)
 ```
 
+Output from a semantically ranked and boosted query might look like the following example.
+
+```
+Yes, there are specific cloud formations influenced by oceans and large bodies of water:
+
+- **Stratus Clouds Over Icebergs**: Low stratus clouds can frame holes over icebergs, such as Iceberg A-56 in the South Atlantic Ocean, likely due to thermal instability caused by the iceberg (source: page-39.pdf).
+
+- **Undular Bores**: These are wave structures in the atmosphere created by the collision of cool, dry air from a continent with warm, moist air over the ocean, as seen off the coast of Mauritania (source: page-23.pdf).
+
+- **Ship Tracks**: These are narrow clouds formed by water vapor condensing around tiny particles from ship exhaust. They are observed over the oceans, such as in the Pacific Ocean off the coast of California (source: page-31.pdf).
+
+These specific formations are influenced by unique interactions between atmospheric conditions and the presence of large water bodies or objects within them.
+```
+
+Adding semantic ranking and scoring profiles positively affects the response from the LLM by promoting results that meet scoring criteria and are semantically relevant.
+
+Now that you have a better understanding of index and query design, let's move on to optimizing for speed and concision. We revisit the schema definition to implement quantization and storage reduction, but the rest of the pipeline and models remain intact.
+
 <!-- ## Update queries for minimum thresholds ** NOT AVAILABLE IN PYTHON SDK
 
 Keyword search only returns results if there's a match found in the index, up to a maximum of 50 results by default. In contrast, vector search returns `k` results every time, even if the matching vectors aren't a close match.
