
Commit 7e38ed3

Merge pull request #264299 from HeidiSteen/heidist-docs
[azure search] RAG quickstart
2 parents 09b87bb + e362d95 commit 7e38ed3

File tree: 7 files changed, +133 −1 lines changed


articles/search/TOC.yml

Lines changed: 2 additions & 0 deletions
    @@ -19,6 +19,8 @@
           href: search-get-started-text.md
         - name: Semantic ranking
           href: search-get-started-semantic.md
    +    - name: Retrieval Augmented Generation (RAG)
    +      href: search-get-started-retrieval-augmented-generation.md
         - name: Portal
           items:
           - name: Create an index

articles/search/index.yml

Lines changed: 3 additions & 1 deletion
    @@ -46,10 +46,12 @@ landingContent:
             url: vector-search-integrated-vectorization.md
           - text: Retrieval Augmented Generation (RAG)
             url: retrieval-augmented-generation-overview.md
    -      - linkListType: how-to-guide
    +      - linkListType: quickstart
            links:
              - text: Create a vector store
                url: search-get-started-vector.md
    +          - text: Chat with your data
    +            url: search-get-started-retrieval-augmented-generation.md
              - text: Query a vector store
                url: vector-search-how-to-query.md
          - linkListType: sample
Four binary image files added (39.9 KB, 48.1 KB, 21.8 KB, 60.2 KB).
Lines changed: 128 additions & 0 deletions
---
title: 'Quickstart: RAG app'
titleSuffix: Azure AI Search
description: Use Azure OpenAI Studio to chat with a search index on Azure AI Search. Explore the Retrieval Augmented Generation (RAG) pattern for your search solution.

author: HeidiSteen
ms.author: heidist
ms.service: cognitive-search
ms.custom:
ms.topic: quickstart
ms.date: 01/25/2024
---

# Quickstart: Chat with your search index in Azure OpenAI Studio

Get started with generative search using Azure OpenAI Studio's **Add your own data** option to implement a Retrieval Augmented Generation (RAG) experience powered by Azure AI Search.

**Add your own data** gives you built-in data preprocessing (text extraction and cleanup), data chunking, embedding, and indexing. You can stand up a chat experience quickly, experiment with prompts over your own data, and gain important insights into how your content performs before writing any code.
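The chunking step is handled for you, but the idea is easy to sketch. The following is a minimal illustration of fixed-size chunking with overlap, not the Studio's actual implementation (the real pipeline measures chunks in tokens rather than characters, and its boundaries may differ):

```python
def chunk_text(text: str, chunk_size: int = 1024, overlap: int = 128) -> list[str]:
    """Split text into fixed-size chunks with overlap, so content cut at a
    boundary still appears intact in the neighboring chunk."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# A 2,000-character document with 1,024-character chunks and
# 128 characters of overlap yields 3 chunks.
print(len(chunk_text("x" * 2000)))  # → 3
```

Overlap is the design choice to notice: it trades a little index size for not losing sentences that straddle a chunk boundary.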

In this quickstart:

> [!div class="checklist"]
> + Deploy Azure OpenAI models
> + Download sample PDFs
> + Configure data processing
> + Chat with your data in the Azure OpenAI Studio playground
> + Test your index with different chat models, configurations, and history

## Prerequisites

+ [An Azure subscription](https://azure.microsoft.com/free/)

+ [Azure OpenAI](https://aka.ms/oai/access)

+ [Azure Storage](/azure/storage/common/storage-account-create)

+ [Azure AI Search](search-create-app-portal.md), in any region, on a billable tier (Basic and above), preferably with [semantic ranking enabled](semantic-how-to-enable-disable.md)

+ Contributor permissions in the Azure subscription for creating resources

## Set up model deployments

1. Start [Azure OpenAI Studio](https://oai.azure.com/portal).

1. Sign in, select your Azure subscription and Azure OpenAI resource, and then select **Use resource**.

1. Under **Management > Deployments**, find or create a deployment for each of the following models:

   + [text-embedding-ada-002](/azure/ai-services/openai/concepts/models#embeddings)
   + [gpt-35-turbo](/azure/ai-services/openai/concepts/models#gpt-35)

   Deploy more chat models if you want to test them with your data. Note that Text-Davinci-002 isn't supported.

   If you create new deployments, the default configurations are suited to this quickstart. It's helpful to name each deployment after its model: for example, use "text-embedding-ada-002" as the deployment name for the text-embedding-ada-002 model.
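Matching names is convenient because, when you later call a deployment from code, the deployment name (not the underlying model name) is what appears in the request URL. A sketch of the Azure OpenAI embeddings request shape; the resource name is hypothetical, and the API version is an assumption:

```python
def embeddings_request(resource: str, deployment: str, text: str,
                       api_version: str = "2023-05-15") -> tuple[str, dict]:
    """Build the URL and JSON body for an Azure OpenAI embeddings call.
    The {deployment} path segment is the deployment name you chose."""
    url = (f"https://{resource}.openai.azure.com/openai/deployments/"
           f"{deployment}/embeddings?api-version={api_version}")
    return url, {"input": text}

url, body = embeddings_request("my-openai-resource", "text-embedding-ada-002",
                               "The Gettysburg Address")
print(url)
```

If the deployment name matches the model name, the URL reads exactly like the model you expect it to call.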

## Generate a vector store for the playground

1. Download the sample famous-speeches-pdf PDFs from [azure-search-sample-data](https://github.com/Azure-Samples/azure-search-sample-data/tree/main/famous-speeches-pdf).

1. Sign in to [Azure OpenAI Studio](https://oai.azure.com/portal).

1. On the **Chat** page under **Playground**, select **Add your data (preview)**.

1. Select **Add data source**.

1. From the dropdown list, select **Upload files**.

   :::image type="content" source="media/search-get-started-rag/azure-openai-data-source.png" lightbox="media/search-get-started-rag/azure-openai-data-source.png" alt-text="Screenshot of the upload files option.":::

1. In **Data source**, select your Azure Blob Storage resource. Enable cross-origin resource sharing (CORS) if prompted.

1. Select your Azure AI Search resource.

1. Provide an index name that's unique in your search service.

1. Check **Add vector search to this search index**.

1. Select **Azure OpenAI - text-embedding-ada-002**.

1. Check the acknowledgment that Azure AI Search is a billable service. If you're using an existing search service, there's no extra charge for the vector store unless you add semantic ranking. If you're creating a new service, Azure AI Search becomes billable upon service creation.

1. Select **Next**.

1. In **Upload files**, select the four files and then select **Upload**.

1. Select **Next**.

1. In **Data management**, choose **Hybrid + semantic** if [semantic ranking is enabled](semantic-how-to-enable-disable.md) on your search service. If semantic ranking is disabled, choose **Hybrid (vector + keyword)**. Hybrid is the better choice because vector (similarity) search and keyword search run over the same query input in parallel, which can produce a more relevant response.

   :::image type="content" source="media/search-get-started-rag/azure-openai-data-manage.png" lightbox="media/search-get-started-rag/azure-openai-data-manage.png" alt-text="Screenshot of the data management options.":::

1. Acknowledge that vectorization of the sample data is billed at the usage rate of the Azure OpenAI embedding model.

1. Select **Next**, and then select **Review and Finish**.
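Under the hood of a hybrid query, a single Azure AI Search request carries both the keyword text and the query vector. The following sketch builds that POST body; the vector field name, `k` value, and semantic configuration name are illustrative assumptions, and the vector itself would be the ada-002 embedding (1,536 dimensions) of the query text:

```python
def hybrid_query(text, vector, vector_field="contentVector",
                 k=50, semantic_config=None):
    """Build the POST body for an Azure AI Search hybrid query:
    keyword (BM25) search and vector search run in parallel over
    the same input, and their results are fused."""
    body = {
        "search": text,  # keyword side of the hybrid query
        "vectorQueries": [{
            "kind": "vector",
            "vector": vector,        # embedding of the same query text
            "fields": vector_field,  # hypothetical vector field name
            "k": k,
        }],
    }
    if semantic_config:  # Hybrid + semantic: rerank the fused results
        body["queryType"] = "semantic"
        body["semanticConfiguration"] = semantic_config
    return body

body = hybrid_query("who gave the Gettysburg speech", [0.0] * 1536,
                    semantic_config="default")
print(sorted(body.keys()))
```

Running the same input down both paths is what makes hybrid robust: keyword search catches exact terms, while vector search catches paraphrases.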

## Chat with your data

1. Review the advanced settings that determine how much flexibility the chat model has in supplementing the grounding data, and how many chunks are returned from the query to the vector store.

   Strictness determines whether the model supplements the query with its own information. A level of 5 means no supplementation: only your grounding data is used, which means the search engine plays a large role in the quality of the response. Semantic ranking can be helpful in this scenario because the ranking models do a better job of interpreting the intent of the query.

   Lower levels of strictness produce more verbose answers, but might also include information that isn't in your index.

   :::image type="content" source="media/search-get-started-rag/azure-openai-studio-advanced-settings.png" alt-text="Screenshot of the advanced settings.":::

1. Start with these settings:

   + Check the **Limit responses to your data content** option.
   + Set strictness to 3.
   + Set retrieved documents to 20. Given chunk sizes of 1,024 tokens, a setting of 20 gives you roughly 20,000 tokens to use for generating responses. The tradeoff is query latency, but you can experiment with chat replay to find the right balance.

1. Send your first query. The chat models perform best in question-and-answer exercises. For example, "who gave the Gettysburg speech" or "when was the Gettysburg speech delivered".

   More complex queries, such as "why was Gettysburg important", perform better if the model has some latitude to answer (lower levels of strictness) or if semantic ranking is enabled.

   Queries that require deeper analysis, such as "how many speeches are in the vector store", might fail to return a response. In RAG pattern chat scenarios, information retrieval is keyword and similarity search against the query string, where the search engine looks for chunks having exact or similar terms, phrases, or construction. The payload might have insufficient data for the model to work with.

   Finally, chats are constrained by the number of documents (chunks) returned in the response (limited to 3-20 in the Azure OpenAI Studio playground). As you can imagine, posing a question about "all of the titles" requires a full scan of the entire vector store, which calls for a different approach, or for modifying the generated code to allow [exhaustive search](vector-search-how-to-create-index.md#add-a-vector-search-configuration) in the vector search configuration.

   :::image type="content" source="media/search-get-started-rag/chat-results.png" lightbox="media/search-get-started-rag/chat-results.png" alt-text="Screenshot of a chat session.":::
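The token budget behind the retrieved-documents setting is simple arithmetic, useful when sizing other configurations. This sketch uses the figures from this quickstart (1,024-token chunks):

```python
def grounding_token_budget(retrieved_documents: int,
                           chunk_size_tokens: int = 1024) -> int:
    """Approximate number of grounding tokens sent to the chat model:
    one chunk of fixed size per retrieved document."""
    return retrieved_documents * chunk_size_tokens

# 20 chunks of 1,024 tokens each is roughly 20,000 tokens of grounding data.
print(grounding_token_budget(20))  # → 20480
```

Whatever budget you choose has to fit, together with the prompt and the response, inside the chat model's context window, which is why lowering retrieved documents is the first lever when queries hit limits or run slowly.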

## Next steps

Now that you're familiar with the benefits of Azure OpenAI Studio for scenario testing, review code samples that demonstrate the full range of APIs for RAG applications. Samples are available in [Python](https://github.com/Azure/azure-search-vector-samples/tree/main/demo-python), [C#](https://github.com/Azure/azure-search-vector-samples/tree/main/demo-dotnet), and [JavaScript](https://github.com/Azure/azure-search-vector-samples/tree/main/demo-javascript).
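At their core, those samples implement the same loop the playground runs for you: retrieve chunks, then pass them to the chat model as grounding data. A minimal, framework-free sketch of the prompt-assembly step; the message format follows the common chat-completions shape, and the chunk contents and `[docN]` labels are illustrative:

```python
def build_grounded_messages(question: str, chunks: list[str]) -> list[dict]:
    """Assemble chat messages that ground the model in retrieved chunks.
    Sources go in the system message, and the model is instructed to
    answer only from them (comparable to a strictness setting of 5)."""
    sources = "\n\n".join(f"[doc{i + 1}]: {c}" for i, c in enumerate(chunks))
    system = ("Answer using only the sources below. If the sources don't "
              "contain the answer, say you don't know.\n\n" + sources)
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

messages = build_grounded_messages(
    "Who gave the Gettysburg speech?",
    ["The Gettysburg Address was delivered by Abraham Lincoln in 1863."],
)
print(messages[1])  # → {'role': 'user', 'content': 'Who gave the Gettysburg speech?'}
```

In a full application, the `chunks` list would come from a hybrid query against your search index, and the returned messages would be sent to your gpt-35-turbo deployment.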

## Clean up

Azure AI Search is a billable resource for as long as the service exists. If it's no longer needed, delete it from your subscription to avoid charges.
