Commit 43fac0c

Merge pull request #266773 from aahill/rag-article
merging RAG article
2 parents 9f716ad + 5822f58 commit 43fac0c

12 files changed: +44 -28 lines

articles/ai-services/openai/concepts/use-your-data.md

Lines changed: 4 additions & 4 deletions
@@ -271,9 +271,9 @@ You can modify the following additional settings in the **Data parameters** sect
|Parameter name | Description |
|---------|---------|
- | **Limit responses to your data** | This flag configures the chatbot's approach to handling queries unrelated to the data source or when search documents are insufficient for a complete answer. When this setting is disabled, the model supplements its responses with its own knowledge in addition to your documents. When this setting is enabled, the model attempts to only rely on your documents for responses. This is the `inScope` parameter in the API. |
- |**Retrieved documents** | This parameter is an integer that can be set to 3, 5, 10, or 20, and controls the number of document chunks provided to the large language model for formulating the final response. By default, this is set to 5. The search process can be noisy and sometimes, due to chunking, relevant information might be spread across multiple chunks in the search index. Selecting a top-K number, like 5, ensures that the model can extract relevant information, despite the inherent limitations of search and chunking. However, increasing the number too high can potentially distract the model. Additionally, the maximum number of documents that can be effectively used depends on the version of the model, as each has a different context size and capacity for handling documents. If you find that responses are missing important context, try increasing this parameter. This is the `topNDocuments` parameter in the API. |
- | **Strictness** | Determines the system's aggressiveness in filtering search documents based on their similarity scores. The system queries Azure Search or other document stores, then decides which documents to provide to large language models like ChatGPT. Filtering out irrelevant documents can significantly enhance the performance of the end-to-end chatbot. Some documents are excluded from the top-K results if they have low similarity scores before forwarding them to the model. This is controlled by an integer value ranging from 1 to 5. Setting this value to 1 means that the system will minimally filter documents based on search similarity to the user query. Conversely, a setting of 5 indicates that the system will aggressively filter out documents, applying a very high similarity threshold. If you find that the chatbot omits relevant information, lower the filter's strictness (set the value closer to 1) to include more documents. Conversely, if irrelevant documents distract the responses, increase the threshold (set the value closer to 5). This is the `strictness` parameter in the API. |
+ | **Limit responses to your data** | This flag configures the chatbot's approach to handling queries unrelated to the data source or when search documents are insufficient for a complete answer. When this setting is disabled, the model supplements its responses with its own knowledge in addition to your documents. When this setting is enabled, the model attempts to only rely on your documents for responses. This is the `inScope` parameter in the API, and is set to true by default. |
+ |**Retrieved documents** | This parameter is an integer that can be set to 3, 5, 10, or 20, and controls the number of document chunks provided to the large language model for formulating the final response. By default, this is set to 5. The search process can be noisy and sometimes, due to chunking, relevant information might be spread across multiple chunks in the search index. Selecting a top-K number, like 5, ensures that the model can extract relevant information, despite the inherent limitations of search and chunking. However, increasing the number too high can potentially distract the model. Additionally, the maximum number of documents that can be effectively used depends on the version of the model, as each has a different context size and capacity for handling documents. If you find that responses are missing important context, try increasing this parameter. This is the `topNDocuments` parameter in the API, and is 5 by default. |
+ | **Strictness** | Determines the system's aggressiveness in filtering search documents based on their similarity scores. The system queries Azure Search or other document stores, then decides which documents to provide to large language models like ChatGPT. Filtering out irrelevant documents can significantly enhance the performance of the end-to-end chatbot. Some documents are excluded from the top-K results if they have low similarity scores before forwarding them to the model. This is controlled by an integer value ranging from 1 to 5. Setting this value to 1 means that the system will minimally filter documents based on search similarity to the user query. Conversely, a setting of 5 indicates that the system will aggressively filter out documents, applying a very high similarity threshold. If you find that the chatbot omits relevant information, lower the filter's strictness (set the value closer to 1) to include more documents. Conversely, if irrelevant documents distract the responses, increase the threshold (set the value closer to 5). This is the `strictness` parameter in the API, and is set to 3 by default. |
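These three settings surface in the request body as fields of the data source's `parameters` object. The following is a minimal Python sketch of where they sit, assuming the camelCase `dataSources` extensions request format shown in the Python and REST changes later in this commit; the endpoint path, API version, and search-resource environment variable names are illustrative assumptions, not fixed requirements.

```python
# Hedged sketch: shows where inScope, topNDocuments, and strictness sit in an
# On Your Data request body. The extensions path, api-version, and the search
# environment variable names are assumptions for illustration only.
import os
import requests

endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]        # e.g. https://<resource>.openai.azure.com
deployment = os.environ["AZURE_OPENAI_DEPLOYMENT_ID"]
url = (
    f"{endpoint}/openai/deployments/{deployment}/extensions/chat/completions"
    "?api-version=2023-08-01-preview"                 # assumed preview version; adjust as needed
)

body = {
    "messages": [{"role": "user", "content": "What are my available health plans?"}],
    "dataSources": [
        {
            "type": "AzureCognitiveSearch",
            "parameters": {
                "endpoint": os.environ["AZURE_AI_SEARCH_ENDPOINT"],   # assumed variable name
                "key": os.environ["AZURE_AI_SEARCH_API_KEY"],         # assumed variable name
                "indexName": os.environ["AZURE_AI_SEARCH_INDEX"],     # assumed variable name
                "inScope": True,        # Limit responses to your data (default: true)
                "topNDocuments": 5,     # Retrieved documents (default: 5)
                "strictness": 3,        # Strictness (default: 3)
            },
        }
    ],
}

resp = requests.post(url, json=body, headers={"api-key": os.environ["AZURE_OPENAI_API_KEY"]})
print(resp.json()["choices"][0]["message"]["content"])
```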

### System message

@@ -422,7 +422,7 @@ When you chat with a model, providing a history of the chat will help the model
Azure OpenAI On Your Data is a Retrieval Augmented Generation (RAG) service that leverages both a search service (such as Azure AI Search) and generation (Azure OpenAI models) to let users get answers to their questions based on provided data.

- As part of this RAG pipeline, there are are three steps at a high-level:
+ As part of this RAG pipeline, there are three steps at a high level:

1. Reformulate the user query into a list of search intents. This is done by making a call to the model with a prompt that includes instructions, the user question, and conversation history. Let's call this an *intent prompt*.
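Only the first of the three steps appears in this hunk. As a rough, illustrative sketch of what such an intent-reformulation call could look like (the service's actual intent prompt is internal; the prompt wording, deployment name, and output format below are assumptions):

```python
# Illustrative only: instructions + conversation history + the latest question are
# sent to the model, which returns a list of standalone search intents.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",  # assumed API version
)

intent_instructions = (
    "Rewrite the user's latest question as a JSON array of standalone search intents, "
    "using the conversation history to resolve references."
)

history = [
    {"role": "user", "content": "Tell me about the Northwind Standard plan."},
    {"role": "assistant", "content": "Northwind Standard is one of the Contoso Electronics health plans..."},
]

response = client.chat.completions.create(
    model=os.environ["AZURE_OPENAI_DEPLOYMENT_ID"],  # chat model deployment name (assumed)
    messages=[{"role": "system", "content": intent_instructions}]
    + history
    + [{"role": "user", "content": "How does it compare to the Plus plan?"}],
)

print(response.choices[0].message.content)
# e.g. ["Northwind Standard vs Northwind Health Plus comparison"]
```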

articles/ai-services/openai/includes/connect-your-data-studio.md

Lines changed: 2 additions & 2 deletions
@@ -18,7 +18,7 @@ Navigate to [Azure OpenAI Studio](https://oai.azure.com/) and sign-in with crede
:::image type="content" source="../media/use-your-data/bring-your-data-card.png" alt-text="A screenshot of the Azure OpenAI Studio landing page." lightbox="../media/use-your-data/bring-your-data-card.png":::

- 1. In the pane that appears, select **Upload files (preview)** under **Select data source**. Azure OpenAI needs both a storage resource and a search resource to access and index your data.
+ 1. In the pane that appears, select **Upload files (preview)** under **Select data source**. Azure OpenAI needs both a storage resource and a search resource to access and index your data.

> [!TIP]
> * See the following resource for more information:

@@ -33,7 +33,7 @@ Navigate to [Azure OpenAI Studio](https://oai.azure.com/) and sign-in with crede
:::image type="content" source="../media/quickstarts/add-your-data-source.png" alt-text="A screenshot showing options for selecting a data source in Azure OpenAI Studio." lightbox="../media/quickstarts/add-your-data-source.png":::

- 1. On the **Upload files** pane, select **Browse for a file** and select the files you want to upload. Then select **Upload files**. Then select **Next**.
+ 1. On the **Upload files** pane, select **Browse for a file** and select the files you downloaded from the [prerequisites](#prerequisites) section, or your own data. Then select **Upload files**. Then select **Next**.

1. On the **Data management** pane, you can choose whether to enable [semantic search or vector search](../concepts/use-your-data.md#search-types) for your index.

articles/ai-services/openai/includes/use-your-data-dotnet.md

Lines changed: 4 additions & 4 deletions
@@ -36,7 +36,7 @@ var chatCompletionsOptions = new ChatCompletionsOptions()
{
    Messages =
    {
-       new ChatRequestUserMessage("What are the differences between Azure Machine Learning and Azure AI services?"),
+       new ChatRequestUserMessage("What are my available health plans?"),
    },
    AzureExtensionsOptions = new AzureChatExtensionsOptions()
    {

@@ -94,7 +94,7 @@ dotnet run program.cs
```output
Answer from assistant:
===
- Azure Machine Learning is a cloud-based service that provides tools and services to build, train, and deploy machine learning models. It offers a collaborative environment for data scientists, developers, and domain experts to work together on machine learning projects. Azure Machine Learning supports various programming languages, frameworks, and libraries, including Python, R, TensorFlow, and PyTorch [^1^].
+ The available health plans in the Contoso Electronics plan and benefit packages are the Northwind Health Plus and Northwind Standard plans [^1^].
===
Context information (e.g. citations) from chat extensions:
===

@@ -113,7 +113,7 @@ tool: {
},
...
],
- "intent": "[\u0022What are the differences between Azure Machine Learning and Azure AI services?\u0022]"
+ "intent": "[\u0022What are my available health plans?\u0022]"
}
===
```

@@ -143,7 +143,7 @@ var chatCompletionsOptions = new ChatCompletionsOptions()
    DeploymentName = deploymentName,
    Messages =
    {
-       new ChatRequestUserMessage("What are the differences between Azure Machine Learning and Azure AI services?"),
+       new ChatRequestUserMessage("What are my available health plans?"),
    },
    AzureExtensionsOptions = new AzureChatExtensionsOptions()
    {

articles/ai-services/openai/includes/use-your-data-go.md

Lines changed: 1 addition & 1 deletion
@@ -76,7 +76,7 @@ ms.date: 03/07/2024
resp, err := client.GetChatCompletions(context.TODO(), azopenai.ChatCompletionsOptions{
    Messages: []azopenai.ChatRequestMessageClassification{
-       &azopenai.ChatRequestUserMessage{Content: azopenai.NewChatRequestUserMessageContent("What are the differences between Azure Machine Learning and Azure AI services?")},
+       &azopenai.ChatRequestUserMessage{Content: azopenai.NewChatRequestUserMessageContent("What are my available health plans?")},
    },
    MaxTokens: to.Ptr[int32](512),
    AzureExtensionsOptions: []azopenai.AzureChatExtensionConfigurationClassification{

articles/ai-services/openai/includes/use-your-data-javascript.md

Lines changed: 3 additions & 3 deletions
@@ -51,7 +51,7 @@ async function main(){
const client = new OpenAIClient(endpoint, new AzureKeyCredential(azureApiKey));

const messages = [
-   { role: "user", content: "Tell me something interesting" },
+   { role: "user", content: "What are my available health plans?" },
];

console.log(`Message: ${messages.map((m) => m.content).join("\n")}`);
@@ -102,8 +102,8 @@ node.exe ChatWithOwnData.js
## Output

```output
- Message: Tell me something interesting
- Based on the retrieved document, an interesting fact is...
+ Message: What are my available health plans?
+ The available health plans in the Contoso Electronics plan and benefit packages are the Northwind Health Plus and Northwind Standard plans.

```

articles/ai-services/openai/includes/use-your-data-powershell.md

Lines changed: 2 additions & 2 deletions
@@ -49,7 +49,7 @@ To trigger a response from the model, you should end with a user message indicat
    messages = @(
        @{
            role = 'user'
-           content = 'How do you query REST using PowerShell'
+           content = 'What are my available health plans?'
        }
    )
} | convertto-json -depth 5

@@ -69,7 +69,7 @@ To trigger a response from the model, you should end with a user message indicat
### Example output

```text
- To query a RESTful web service using PowerShell, you can use the `Invoke-RestMethod` cmdlet. This cmdlet sends HTTP and HTTPS requests to RESTful web services and processes the response based on the data type.
+ The available health plans in the Contoso Electronics plan and benefit packages are the Northwind Health Plus and Northwind Standard plans.
```

> [!IMPORTANT]

articles/ai-services/openai/includes/use-your-data-python.md

Lines changed: 2 additions & 2 deletions
@@ -86,7 +86,7 @@ pip install python-dotenv
setup_byod(aoai_deployment_id)

completion = openai.ChatCompletion.create(
-     messages=[{"role": "user", "content": "What are the differences between Azure Machine Learning and Azure AI services?"}],
+     messages=[{"role": "user", "content": "What are my available health plans?"}],
    deployment_id=os.environ.get("AZURE_OPENAI_DEPLOYMENT_ID"),
    dataSources=[ # camelCase is intentional, as this is the format the API expects
        {

@@ -126,7 +126,7 @@ completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
-             "content": "How is Azure machine learning different than Azure OpenAI?",
+             "content": "What are my available health plans?",
        },
    ],
    extra_body={

articles/ai-services/openai/includes/use-your-data-rest.md

Lines changed: 3 additions & 3 deletions
@@ -39,7 +39,7 @@ curl -i -X POST $AZURE_OPENAI_ENDPOINT/openai/deployments/$AZURE_OPENAI_DEPLOYME
  "messages": [
    {
      "role": "user",
-       "content": "Tell me an interesting fact"
+       "content": "What are my available health plans?"
    }
  ]
}

@@ -60,7 +60,7 @@ curl -i -X POST $AZURE_OPENAI_ENDPOINT/openai/deployments/$AZURE_OPENAI_DEPLOYME
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
-         "content": "An interesting fact from the retrieved document is... [doc1].",
+         "content": "The available health plans in the Contoso Electronics plan and benefit packages are the Northwind Health Plus and Northwind Standard plans. [doc1].",
        "end_turn": true,
        "context": {
          "citations": [

@@ -72,7 +72,7 @@ curl -i -X POST $AZURE_OPENAI_ENDPOINT/openai/deployments/$AZURE_OPENAI_DEPLOYME
            "chunk_id": "0"
          }
        ],
-       "intent": "[\"Interesting facts\"]"
+       "intent": "[\"Available health plans\"]"
      }
    }
  }
}
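For anyone verifying the REST change above: the answer, citations, and reformulated search intent can be read straight out of the response payload shown in this hunk. A small Python sketch follows, assuming the JSON has been saved to a local `response.json` file (the file name is an assumption for illustration):

```python
# Sketch: read the answer, citations, and search intent out of a response payload
# shaped like the example above (assumes it was saved to response.json).
import json

with open("response.json") as f:
    payload = json.load(f)

message = payload["choices"][0]["message"]
print("Answer:", message["content"])

context = message.get("context", {})
for i, citation in enumerate(context.get("citations", []), start=1):
    # Each citation describes a retrieved chunk; chunk_id appears in the example above.
    print(f"[doc{i}] chunk_id={citation.get('chunk_id')}")

print("Search intent:", context.get("intent"))
```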

articles/ai-services/openai/includes/use-your-data-spring.md

Lines changed: 2 additions & 6 deletions
@@ -211,7 +211,7 @@ ai-custom-data-demo/
final var store = vectorStore(embeddingClient);

- final String question = "What are the differences between Azure Machine Learning and Azure AI services?";
+ final String question = "What are my available health plans?";

final var candidateDocs = store.similaritySearch(question);

@@ -386,11 +386,7 @@ ai-custom-data-demo/
Sending custom data prompt to AI service. One moment please...

Prompt created 1 generated response(s).
- Generated response from "assistant": Azure Machine Learning is a cloud-based service that allows users to build, deploy, and manage machine learning models. It provides a range of tools and capabilities for data scientists and developers to train models, automate the machine learning workflow, and deploy models as web services.
-
- On the other hand, Azure AI services is a broader category that includes various services and technologies for artificial intelligence. It encompasses not only machine learning but also other AI capabilities such as natural language processing, computer vision, speech recognition, and more. Azure AI services provide pre-built AI models and APIs that developers can easily integrate into their applications.
-
- In summary, Azure Machine Learning is specifically focused on machine learning model development and deployment, while Azure AI services offer a wider range of AI capabilities beyond just machine learning.
+ Generated response from "assistant": The available health plans in the Contoso Electronics plan and benefit packages are the Northwind Health Plus and Northwind Standard plans.
```

> [!div class="nextstepaction"]

articles/ai-services/openai/includes/use-your-data-studio.md

Lines changed: 19 additions & 1 deletion
@@ -18,7 +18,25 @@ Start exploring Azure OpenAI capabilities with a no-code approach through the ch
:::image type="content" source="../media/quickstarts/chat-playground.png" alt-text="Screenshot of the playground page of the Azure OpenAI Studio with sections highlighted." lightbox="../media/quickstarts/chat-playground.png":::

- You can experiment with the configuration settings such as temperature and pre-response text to improve the performance of your task. You can read more about each parameter in the [REST API](../reference.md).
+ The playground gives you options to tailor your chat experience. On the right, you can select **Deployment** to determine which model generates a response using the search results from your index. You can also choose the number of past messages to include as conversation history for future generated responses. [Conversation history](../concepts/use-your-data.md#conversation-history-for-better-results) gives context to generate related responses, but also consumes [token usage](../concepts/use-your-data.md#token-usage-estimation-for-azure-openai-on-your-data). The input token progress indicator keeps track of the token count of the question you submit.
+
+ The **Advanced settings** on the left are [runtime parameters](../concepts/use-your-data.md#runtime-parameters), which give you control over retrieving and searching relevant information from your data. A good use case is when you want to make sure responses are generated based only on your data, or when you find that the model can't generate a response from the information that exists in your data.
+
+ - **Strictness** determines the system's aggressiveness in filtering search documents based on their similarity scores. Setting strictness to 5 indicates that the system will aggressively filter out documents, applying a very high similarity threshold. [Semantic search](../concepts/use-your-data.md#search-types) can be helpful in this scenario because the ranking models do a better job of inferring the intent of the query. Lower levels of strictness produce more verbose answers, but might also include information that isn't in your index. This is set to 3 by default.
+
+ - **Retrieved documents** is an integer that can be set to 3, 5, 10, or 20, and controls the number of document chunks provided to the large language model for formulating the final response. By default, this is set to 5.
+
+ - When **Limit responses to your data** is enabled, the model attempts to only rely on your documents for responses. This is set to true by default.
+
+ :::image type="content" source="../media/quickstarts/studio-advanced-settings.png" alt-text="Screenshot of the advanced settings.":::
+
+ Send your first query. The chat models perform best in question and answer exercises. For example, "*What are my available health plans?*" or "*What is the health plus option?*".
+
+ Queries that require data analysis would probably fail, such as "*Which health plan is most popular?*". Queries that require information about all of your data will also likely fail, such as "*How many documents have I uploaded?*". Remember that the search engine looks for chunks that have exact or similar terms, phrases, or construction to the query. And while the model might understand the question, if the search results are chunks from the data set, they're not the right information to answer that kind of question.
+
+ Chats are constrained by the number of documents (chunks) returned in the response (limited to 3-20 in the Azure OpenAI Studio playground). As you can imagine, posing a question about "all of the titles" requires a full scan of the entire vector store.

[!INCLUDE [deploy-web-app](deploy-web-app.md)]
