Commit 43fac0c

Merge pull request #266773 from aahill/rag-article
merging RAG article
2 parents 9f716ad + 5822f58 commit 43fac0c

12 files changed: +44 -28 lines

articles/ai-services/openai/concepts/use-your-data.md

Lines changed: 4 additions & 4 deletions
@@ -271,9 +271,9 @@ You can modify the following additional settings in the **Data parameters** sect
|Parameter name | Description |
|---------|---------|
- | **Limit responses to your data** | This flag configures the chatbot's approach to handling queries unrelated to the data source or when search documents are insufficient for a complete answer. When this setting is disabled, the model supplements its responses with its own knowledge in addition to your documents. When this setting is enabled, the model attempts to only rely on your documents for responses. This is the `inScope` parameter in the API. |
- |**Retrieved documents** | This parameter is an integer that can be set to 3, 5, 10, or 20, and controls the number of document chunks provided to the large language model for formulating the final response. By default, this is set to 5. The search process can be noisy and sometimes, due to chunking, relevant information might be spread across multiple chunks in the search index. Selecting a top-K number, like 5, ensures that the model can extract relevant information, despite the inherent limitations of search and chunking. However, increasing the number too high can potentially distract the model. Additionally, the maximum number of documents that can be effectively used depends on the version of the model, as each has a different context size and capacity for handling documents. If you find that responses are missing important context, try increasing this parameter. This is the `topNDocuments` parameter in the API. |
- | **Strictness** | Determines the system's aggressiveness in filtering search documents based on their similarity scores. The system queries Azure Search or other document stores, then decides which documents to provide to large language models like ChatGPT. Filtering out irrelevant documents can significantly enhance the performance of the end-to-end chatbot. Some documents are excluded from the top-K results if they have low similarity scores before forwarding them to the model. This is controlled by an integer value ranging from 1 to 5. Setting this value to 1 means that the system will minimally filter documents based on search similarity to the user query. Conversely, a setting of 5 indicates that the system will aggressively filter out documents, applying a very high similarity threshold. If you find that the chatbot omits relevant information, lower the filter's strictness (set the value closer to 1) to include more documents. Conversely, if irrelevant documents distract the responses, increase the threshold (set the value closer to 5). This is the `strictness` parameter in the API. |
+ | **Limit responses to your data** | This flag configures the chatbot's approach to handling queries unrelated to the data source or when search documents are insufficient for a complete answer. When this setting is disabled, the model supplements its responses with its own knowledge in addition to your documents. When this setting is enabled, the model attempts to only rely on your documents for responses. This is the `inScope` parameter in the API, and is set to true by default. |
+ |**Retrieved documents** | This parameter is an integer that can be set to 3, 5, 10, or 20, and controls the number of document chunks provided to the large language model for formulating the final response. By default, this is set to 5. The search process can be noisy and sometimes, due to chunking, relevant information might be spread across multiple chunks in the search index. Selecting a top-K number, like 5, ensures that the model can extract relevant information, despite the inherent limitations of search and chunking. However, increasing the number too high can potentially distract the model. Additionally, the maximum number of documents that can be effectively used depends on the version of the model, as each has a different context size and capacity for handling documents. If you find that responses are missing important context, try increasing this parameter. This is the `topNDocuments` parameter in the API, and is 5 by default. |
+ | **Strictness** | Determines the system's aggressiveness in filtering search documents based on their similarity scores. The system queries Azure Search or other document stores, then decides which documents to provide to large language models like ChatGPT. Filtering out irrelevant documents can significantly enhance the performance of the end-to-end chatbot. Some documents are excluded from the top-K results if they have low similarity scores before forwarding them to the model. This is controlled by an integer value ranging from 1 to 5. Setting this value to 1 means that the system will minimally filter documents based on search similarity to the user query. Conversely, a setting of 5 indicates that the system will aggressively filter out documents, applying a very high similarity threshold. If you find that the chatbot omits relevant information, lower the filter's strictness (set the value closer to 1) to include more documents. Conversely, if irrelevant documents distract the responses, increase the threshold (set the value closer to 5). This is the `strictness` parameter in the API, and is set to 3 by default. |
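These three settings surface in the request body as fields of the data source's `parameters` object. The following is a minimal Python sketch of where they sit, assuming the camelCase `dataSources` extensions request format shown in the Python and REST changes later in this commit; the endpoint path, API version, and search-resource environment variable names are illustrative assumptions, not fixed requirements.

```python
# Hedged sketch: shows where inScope, topNDocuments, and strictness sit in an
# On Your Data request body. The extensions path, api-version, and the search
# environment variable names are assumptions for illustration only.
import os
import requests

endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]        # e.g. https://<resource>.openai.azure.com
deployment = os.environ["AZURE_OPENAI_DEPLOYMENT_ID"]
url = (
    f"{endpoint}/openai/deployments/{deployment}/extensions/chat/completions"
    "?api-version=2023-08-01-preview"                 # assumed preview version; adjust as needed
)

body = {
    "messages": [{"role": "user", "content": "What are my available health plans?"}],
    "dataSources": [
        {
            "type": "AzureCognitiveSearch",
            "parameters": {
                "endpoint": os.environ["AZURE_AI_SEARCH_ENDPOINT"],   # assumed variable name
                "key": os.environ["AZURE_AI_SEARCH_API_KEY"],         # assumed variable name
                "indexName": os.environ["AZURE_AI_SEARCH_INDEX"],     # assumed variable name
                "inScope": True,        # Limit responses to your data (default: true)
                "topNDocuments": 5,     # Retrieved documents (default: 5)
                "strictness": 3,        # Strictness (default: 3)
            },
        }
    ],
}

resp = requests.post(url, json=body, headers={"api-key": os.environ["AZURE_OPENAI_API_KEY"]})
print(resp.json()["choices"][0]["message"]["content"])
```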

### System message

@@ -422,7 +422,7 @@ When you chat with a model, providing a history of the chat will help the model
Azure OpenAI On Your Data is a Retrieval Augmented Generation (RAG) service that leverages both a search service (such as Azure AI Search) and generation (Azure OpenAI models) to let users get answers to their questions based on provided data.

- As part of this RAG pipeline, there are are three steps at a high-level:
+ As part of this RAG pipeline, there are three steps at a high level:

1. Reformulate the user query into a list of search intents. This is done by making a call to the model with a prompt that includes instructions, the user question, and conversation history. Let's call this an *intent prompt*.
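Only the first of the three steps appears in this hunk. As a rough, illustrative sketch of what such an intent-reformulation call could look like (the service's actual intent prompt is internal; the prompt wording, deployment name, and output format below are assumptions):

```python
# Illustrative only: instructions + conversation history + the latest question are
# sent to the model, which returns a list of standalone search intents.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",  # assumed API version
)

intent_instructions = (
    "Rewrite the user's latest question as a JSON array of standalone search intents, "
    "using the conversation history to resolve references."
)

history = [
    {"role": "user", "content": "Tell me about the Northwind Standard plan."},
    {"role": "assistant", "content": "Northwind Standard is one of the Contoso Electronics health plans..."},
]

response = client.chat.completions.create(
    model=os.environ["AZURE_OPENAI_DEPLOYMENT_ID"],  # chat model deployment name (assumed)
    messages=[{"role": "system", "content": intent_instructions}]
    + history
    + [{"role": "user", "content": "How does it compare to the Plus plan?"}],
)

print(response.choices[0].message.content)
# e.g. ["Northwind Standard vs Northwind Health Plus comparison"]
```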

articles/ai-services/openai/includes/connect-your-data-studio.md

Lines changed: 2 additions & 2 deletions
@@ -18,7 +18,7 @@ Navigate to [Azure OpenAI Studio](https://oai.azure.com/) and sign-in with crede
:::image type="content" source="../media/use-your-data/bring-your-data-card.png" alt-text="A screenshot of the Azure OpenAI Studio landing page." lightbox="../media/use-your-data/bring-your-data-card.png":::

- 1. In the pane that appears, select **Upload files (preview)** under **Select data source**. Azure OpenAI needs both a storage resource and a search resource to access and index your data.
+ 1. In the pane that appears, select **Upload files (preview)** under **Select data source**. Azure OpenAI needs both a storage resource and a search resource to access and index your data.

> [!TIP]
> * See the following resource for more information:

@@ -33,7 +33,7 @@ Navigate to [Azure OpenAI Studio](https://oai.azure.com/) and sign-in with crede
:::image type="content" source="../media/quickstarts/add-your-data-source.png" alt-text="A screenshot showing options for selecting a data source in Azure OpenAI Studio." lightbox="../media/quickstarts/add-your-data-source.png":::

- 1. On the **Upload files** pane, select **Browse for a file** and select the files you want to upload. Then select **Upload files**. Then select **Next**.
+ 1. On the **Upload files** pane, select **Browse for a file** and select the files you downloaded from the [prerequisites](#prerequisites) section, or your own data. Then select **Upload files**. Then select **Next**.

1. On the **Data management** pane, you can choose whether to enable [semantic search or vector search](../concepts/use-your-data.md#search-types) for your index.

articles/ai-services/openai/includes/use-your-data-dotnet.md

Lines changed: 4 additions & 4 deletions
@@ -36,7 +36,7 @@ var chatCompletionsOptions = new ChatCompletionsOptions()
{
    Messages =
    {
-       new ChatRequestUserMessage("What are the differences between Azure Machine Learning and Azure AI services?"),
+       new ChatRequestUserMessage("What are my available health plans?"),
    },
    AzureExtensionsOptions = new AzureChatExtensionsOptions()
    {

@@ -94,7 +94,7 @@ dotnet run program.cs
```output
Answer from assistant:
===
- Azure Machine Learning is a cloud-based service that provides tools and services to build, train, and deploy machine learning models. It offers a collaborative environment for data scientists, developers, and domain experts to work together on machine learning projects. Azure Machine Learning supports various programming languages, frameworks, and libraries, including Python, R, TensorFlow, and PyTorch [^1^].
+ The available health plans in the Contoso Electronics plan and benefit packages are the Northwind Health Plus and Northwind Standard plans [^1^].
===
Context information (e.g. citations) from chat extensions:
===

@@ -113,7 +113,7 @@ tool: {
},
...
],
- "intent": "[\u0022What are the differences between Azure Machine Learning and Azure AI services?\u0022]"
+ "intent": "[\u0022What are my available health plans?\u0022]"
}
===
```

@@ -143,7 +143,7 @@ var chatCompletionsOptions = new ChatCompletionsOptions()
    DeploymentName = deploymentName,
    Messages =
    {
-       new ChatRequestUserMessage("What are the differences between Azure Machine Learning and Azure AI services?"),
+       new ChatRequestUserMessage("What are my available health plans?"),
    },
    AzureExtensionsOptions = new AzureChatExtensionsOptions()
    {

articles/ai-services/openai/includes/use-your-data-go.md

Lines changed: 1 addition & 1 deletion
@@ -76,7 +76,7 @@ ms.date: 03/07/2024
resp, err := client.GetChatCompletions(context.TODO(), azopenai.ChatCompletionsOptions{
    Messages: []azopenai.ChatRequestMessageClassification{
-       &azopenai.ChatRequestUserMessage{Content: azopenai.NewChatRequestUserMessageContent("What are the differences between Azure Machine Learning and Azure AI services?")},
+       &azopenai.ChatRequestUserMessage{Content: azopenai.NewChatRequestUserMessageContent("What are my available health plans?")},
    },
    MaxTokens: to.Ptr[int32](512),
    AzureExtensionsOptions: []azopenai.AzureChatExtensionConfigurationClassification{

articles/ai-services/openai/includes/use-your-data-javascript.md

Lines changed: 3 additions & 3 deletions
@@ -51,7 +51,7 @@ async function main(){
const client = new OpenAIClient(endpoint, new AzureKeyCredential(azureApiKey));

const messages = [
-   { role: "user", content: "Tell me something interesting" },
+   { role: "user", content: "What are my available health plans?" },
];

console.log(`Message: ${messages.map((m) => m.content).join("\n")}`);
@@ -102,8 +102,8 @@ node.exe ChatWithOwnData.js
## Output

```output
- Message: Tell me something interesting
- Based on the retrieved document, an interesting fact is...
+ Message: What are my available health plans?
+ The available health plans in the Contoso Electronics plan and benefit packages are the Northwind Health Plus and Northwind Standard plans.

```

articles/ai-services/openai/includes/use-your-data-powershell.md

Lines changed: 2 additions & 2 deletions
@@ -49,7 +49,7 @@ To trigger a response from the model, you should end with a user message indicat
    messages = @(
        @{
            role = 'user'
-           content = 'How do you query REST using PowerShell'
+           content = 'What are my available health plans?'
        }
    )
} | convertto-json -depth 5

@@ -69,7 +69,7 @@ To trigger a response from the model, you should end with a user message indicat
### Example output

```text
- To query a RESTful web service using PowerShell, you can use the `Invoke-RestMethod` cmdlet. This cmdlet sends HTTP and HTTPS requests to RESTful web services and processes the response based on the data type.
+ The available health plans in the Contoso Electronics plan and benefit packages are the Northwind Health Plus and Northwind Standard plans.
```

> [!IMPORTANT]

articles/ai-services/openai/includes/use-your-data-python.md

Lines changed: 2 additions & 2 deletions
@@ -86,7 +86,7 @@ pip install python-dotenv
setup_byod(aoai_deployment_id)

completion = openai.ChatCompletion.create(
-     messages=[{"role": "user", "content": "What are the differences between Azure Machine Learning and Azure AI services?"}],
+     messages=[{"role": "user", "content": "What are my available health plans?"}],
    deployment_id=os.environ.get("AZURE_OPENAI_DEPLOYMENT_ID"),
    dataSources=[ # camelCase is intentional, as this is the format the API expects
        {

@@ -126,7 +126,7 @@ completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
-             "content": "How is Azure machine learning different than Azure OpenAI?",
+             "content": "What are my available health plans?",
        },
    ],
    extra_body={

articles/ai-services/openai/includes/use-your-data-rest.md

Lines changed: 3 additions & 3 deletions
@@ -39,7 +39,7 @@ curl -i -X POST $AZURE_OPENAI_ENDPOINT/openai/deployments/$AZURE_OPENAI_DEPLOYME
  "messages": [
    {
      "role": "user",
-       "content": "Tell me an interesting fact"
+       "content": "What are my available health plans?"
    }
  ]
}

@@ -60,7 +60,7 @@ curl -i -X POST $AZURE_OPENAI_ENDPOINT/openai/deployments/$AZURE_OPENAI_DEPLOYME
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
-         "content": "An interesting fact from the retrieved document is... [doc1].",
+         "content": "The available health plans in the Contoso Electronics plan and benefit packages are the Northwind Health Plus and Northwind Standard plans. [doc1].",
        "end_turn": true,
        "context": {
          "citations": [

@@ -72,7 +72,7 @@ curl -i -X POST $AZURE_OPENAI_ENDPOINT/openai/deployments/$AZURE_OPENAI_DEPLOYME
            "chunk_id": "0"
          }
        ],
-       "intent": "[\"Interesting facts\"]"
+       "intent": "[\"Available health plans\"]"
      }
    }
  }
}
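For anyone verifying the REST change above: the answer, citations, and reformulated search intent can be read straight out of the response payload shown in this hunk. A small Python sketch follows, assuming the JSON has been saved to a local `response.json` file (the file name is an assumption for illustration):

```python
# Sketch: read the answer, citations, and search intent out of a response payload
# shaped like the example above (assumes it was saved to response.json).
import json

with open("response.json") as f:
    payload = json.load(f)

message = payload["choices"][0]["message"]
print("Answer:", message["content"])

context = message.get("context", {})
for i, citation in enumerate(context.get("citations", []), start=1):
    # Each citation describes a retrieved chunk; chunk_id appears in the example above.
    print(f"[doc{i}] chunk_id={citation.get('chunk_id')}")

print("Search intent:", context.get("intent"))
```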

articles/ai-services/openai/includes/use-your-data-spring.md

Lines changed: 2 additions & 6 deletions
@@ -211,7 +211,7 @@ ai-custom-data-demo/
final var store = vectorStore(embeddingClient);

- final String question = "What are the differences between Azure Machine Learning and Azure AI services?";
+ final String question = "What are my available health plans?";

final var candidateDocs = store.similaritySearch(question);

@@ -386,11 +386,7 @@ ai-custom-data-demo/
Sending custom data prompt to AI service. One moment please...

Prompt created 1 generated response(s).
- Generated response from "assistant": Azure Machine Learning is a cloud-based service that allows users to build, deploy, and manage machine learning models. It provides a range of tools and capabilities for data scientists and developers to train models, automate the machine learning workflow, and deploy models as web services.
-
- On the other hand, Azure AI services is a broader category that includes various services and technologies for artificial intelligence. It encompasses not only machine learning but also other AI capabilities such as natural language processing, computer vision, speech recognition, and more. Azure AI services provide pre-built AI models and APIs that developers can easily integrate into their applications.
-
- In summary, Azure Machine Learning is specifically focused on machine learning model development and deployment, while Azure AI services offer a wider range of AI capabilities beyond just machine learning.
+ Generated response from "assistant": The available health plans in the Contoso Electronics plan and benefit packages are the Northwind Health Plus and Northwind Standard plans.
```

> [!div class="nextstepaction"]

articles/ai-services/openai/includes/use-your-data-studio.md

Lines changed: 19 additions & 1 deletion
@@ -18,7 +18,25 @@ Start exploring Azure OpenAI capabilities with a no-code approach through the ch
:::image type="content" source="../media/quickstarts/chat-playground.png" alt-text="Screenshot of the playground page of the Azure OpenAI Studio with sections highlighted." lightbox="../media/quickstarts/chat-playground.png":::

- You can experiment with the configuration settings such as temperature and pre-response text to improve the performance of your task. You can read more about each parameter in the [REST API](../reference.md).
+ The playground gives you options to tailor your chat experience. On the right, you can select **Deployment** to determine which model generates a response using the search results from your index. You can also choose the number of past messages to include as conversation history for future generated responses. [Conversation history](../concepts/use-your-data.md#conversation-history-for-better-results) gives context to generate related responses, but also consumes [token usage](../concepts/use-your-data.md#token-usage-estimation-for-azure-openai-on-your-data). The input token progress indicator keeps track of the token count of the question you submit.
+
+ The **Advanced settings** on the left are [runtime parameters](../concepts/use-your-data.md#runtime-parameters), which give you control over retrieving and searching relevant information from your data. A good use case is when you want to make sure responses are generated based only on your data, or when you find that the model can't generate a response from the information that exists in your data.
+
+ - **Strictness** determines the system's aggressiveness in filtering search documents based on their similarity scores. Setting strictness to 5 indicates that the system will aggressively filter out documents, applying a very high similarity threshold. [Semantic search](../concepts/use-your-data.md#search-types) can be helpful in this scenario because the ranking models do a better job of inferring the intent of the query. Lower levels of strictness produce more verbose answers, but might also include information that isn't in your index. This is set to 3 by default.
+
+ - **Retrieved documents** is an integer that can be set to 3, 5, 10, or 20, and controls the number of document chunks provided to the large language model for formulating the final response. By default, this is set to 5.
+
+ - When **Limit responses to your data** is enabled, the model attempts to only rely on your documents for responses. This is set to true by default.
+
+ :::image type="content" source="../media/quickstarts/studio-advanced-settings.png" alt-text="Screenshot of the advanced settings.":::
+
+ Send your first query. The chat models perform best in question and answer exercises. For example, "*What are my available health plans?*" or "*What is the health plus option?*".
+
+ Queries that require data analysis would probably fail, such as "*Which health plan is most popular?*". Queries that require information about all of your data will also likely fail, such as "*How many documents have I uploaded?*". Remember that the search engine looks for chunks that have exact or similar terms, phrases, or construction to the query. And while the model might understand the question, if the search results are chunks from the data set, they're not the right information to answer that kind of question.
+
+ Chats are constrained by the number of documents (chunks) returned in the response (limited to 3-20 in the Azure OpenAI Studio playground). As you can imagine, posing a question about "all of the titles" requires a full scan of the entire vector store.

[!INCLUDE [deploy-web-app](deploy-web-app.md)]
