Merge branch 'main' into release-preview-new-mistral-models

v-alje · v-alje · commit 5ccc3c2abd36 · 2024-12-12T23:15:39.000Z
diff --git a/articles/ai-services/openai/how-to/evaluations.md b/articles/ai-services/openai/how-to/evaluations.md
@@ -22,6 +22,7 @@ Azure OpenAI evaluation enables developers to create evaluation runs to test aga
 
 ### Regional availability
 
+- East US2
 - North Central US
 - Sweden Central
 - Switzerland West
diff --git a/articles/ai-services/openai/how-to/fine-tuning.md b/articles/ai-services/openai/how-to/fine-tuning.md
@@ -26,9 +26,6 @@ In contrast to few-shot learning, fine tuning improves the model by training on
 
 We use LoRA, or low rank approximation, to fine-tune models in a way that reduces their complexity without significantly affecting their performance. This method works by approximating the original high-rank matrix with a lower rank one, thus only fine-tuning a smaller subset of *important* parameters during the supervised training phase, making the model more manageable and efficient. For users, this makes training faster and more affordable than other techniques.
 
-> [!NOTE]
-> Azure OpenAI currently only supports text-to-text fine-tuning for all supported models including GPT-4o mini.
-
 ::: zone pivot="programming-language-studio"
 
 [!INCLUDE [Azure OpenAI Studio fine-tuning](../includes/fine-tuning-unified.md)]
diff --git a/articles/ai-services/openai/how-to/realtime-audio.md b/articles/ai-services/openai/how-to/realtime-audio.md
@@ -161,4 +161,5 @@ An example `session.update` that configures several aspects of the session, incl
 ## Related content
 
 * Try the [real-time audio quickstart](../realtime-audio-quickstart.md)
+* See the [Realtime API reference](../realtime-audio-reference.md)
 * Learn more about Azure OpenAI [quotas and limits](../quotas-limits.md)
diff --git a/articles/ai-services/openai/realtime-audio-quickstart.md b/articles/ai-services/openai/realtime-audio-quickstart.md
@@ -128,4 +128,5 @@ You can run the sample code locally on your machine by following these steps. Re
 ## Related content
 
 * Learn more about [How to use the Realtime API](./how-to/realtime-audio.md)
+* See the [Realtime API reference](./realtime-audio-reference.md)
 * Learn more about Azure OpenAI [quotas and limits](quotas-limits.md)
diff --git a/articles/ai-services/openai/realtime-audio-reference.md b/articles/ai-services/openai/realtime-audio-reference.md
diff --git a/articles/ai-services/openai/toc.yml b/articles/ai-services/openai/toc.yml
@@ -348,6 +348,8 @@ items:
               displayName: RAG, rag
     - name: Azure OpenAI monitoring data reference
       href: monitor-openai-reference.md
+    - name: Realtime API (preview) WebSocket reference
+      href: realtime-audio-reference.md
 - name: Resources
   items: 
     - name: Support and help options
diff --git a/articles/ai-studio/concepts/retrieval-augmented-generation.md b/articles/ai-studio/concepts/retrieval-augmented-generation.md
@@ -8,30 +8,30 @@ ms.custom:
   - ignite-2023
   - build-2024
 ms.topic: conceptual
-ms.date: 5/21/2024
+ms.date: 12/12/2024
 ms.reviewer: sgilley
 ms.author: sgilley
 author: sdgilley
 ---
 
 # Retrieval augmented generation and indexes
 
-This article talks about the importance and need for Retrieval Augmented Generation (RAG) and index in generative AI. 
+This article explains why Retrieval Augmented Generation (RAG) and indexes are needed in generative AI.
 
 ## What is RAG?
 
 Some basics first. Large language models (LLMs) like ChatGPT are trained on public internet data that was available at the point in time when they were trained. They can answer questions related to the data they were trained on. This public data might not be sufficient to meet all your needs. You might want questions answered based on your private data. Or, the public data might simply have gotten out of date. The solution to this problem is Retrieval Augmented Generation (RAG), a pattern used in AI that uses an LLM to generate answers with your own data.
 
 ## How does RAG work?
 
-RAG is a pattern that uses your data with an LLM to generate answers specific to your data. When a user asks a question, the data store is searched based on user input. The user question is then combined with the matching results and sent to the LLM using a prompt (explicit instructions to an AI or machine learning model) to generate the desired answer. This can be illustrated as follows.
+RAG is a pattern that uses your data with an LLM to generate answers specific to your data. When a user asks a question, the data store is searched based on user input. The user question is then combined with the matching results and sent to the LLM using a prompt (explicit instructions to an AI or machine learning model) to generate the desired answer. This process can be illustrated as follows.
 
 :::image type="content" source="../media/index-retrieve/rag-pattern.png" alt-text="Screenshot of the RAG pattern." lightbox="../media/index-retrieve/rag-pattern.png":::
 
 
 ## What is an index and why do I need it?
 
-RAG uses your data to generate answers to the user question. For RAG to work well, we need to find a way to search and send your data in an easy and cost efficient manner to the LLMs. This is achieved by using an index. An index is a data store that allows you to search data efficiently. This is very useful in RAG. An index can be optimized for LLMs by creating vectors (text data converted to number sequences using an embedding model). A good index usually has efficient search capabilities like keyword searches, semantic searches, vector searches or a combination of these. This optimized RAG pattern can be illustrated as follows.
+RAG uses your data to generate answers to the user question. For RAG to work well, we need a way to search your data and send it to the LLM easily and cost-efficiently. An index achieves this. An index is a data store that lets you search data efficiently, which is very useful in RAG. An index can be optimized for LLMs by creating vectors (text data converted to number sequences using an embedding model). A good index usually has efficient search capabilities like keyword searches, semantic searches, vector searches, or a combination of these. This optimized RAG pattern can be illustrated as follows.
 
 :::image type="content" source="../media/index-retrieve/rag-pattern-with-index.png" alt-text="Screenshot of the RAG pattern with index." lightbox="../media/index-retrieve/rag-pattern-with-index.png":::
 
diff --git a/articles/ai-studio/index.yml b/articles/ai-studio/index.yml
@@ -14,7 +14,7 @@ metadata:
   ms.reviewer: sgilley
   ms.author: sgilley
   author: sdgilley
-  ms.date: 5/21/2024
+  ms.date: 12/12/2024
 # linkListType: architecture | concept | deploy | download | get-started | how-to-guide | learn | overview | quickstart | reference | tutorial | video | whats-new
 
 landingContent:
@@ -72,7 +72,7 @@ landingContent:
       - linkListType: tutorial
         links:
           - text: Build a custom chat app with the Azure AI SDK
-            url: tutorials/copilot-sdk-build-rag.md
+            url: tutorials/copilot-sdk-create-resources.md
 
       - linkListType: concept
         links:
diff --git a/articles/ai-studio/toc.yml b/articles/ai-studio/toc.yml
@@ -309,8 +309,6 @@ items:
          href: how-to/develop/trace-local-sdk.md
        - name: Visualize your traces
          href: how-to/develop/visualize-traces.md
-       - name: Continuously monitor your applications
-         href: how-to/online-evaluation.md
   - name: Evaluate generative AI apps
     items:
     - name: Evaluations concepts
@@ -341,16 +339,20 @@ items:
       href: concepts/a-b-experimentation.md
   - name: Deploy and monitor generative AI apps
     items:
-    - name: Deploy a flow for real-time inference
-      href: how-to/flow-deploy.md
-      displayName: endpoint
-    - name: Enable tracing and collect feedback for a flow deployment
-      href: how-to/develop/trace-production-sdk.md
-      displayName: code
-    - name: Monitor prompt flow deployments
-      href: how-to/monitor-quality-safety.md
-    - name: Troubleshoot deployments and monitoring
-      href: how-to/troubleshoot-deploy-and-monitor.md
+    - name: Continuously monitor your applications
+      href: how-to/online-evaluation.md
+    - name: Deploy and monitor flows
+      items:
+      - name: Deploy a flow for real-time inference
+        href: how-to/flow-deploy.md
+        displayName: endpoint
+      - name: Enable tracing and collect feedback for a flow deployment
+        href: how-to/develop/trace-production-sdk.md
+        displayName: code
+      - name: Monitor prompt flow deployments
+        href: how-to/monitor-quality-safety.md
+      - name: Troubleshoot deployments and monitoring
+        href: how-to/troubleshoot-deploy-and-monitor.md
   - name: Costs and quotas
     items:
     - name: Plan and manage costs
diff --git a/articles/ai-studio/tutorials/copilot-sdk-build-rag.md b/articles/ai-studio/tutorials/copilot-sdk-build-rag.md
@@ -1,5 +1,5 @@
 ---
-title: "Part 2: Build a ca custom knowledge retrieval (RAG) app with the Azure AI Foundry SDK"
+title: "Part 2: Build a custom knowledge retrieval (RAG) app with the Azure AI Foundry SDK"
 titleSuffix: Azure AI Foundry
 description:  Learn how to build a RAG-based chat app using the Azure AI Foundry SDK. This tutorial is part 2 of a 3-part tutorial series.
 manager: scottpolly
diff --git a/articles/ai-studio/tutorials/copilot-sdk-create-resources.md b/articles/ai-studio/tutorials/copilot-sdk-create-resources.md
@@ -46,7 +46,7 @@ To create a project in [Azure AI Foundry](https://ai.azure.com), follow these st
 1. Go to the **Home** page of [Azure AI Foundry](https://ai.azure.com).
 1. Select **+ Create project**.
 1. Enter a name for the project.  Keep all the other settings as default.
-1. Projects are created in hubs.  For this tutorial, create a new hub. If you see **Create a new hub** select it and specify a name.  Then select **Next**. (If you don't see **Create new hub**, it's because a new one is being created for you.) 
+1. Projects are created in hubs. If you see **Create a new hub**, select it and specify a name. Then select **Next**. (If you don't see **Create new hub**, don't worry; a new one is being created for you.)
 1. Select **Customize** to specify properties of the hub.
 1. Use any values you want, except for **Region**.  We recommend you use either **East US2** or **Sweden Central** for the region for this tutorial series.
 1. Select **Next**.
diff --git a/articles/machine-learning/includes/prereq-workspace.md b/articles/machine-learning/includes/prereq-workspace.md
@@ -5,9 +5,12 @@ author: sdgilley
 ms.service: azure-machine-learning
 services: machine-learning
 ms.topic: include
-ms.date: 03/22/2023
+ms.date: 12/12/2024
 ms.author: sgilley
 ms.custom: include file
 ---
 
 To use Azure Machine Learning, you need a workspace. If you don't have one, complete [Create resources you need to get started](../quickstart-create-resources.md) to create a workspace and learn more about using it.
+
+> [!IMPORTANT]
+> If your Azure Machine Learning workspace is configured with a managed virtual network, you may need to add outbound rules to allow access to the public Python package repositories. For more information, see [Scenario: Access public machine learning packages](/azure/machine-learning/how-to-managed-network#scenario-access-public-machine-learning-packages).