
Commit aced742

Merge pull request #263220 from MicrosoftDocs/main
Publish to live, Monday 4 AM PST, 1/15
2 parents 3d1c9b8 + 2b2b199 commit aced742

29 files changed: +581 −807 lines

articles/ai-services/openai/how-to/integrate-synapseml.md

Lines changed: 76 additions & 7 deletions

@@ -35,9 +35,9 @@ This tutorial shows how to apply large language models at a distributed scale by
 - To install SynapseML for your Apache Spark cluster, see [Install SynapseML](#install-synapseml).
 
 > [!NOTE]
-> This article is designed to work with the [Azure OpenAI Service legacy models](/azure/ai-services/openai/concepts/legacy-models) like `Text-Davinci-003`, which support prompt-based completions. Newer models like the current `GPT-3.5 Turbo` and `GPT-4` model series are designed to work with the new chat completion API that expects a specially formatted array of messages as input.
+> The `OpenAICompletion()` transformer is designed to work with the [Azure OpenAI Service legacy models](/azure/ai-services/openai/concepts/legacy-models) like `Text-Davinci-003`, which support prompt-based completions. Newer models like the current `GPT-3.5 Turbo` and `GPT-4` model series are designed to work with the new chat completion API that expects a specially formatted array of messages as input. If you're working with embeddings or chat completion models, see the [Chat Completion](#chat-completion) and [Generating Text Embeddings](#generating-text-embeddings) sections below.
 >
-> The Azure OpenAI SynapseML integration supports the latest models via the [OpenAIChatCompletion()](https://github.com/microsoft/SynapseML/blob/0836e40efd9c48424e91aa10c8aa3fbf0de39f31/cognitive/src/main/scala/com/microsoft/azure/synapse/ml/cognitive/openai/OpenAIChatCompletion.scala#L24) transformer, which isn't demonstrated in this article. After the [release of the GPT-3.5 Turbo Instruct model](https://techcommunity.microsoft.com/t5/azure-ai-services-blog/announcing-updates-to-azure-openai-service-models/ba-p/3866757), the newer model will be the preferred model to use with this article.
+> The Azure OpenAI SynapseML integration supports the latest models via the [OpenAIChatCompletion()](https://github.com/microsoft/SynapseML/blob/0836e40efd9c48424e91aa10c8aa3fbf0de39f31/cognitive/src/main/scala/com/microsoft/azure/synapse/ml/cognitive/openai/OpenAIChatCompletion.scala#L24) transformer.
 
 We recommend that you [create an Azure Synapse workspace](../../../synapse-analytics/get-started-create-workspace.md). However, you can also use Azure Databricks, Azure HDInsight, Spark on Kubernetes, or the Python environment with the `pyspark` package.

@@ -187,15 +187,87 @@ The following image shows example output with completions in Azure Synapse Analy
 
 Here are some other use cases for working with Azure OpenAI Service and large datasets.
 
-### Improve throughput with request batching
+### Generating Text Embeddings
+
+In addition to completing text, we can also embed text for use in downstream algorithms or vector retrieval architectures. Creating embeddings allows you to search and retrieve documents from large collections and can be used when prompt engineering isn't sufficient for the task. For more information on using [OpenAIEmbedding](https://mmlspark.blob.core.windows.net/docs/0.11.1/pyspark/_modules/synapse/ml/cognitive/openai/OpenAIEmbedding.html), see our [embedding guide](https://microsoft.github.io/SynapseML/docs/Explore%20Algorithms/OpenAI/Quickstart%20-%20OpenAI%20Embedding/).
+
+```python
+from synapse.ml.services.openai import OpenAIEmbedding
+
+embedding = (
+    OpenAIEmbedding()
+    .setSubscriptionKey(key)
+    .setDeploymentName(deployment_name_embeddings)
+    .setCustomServiceName(service_name)
+    .setTextCol("prompt")
+    .setErrorCol("error")
+    .setOutputCol("embeddings")
+)
+
+display(embedding.transform(df))
+```
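Embeddings produced this way are typically consumed by a similarity-search step downstream. The following is a minimal pure-Python sketch of cosine-similarity retrieval over embedding vectors; it is independent of SynapseML, and the toy vectors and helper names (`cosine_similarity`, `top_k`) are illustrative assumptions, not part of the API above.

```python
import math


def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    # Rank documents by similarity to the query embedding and keep the best k.
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]


# Toy 3-dimensional "embeddings" standing in for real OpenAIEmbedding output.
docs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
query = [1.0, 0.05, 0.0]
print(top_k(query, docs))
```

In practice, the vectors would come from the `embeddings` output column above, and the ranking would run at scale (for example, with a vector index) rather than with a linear scan.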
+
+### Chat Completion
+
+Models such as ChatGPT and GPT-4 are capable of understanding chats instead of single prompts. The [OpenAIChatCompletion](https://mmlspark.blob.core.windows.net/docs/0.11.1/pyspark/_modules/synapse/ml/cognitive/openai/OpenAIChatCompletion.html) transformer exposes this functionality at scale.
+
+```python
+from synapse.ml.services.openai import OpenAIChatCompletion
+from pyspark.sql import Row
+from pyspark.sql.types import *
+
+
+def make_message(role, content):
+    return Row(role=role, content=content, name=role)
+
+
+chat_df = spark.createDataFrame(
+    [
+        (
+            [
+                make_message(
+                    "system", "You are an AI chatbot with red as your favorite color"
+                ),
+                make_message("user", "Whats your favorite color"),
+            ],
+        ),
+        (
+            [
+                make_message("system", "You are very excited"),
+                make_message("user", "How are you today"),
+            ],
+        ),
+    ]
+).toDF("messages")
+
+chat_completion = (
+    OpenAIChatCompletion()
+    .setSubscriptionKey(key)
+    .setDeploymentName(deployment_name)
+    .setCustomServiceName(service_name)
+    .setMessagesCol("messages")
+    .setErrorCol("error")
+    .setOutputCol("chat_completions")
+)
+
+display(
+    chat_completion.transform(chat_df).select(
+        "messages", "chat_completions.choices.message.content"
+    )
+)
+```
+
+### Improve throughput with request batching from OpenAICompletion
 
 You can use Azure OpenAI Service with large datasets to improve throughput with request batching. In the previous example, you make several requests to the service, one for each prompt. To complete multiple prompts in a single request, you can use batch mode.
 
-In the `OpenAICompletion` object definition, you specify the `"batchPrompt"` value to configure the dataframe to use a **batchPrompt** column. Create the dataframe with a list of prompts for each row.
+In the [OpenAICompletion](https://mmlspark.blob.core.windows.net/docs/0.11.1/pyspark/_modules/synapse/ml/cognitive/openai/OpenAICompletion.html) object definition, you specify the `"batchPrompt"` value to configure the dataframe to use a **batchPrompt** column. Create the dataframe with a list of prompts for each row.
 
 > [!NOTE]
 > There's currently a limit of 20 prompts in a single request and a limit of 2048 tokens, or approximately 1500 words.
 
+> [!NOTE]
+> Currently, request batching is not supported by the `OpenAIChatCompletion()` transformer.
 
 ```python
 batch_df = spark.createDataFrame(
     [
@@ -227,9 +299,6 @@ completed_batch_df = batch_completion.transform(batch_df).cache()
 display(completed_batch_df)
 ```
 
-> [!NOTE]
-> There's currently a limit of 20 prompts in a single request and a limit of 2048 tokens, or approximately 1500 words.
 
 ### Use an automatic mini-batcher
 
 You can use Azure OpenAI Service with large datasets to transpose the data format. If your data is in column format, you can transpose it to row format by using the SynapseML `FixedMiniBatcherTransformer` object.
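The fixed-size grouping that a mini-batcher performs can be illustrated outside Spark. This is a hedged pure-Python sketch of the concept only; the `fixed_mini_batches` helper is an illustrative stand-in, not the SynapseML `FixedMiniBatcherTransformer` API.

```python
def fixed_mini_batches(rows, batch_size):
    """Group a flat sequence of rows into fixed-size batches, mirroring
    what a mini-batcher does before issuing one batched request."""
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]


prompts = ["Hello my name is", "The best code is code that's", "SynapseML is "]
print(fixed_mini_batches(prompts, 2))
# Note: the final batch may be smaller than batch_size.
```

In the Spark version, each resulting batch becomes one row whose column holds a list of prompts, which `OpenAICompletion` can then complete in a single request.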

articles/ai-services/speech-service/includes/language-support/pronunciation-assessment.md

Lines changed: 1 addition & 0 deletions

@@ -11,6 +11,7 @@ ms.author: eur
 |Arabic (Saudi Arabia)|`ar-SA` |
 |Chinese (Cantonese, Traditional)|`zh-HK`<sup>1</sup>|
 |Chinese (Mandarin, Simplified)|`zh-CN`|
+|Dutch (Netherlands)|`nl-NL`<sup>1</sup>|
 |English (Australia)|`en-AU`|
 |English (Canada)|`en-CA` |
 |English (India)|`en-IN` |

articles/ai-services/speech-service/includes/release-notes/release-notes-stt.md

Lines changed: 1 addition & 25 deletions

@@ -105,31 +105,7 @@ Speech to text supports two new locales as shown in the following table. Refer t
 
 - Speech [Pronunciation Assessment](../../how-to-pronunciation-assessment.md) now supports 3 additional languages generally available in English (Canada), English (India), and French (Canada), with 3 additional languages available in preview. For more information, see the full [language list for Pronunciation Assessment](../../language-support.md?tabs=pronunciation-assessment).
 
-| Language | Locale (BCP-47) |
-|--|--|
-|Arabic (Saudi Arabia)|`ar-SA`<sup>1</sup> |
-|Chinese (Mandarin, Simplified)|`zh-CN`|
-|English (Australia)|`en-AU`|
-|English (Canada)|`en-CA` |
-|English (India)|`en-IN` |
-|English (United Kingdom)|`en-GB`|
-|English (United States)|`en-US`|
-|French (Canada)|`fr-CA`|
-|French (France)|`fr-FR`|
-|German (Germany)|`de-DE`|
-|Italian (Italy)|`it-IT`<sup>1</sup>|
-|Japanese (Japan)|`ja-JP`|
-|Korean (Korea)|`ko-KR`<sup>1</sup>|
-|Malay (Malaysia)|`ms-MY`<sup>1</sup>|
-|Norwegian Bokmål (Norway)|`nb-NO`<sup>1</sup>|
-|Portuguese (Brazil)|`pt-BR`<sup>1</sup>|
-|Russian (Russia)|`ru-RU`<sup>1</sup>|
-|Spanish (Mexico)|`es-MX` |
-|Spanish (Spain)|`es-ES` |
-|Tamil (India)|`ta-IN`<sup>1</sup> |
-|Vietnamese (Vietnam)|`vi-VN`<sup>1</sup> |
-
-<sup>1</sup> The language is in public preview for pronunciation assessment.
 
 ### May 2023 release

articles/ai-services/speech-service/language-support.md

Lines changed: 1 addition & 1 deletion

@@ -111,7 +111,7 @@ With the cross-lingual feature, you can transfer your custom neural voice model
 
 # [Pronunciation assessment](#tab/pronunciation-assessment)
 
-The table in this section summarizes the 24 locales supported for pronunciation assessment, and each language is available on all [Speech to text regions](regions.md#speech-service). Latest update extends support from English to 23 additional languages and quality enhancements to existing features, including accuracy, fluency and miscue assessment. You should specify the language that you're learning or practicing improving pronunciation. The default language is set as `en-US`. If you know your target learning language, [set the locale](how-to-pronunciation-assessment.md#get-pronunciation-assessment-results) accordingly. For example, if you're learning British English, you should specify the language as `en-GB`. If you're teaching a broader language, such as Spanish, and are uncertain about which locale to select, you can run various accent models (`es-ES`, `es-MX`) to determine the one that achieves the highest score to suit your specific scenario.
+The table in this section summarizes the 25 locales supported for pronunciation assessment; each language is available in all [Speech to text regions](regions.md#speech-service). The latest update extends support from English to 24 additional languages and brings quality enhancements to existing features, including accuracy, fluency, and miscue assessment. Specify the language whose pronunciation you're learning or practicing. The default language is `en-US`. If you know your target learning language, [set the locale](how-to-pronunciation-assessment.md#get-pronunciation-assessment-results) accordingly. For example, if you're learning British English, specify the language as `en-GB`. If you're teaching a broader language, such as Spanish, and are uncertain about which locale to select, you can run various accent models (`es-ES`, `es-MX`) to determine which one achieves the highest score for your scenario.
 
 [!INCLUDE [Language support include](includes/language-support/pronunciation-assessment.md)]

articles/backup/backup-support-matrix-iaas.md

Lines changed: 2 additions & 2 deletions

@@ -191,8 +191,8 @@ Adding a disk to a protected VM | Supported.
 Resizing a disk on a protected VM | Supported.
 Shared storage| Backing up VMs by using Cluster Shared Volumes (CSV) or Scale-Out File Server isn't supported. CSV writers are likely to fail during backup. On restore, disks that contain CSV volumes might not come up.
 [Shared disks](../virtual-machines/disks-shared-enable.md) | Not supported.
-<a name="ultra-disk-backup">Ultra disks</a> | Supported with [Enhanced policy](backup-azure-vms-enhanced-policy.md). The support is currently in preview. <br><br> [Supported regions](../virtual-machines/disks-types.md#ultra-disk-limitations). <br><br> To enroll your subscription for this feature, [fill this form](https://forms.office.com/r/1GLRnNCntU). <br><br> - Configuration of Ultra disk protection is supported via Recovery Services vault only. This configuration is currently not supported via virtual machine blade. <br><br> - Cross-region restore is currently not supported for machines using Ultra disks. <br><br> - GRS type vaults cannot be used for enabling backup.
-<a name="premium-ssd-v2-backup">Premium SSD v2</a> | Supported with [Enhanced policy](backup-azure-vms-enhanced-policy.md). The support is currently in preview. <br><br> [Supported regions](../virtual-machines/disks-types.md#regional-availability). <br><br> To enroll your subscription for this feature, [fill this form](https://forms.office.com/r/h56TpTc773). <br><br> - Configuration of Premium v2 disk protection is supported via Recovery Services vault only. This configuration is currently not supported via virtual machine blade. <br><br> - Cross-region restore is currently not supported for machines using Premium v2 disks. <br><br> - GRS type vaults cannot be used for enabling backup.
+<a name="ultra-disk-backup">Ultra disks</a> | Supported with [Enhanced policy](backup-azure-vms-enhanced-policy.md). The support is currently in preview. <br><br> [Supported regions](../virtual-machines/disks-types.md#ultra-disk-limitations). <br><br> - The preview can be tested on any subscription and no enrollment is required. <br><br> - Configuration of Ultra disk protection is supported via Recovery Services vault and via virtual machine blade. <br><br> - Cross-region restore is currently not supported for machines using Ultra disks. <br><br> - GRS type vaults cannot be used for enabling backup. <br><br> - File-level restore is currently not supported for machines using Ultra disks.
+<a name="premium-ssd-v2-backup">Premium SSD v2</a> | Supported with [Enhanced policy](backup-azure-vms-enhanced-policy.md). The support is currently in preview. <br><br> [Supported regions](../virtual-machines/disks-types.md#regional-availability). <br><br> - The preview can be tested on any subscription and no enrollment is required. <br><br> - Configuration of Premium SSD v2 disk protection is supported via Recovery Services vault and via virtual machine blade. <br><br> - Cross-region restore is currently not supported for machines using Premium v2 disks. <br><br> - GRS type vaults cannot be used for enabling backup. <br><br> - File-level restore is currently not supported for machines using Premium SSD v2 disks.
 [Temporary disks](../virtual-machines/managed-disks-overview.md#temporary-disk) | Azure Backup doesn't back up temporary disks.
 NVMe/[ephemeral disks](../virtual-machines/ephemeral-os-disks.md) | Not supported.
 [Resilient File System (ReFS)](/windows-server/storage/refs/refs-overview) restore | Supported. Volume Shadow Copy Service (VSS) supports app-consistent backups on ReFS.

articles/communication-services/concepts/interop/teams-user-calling.md

Lines changed: 1 addition & 1 deletion

@@ -19,7 +19,7 @@ The Azure Communication Services Calling SDK enables Teams user devices to drive
 
 Key features of the Calling SDK:
 
-- **Addressing** - Azure Communication Services is using [Microsoft Entra user identifier](/powershell/module/azuread/get-azureaduser) to address communication endpoints. Clients use Microsoft Entra identities to authenticate to the service and communicate with each other. These identities are used in Calling APIs that provide clients visibility into who is connected to a call (the roster). And are also used in [Microsoft Graph API](/graph/api/user-get).
+- **Addressing** - Azure Communication Services uses the [Microsoft Entra user identifier](/powershell/module/microsoft.graph.users/get-mguser) to address communication endpoints. Clients use Microsoft Entra identities to authenticate to the service and communicate with each other. These identities are used in Calling APIs that provide clients visibility into who is connected to a call (the roster), and are also used in the [Microsoft Graph API](/graph/api/user-get).
 - **Encryption** - The Calling SDK encrypts traffic and prevents tampering on the wire.
 - **Device Management and Media** - The Calling SDK provides facilities for binding to audio and video devices, encodes content for efficient transmission over the communications data plane, and renders content to output devices and views that you specify. APIs are also provided for screen and application sharing.
 - **Notifications** - The Calling SDK provides APIs that allow clients to be notified of an incoming call. In situations where your app is not running in the foreground, patterns are available to [fire pop-up notifications](../notifications.md) ("toasts") to inform users of an incoming call.
