
Commit 9ebd5c1

another dirty
2 parents: b2ec3d4 + 054c7ee

79 files changed: +520 -406 lines changed

articles/ai-services/openai/reference.md

Lines changed: 25 additions & 2 deletions
@@ -486,7 +486,6 @@ The following parameters can be used inside of the `parameters` field inside of
|--|--|--|--|--|
| `type` | string | Required | null | The data source to be used for the Azure OpenAI on your data feature. For Azure AI Search the value is `AzureCognitiveSearch`. For Azure Cosmos DB for MongoDB vCore, the value is `AzureCosmosDB`. |
| `indexName` | string | Required | null | The search index to be used. |
-| `fieldsMapping` | dictionary | Optional for Azure AI Search. Required for Azure Cosmos DB for MongoDB vCore. | null | Index data column mapping. When using Azure Cosmos DB for MongoDB vCore, the value `vectorFields` is required, which indicates the fields that store vectors. |
| `inScope` | boolean | Optional | true | If set, this value limits responses to content specific to the grounding data. |
| `topNDocuments` | number | Optional | 5 | Specifies the number of top-scoring documents from your data index used to generate responses. You might want to increase the value when you have short documents or want to provide more context. This is the *retrieved documents* parameter in Azure OpenAI Studio. |
| `semanticConfiguration` | string | Optional | null | The semantic search configuration. Only required when `queryType` is set to `semantic` or `vectorSemanticHybrid`. |
@@ -498,13 +497,36 @@ The following parameters can be used inside of the `parameters` field inside of
| `strictness` | number | Optional | 3 | Sets the threshold to categorize documents as relevant to your queries. Raising the value means a higher threshold for relevance and filters out more of the less relevant documents from responses. Setting this value too high might cause the model to fail to generate responses due to limited available documents. |


-**The following parameters are used for Azure AI Search only**
+**The following parameters are used for Azure AI Search**

| Parameters | Type | Required? | Default | Description |
|--|--|--|--|--|
| `endpoint` | string | Required | null | Azure AI Search only. The data source endpoint. |
| `key` | string | Required | null | Azure AI Search only. One of the Azure AI Search admin keys for your service. |
| `queryType` | string | Optional | simple | Indicates which query option will be used for Azure AI Search. Available types: `simple`, `semantic`, `vector`, `vectorSimpleHybrid`, `vectorSemanticHybrid`. |
+| `fieldsMapping` | dictionary | Optional for Azure AI Search. | null | Defines which [fields](./concepts/use-your-data.md?tabs=ai-search#index-field-mapping) you want to map when you add your data source. |
+
+The following parameters are used inside of the `fieldsMapping` field.
+
+| Parameters | Type | Required? | Default | Description |
+|--|--|--|--|--|
+| `titleField` | string | Optional | null | The field in your index that contains the original title of each document. |
+| `urlField` | string | Optional | null | The field in your index that contains the original URL of each document. |
+| `filepathField` | string | Optional | null | The field in your index that contains the original file name of each document. |
+| `contentFields` | dictionary | Optional | null | The fields in your index that contain the main text content of each document. |
+| `contentFieldsSeparator` | string | Optional | null | The separator for your content fields. The default is `\n`. |
+
+```json
+"fieldsMapping": {
+    "titleField": "myTitleField",
+    "urlField": "myUrlField",
+    "filepathField": "myFilePathField",
+    "contentFields": [
+        "myContentField"
+    ],
+    "contentFieldsSeparator": "\n"
+}
+```

**The following parameters are used for Azure Cosmos DB for MongoDB vCore**

@@ -516,6 +538,7 @@ The following parameters can be used inside of the `parameters` field inside of
| `containerName` | string | Required | null | Azure Cosmos DB for MongoDB vCore only. The Azure Cosmos DB for MongoDB vCore container name in the database. |
| `type` (found inside of `embeddingDependencyType`) | string | Required | null | Indicates the embedding model dependency. |
| `deploymentName` (found inside of `embeddingDependencyType`) | string | Required | null | The embedding model deployment name. |
+| `fieldsMapping` | dictionary | Required for Azure Cosmos DB for MongoDB vCore. | null | Index data column mapping. When using Azure Cosmos DB for MongoDB vCore, the value `vectorFields` is required, which indicates the fields that store vectors. |

### Start an ingestion job

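For orientation, here's a minimal sketch (not part of this commit) of how a `fieldsMapping` block might sit inside a full `dataSources` entry of a chat-completions-with-extensions request. The endpoint shape, API version, and all placeholder values are illustrative assumptions.

```python
import requests

# Placeholder values; all of these are assumptions for illustration.
resource = "https://<your-resource>.openai.azure.com"
deployment = "<chat-deployment>"
api_version = "2023-08-01-preview"  # adjust to the API version you target
url = f"{resource}/openai/deployments/{deployment}/extensions/chat/completions?api-version={api_version}"

body = {
    "messages": [{"role": "user", "content": "What do my documents say about X?"}],
    "dataSources": [
        {
            "type": "AzureCognitiveSearch",  # value for Azure AI Search, per the table above
            "parameters": {
                "endpoint": "https://<your-search>.search.windows.net",
                "key": "<search-admin-key>",
                "indexName": "<index-name>",
                "queryType": "simple",
                "fieldsMapping": {
                    "titleField": "myTitleField",
                    "contentFields": ["myContentField"],
                    "contentFieldsSeparator": "\n",
                },
            },
        }
    ],
}

response = requests.post(url, json=body, headers={"api-key": "<azure-openai-api-key>"})
print(response.json())
```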
articles/ai-services/speech-service/embedded-speech-performance-evaluations.md

Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@ (new file; all lines added)
---
title: Performance evaluations for Embedded Speech - Speech service
titleSuffix: Azure AI services
description: Learn how to evaluate performance of embedded speech models on your target devices.
author: eric-urban
manager: nitinme
ms.service: azure-ai-speech
ms.topic: how-to
ms.date: 11/28/2023
ms.author: eur
---

# Evaluating performance of Embedded Speech

Embedded speech models run fully on your target devices. Understanding the performance characteristics of these models on your devices’ hardware can be critical to delivering low-latency experiences within your products and applications. This guide provides information to help answer the question, "Is my device suitable to run embedded speech to text and speech translation models?"

## Metrics & terminology

**Real-time factor (RTF)** – The real-time factor (RTF) of a device measures how fast the embedded speech model can process audio input. It's the ratio of the processing time to the audio length. For example, if a device processes a 1-minute audio file in 30 seconds, the RTF is 0.5. This metric evaluates the computational power of the device for running embedded speech models. It can help identify devices that are too slow to support the models. Measurement of this metric should only be done using file-based input rather than real-time microphone input.

To support real-time & interactive speech experiences, the device should have an RTF of `1` or lower. An RTF value higher than `1` means that the device can't keep up with the audio input and will cause poor user experiences.

When measuring the RTF of a device, it's important to measure multiple samples and analyze the distribution across percentiles. This allows you to capture the effect of variations in the device's behavior, like different CPU clock speeds due to thermal throttling. The predefined measurement tests outlined in [Measuring the real-time factor on your device](#measuring-the-real-time-factor-on-your-device) automatically measure the RTF for each speech recognition result, yielding a sufficiently large sample size.

**User-perceived latency (UPL)** – The user-perceived latency (UPL) of speech to text is the time between a word being spoken and the word being shown in the recognition results.

## Factors that affect performance

**Device specifications** – The specifications of your device play a key role in whether embedded speech models can run without performance issues. CPU clock speed, architecture (for example, x64 or ARM), and memory can all affect model inference speed.

**CPU load** – In most cases, your device runs other applications in parallel to the application where embedded speech models are integrated. The amount of CPU load your device experiences when idle and at peak can also affect performance.

For example, if the device is under moderate to high CPU load from all the other applications running on it, it's possible to encounter performance issues when running embedded speech in addition to those applications, even with a powerful processor.

**Memory load** – An embedded speech to text model consumes between 200 and 300 MB of memory at runtime. If your device has less memory available than that for the embedded speech process to use, frequent fallbacks to virtual memory and paging can introduce additional latency. This can affect both the real-time factor and user-perceived latency.

## Built-in performance optimizations

All embedded speech to text models come with a Voice Activity Detector (VAD) component that aims to filter out silence and non-speech content from the audio input. The goal is to reduce the CPU load and the processing time for other speech to text model components.

The VAD component is always on and doesn't need any configuration from you as the developer. It works best when the audio input has non-negligible amounts of silence or non-speech content, which is common in scenarios like captioning, commanding, and dictation.

## Measuring the real-time factor on your device

For all embedded speech supported platforms, a code sample is available on GitHub that includes a performance measurement mode. In this mode, the goal is to measure the real-time factor (RTF) of your device by controlling as many variables as possible:

- **Model** – The English (United States) model is used for measurement. Models for all other supported locales follow similar performance characteristics, so measuring with the English (United States) model is sufficient.

- **Audio input** – A prebuilt audio file designed for RTF measurements is made available as a supplemental download to the sample code.

- **Measurement mechanism** – The start and stop markers of time measurement are preconfigured in the sample to ensure accuracy and ease of comparing results across different devices & test iterations.

This measurement should be done with the sample running directly on your target device(s), with no code changes other than specifying your model paths & encryption key. The device should be in a state that represents a real end-user state when embedded speech would be used (for example, other active applications, CPU & memory load, etc.).

Running the sample outputs performance metrics to the console. The full suite of metrics includes the real-time factor along with other properties like CPU usage and memory consumption. Each metric is defined and explained below.

### Instruction set metrics

| Metric | Description | Notes |
|--------|-------------|-------|
| AVX512Supported | True if the CPU supports the AVX512 instruction set. | This flag is for x64 platforms. ONNX Runtime has optimizations for the various instruction sets, and having this information can help diagnose inconsistencies. |
| AVXSupported | True if the CPU supports the AVX instruction set. | This flag is for x64 platforms. ONNX Runtime has optimizations for the various instruction sets, and having this information can help diagnose inconsistencies. |
| AVX2Supported | True if the CPU supports the AVX2 instruction set. | This flag is for x64 platforms. ONNX Runtime has optimizations for the various instruction sets, and having this information can help diagnose inconsistencies. |
| SSE3Available | True if the CPU supports the SSE3 instruction set. | This flag is for x64 platforms. ONNX Runtime has optimizations for the various instruction sets, and having this information can help diagnose inconsistencies. |
| NEONAvailable | True if the CPU supports the NEON instruction set. | This flag is for ARM processor platforms. ONNX Runtime has optimizations for the various instruction sets, and having this information can help diagnose inconsistencies. |
| NPU | Name of the Neural Processing Unit, or N/A if none is found. | This flag is for hardware acceleration. |

### Memory and CPU metrics

| Metric | Description | Notes |
|--------|-------------|-------|
| PagefileUsage | Amount of page file used by the process. Implemented for Linux and Windows. | Values are relative to the machine configuration. |
| WorkingSetSize | Amount of memory used by the process. | |
| ProcessCPUUsage | Aggregate CPU usage for the process. | Includes all threads in the process, including Speech SDK and UI threads. Aggregated across all cores. |
| ThreadCPUUsage | Aggregate CPU usage for the speech recognition or speech translation thread. | |

### Performance metrics

| Metric | Description | Notes |
|--------|-------------|-------|
| RealTimeFactor | Measures how much faster than real time the embedded speech engine processes audio. Includes audio loading time. | Values greater than `1` indicate that the engine is processing audio slower than real time. Values less than `1` indicate the engine is processing audio faster than real time. This value should only be analyzed in file-based input mode. It shouldn't be analyzed in streaming input mode. |
| StreamingRealTimeFactor | Measures how much faster than real time the engine processes audio. Excludes audio loading time. | Values greater than `1` indicate that the engine is processing audio slower than real time. Values less than `1` indicate the engine is processing audio faster than real time. |
articles/ai-services/speech-service/embedded-speech.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ Embedded Speech is designed for on-device [speech to text](speech-to-text.md) an

## Platform requirements

-Embedded speech is included with the Speech SDK (version 1.24.1 and higher) for C#, C++, and Java. Refer to the general [Speech SDK installation requirements](#embedded-speech-sdk-packages) for programming language and target platform specific details.
+Embedded speech is included with the Speech SDK (version 1.24.1 and higher) for C#, C++, and Java. Refer to the general [Speech SDK installation requirements](quickstarts/setup-platform.md#platform-requirements) for programming language and target platform specific details.

**Choose your target environment**

articles/ai-services/speech-service/toc.yml

Lines changed: 5 additions & 1 deletion
@@ -449,7 +449,11 @@ items:
      href: swagger-documentation.md
      displayName: rest, swagger
    - name: Embedded Speech
-      href: embedded-speech.md
+      items:
+        - name: Overview
+          href: embedded-speech.md
+        - name: Performance evaluations
+          href: embedded-speech-performance-evaluations.md
    - name: Power automate batch transcription
      href: power-automate-batch-transcription.md
    - name: Containers

articles/ai-studio/how-to/data-add.md

Lines changed: 6 additions & 6 deletions
@@ -114,7 +114,7 @@ These steps explain how to create a File typed data in the Azure AI Studio:
To create a data asset of the File type, use the following code and update the `<>` placeholders with your information.

```python
-from azure.ai.generative import AIClient
+from azure.ai.resources.client import AIClient
from azure.ai.generative.entities import Data
from azure.ai.generative.constants import AssetTypes
from azure.identity import DefaultAzureCredential

@@ -174,7 +174,7 @@ Use these steps to create a Folder typed data in the Azure AI Studio:
To create a data asset of the Folder type, use the following code and update the `<>` placeholders with your information.

```python
-from azure.ai.generative import AIClient
+from azure.ai.resources.client import AIClient
from azure.ai.generative.entities import Data
from azure.ai.generative.constants import AssetTypes
from azure.identity import DefaultAzureCredential

@@ -240,7 +240,7 @@ To archive *all versions* of the data under a given name, use:
# [Python SDK](#tab/python)

```python
-from azure.ai.generative import AIClient
+from azure.ai.resources.client import AIClient
from azure.ai.generative.entities import Data
from azure.ai.generative.constants import AssetTypes
from azure.identity import DefaultAzureCredential

@@ -265,7 +265,7 @@ To archive a specific data version, use:
# [Python SDK](#tab/python)

```python
-from azure.ai.generative import AIClient
+from azure.ai.resources.client import AIClient
from azure.ai.generative.entities import Data
from azure.ai.generative.constants import AssetTypes
from azure.identity import DefaultAzureCredential

@@ -294,7 +294,7 @@ To restore *all versions* of the data under a given name, use:
# [Python SDK](#tab/python)

```python
-from azure.ai.generative import AIClient
+from azure.ai.resources.client import AIClient
from azure.ai.generative.entities import Data
from azure.ai.generative.constants import AssetTypes
from azure.identity import DefaultAzureCredential

@@ -321,7 +321,7 @@ To restore a specific data version, use:
# [Python SDK](#tab/python)

```python
-from azure.ai.generative import AIClient
+from azure.ai.resources.client import AIClient
from azure.ai.generative.entities import Data
from azure.ai.generative.constants import AssetTypes
from azure.identity import DefaultAzureCredential
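
# --- Illustrative continuation (not part of this commit) ---
# A minimal sketch of the corrected import in use when registering a
# File-typed data asset. The Data constructor arguments and the
# client.data.create_or_update call are assumptions modeled on the
# surrounding snippets and the azure-ai-ml pattern.
client = AIClient.from_config(DefaultAzureCredential())
my_data = Data(
    name="<data-name>",            # placeholder
    path="<path-to-local-file>",   # placeholder
    type=AssetTypes.FILE,
)
client.data.create_or_update(my_data)  # assumed operation group
```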

articles/ai-studio/how-to/flow-deploy.md

Lines changed: 1 addition & 1 deletion
@@ -83,7 +83,7 @@ You can use the Azure AI Generative SDK to deploy a prompt flow as an online end

```python
# Import required dependencies
-from azure.ai.generative import AIClient
+from azure.ai.resources.client import AIClient
from azure.ai.generative.entities.deployment import Deployment
from azure.ai.generative.entities.models import PromptflowModel
from azure.identity import InteractiveBrowserCredential as Credential
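
# --- Illustrative continuation (not part of this commit) ---
# A rough sketch of how these imports might come together to deploy a
# prompt flow; the Deployment and PromptflowModel constructor arguments
# and the client.deployments.create_or_update call are assumptions.
client = AIClient.from_config(Credential())
deployment = Deployment(
    name="<deployment-name>",  # placeholder
    model=PromptflowModel(path="<path-to-promptflow-folder>"),  # assumed argument
)
client.deployments.create_or_update(deployment)  # assumed operation group
```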

articles/ai-studio/how-to/simulator-interaction-data.md

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@ First we set up the system large language model, which acts as the "agent" simul

```python
from azure.identity import DefaultAzureCredential
-from azure.ai.generative import AIClient
+from azure.ai.resources.client import AIClient
from azure.ai.generative.entities import AzureOpenAIModelConfiguration

credential = DefaultAzureCredential()
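
# --- Illustrative continuation (not part of this commit) ---
# A hedged sketch of the setup around this import; the
# AzureOpenAIModelConfiguration parameter names below are assumptions.
client = AIClient.from_config(credential)
system_model = AzureOpenAIModelConfiguration(
    api_version="2023-05-15",            # assumed
    deployment_name="<gpt-deployment>",  # assumed
    model_name="gpt-4",                  # assumed
    max_tokens=300,
    temperature=0.0,
)
```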

articles/ai-studio/includes/evaluations/from-data/python.md

Lines changed: 1 addition & 1 deletion
@@ -134,7 +134,7 @@ Before you call the `evaluate()` function, your environment needs to set up your

```python
from azure.identity import DefaultAzureCredential
-from azure.ai.generative import AIClient
+from azure.ai.resources.client import AIClient

client = AIClient.from_config(DefaultAzureCredential())
```
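
The commit only shows the client setup; as a loose sketch, a subsequent `evaluate()` call might look like the following, where the import path and every parameter name are assumptions for illustration.

```python
from azure.identity import DefaultAzureCredential
from azure.ai.resources.client import AIClient

client = AIClient.from_config(DefaultAzureCredential())

# Hypothetical call; the import path and parameters below are assumptions.
from azure.ai.generative.evaluate import evaluate

result = evaluate(
    evaluation_name="<evaluation-name>",
    data="<path-to-dataset.jsonl>",
    task_type="qa",
)
print(result)
```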

articles/api-management/breaking-changes/stv1-platform-retirement-august-2024.md

Lines changed: 4 additions & 3 deletions
@@ -22,7 +22,7 @@ The following table summarizes the compute platforms currently used for instance
| `stv1` | Single-tenant v1 | Azure-allocated compute infrastructure | Developer, Basic, Standard, Premium |
| `mtv1` | Multi-tenant v1 | Shared infrastructure that supports native autoscaling and scaling down to zero in times of no traffic | Consumption |

-For continued support and to take advantage of upcoming features, customers must migrate their Azure API Management instances from the `stv1` compute platform to the `stv2` compute platform. The `stv2` compute platform comes with additional features and improvements such as support for Azure Private Link and other networking features.
+**For continued support and to take advantage of upcoming features, customers must [migrate](../migrate-stv1-to-stv2.md) their Azure API Management instances from the `stv1` compute platform to the `stv2` compute platform.** The `stv2` compute platform comes with additional features and improvements such as support for Azure Private Link and other networking features.

New instances created in service tiers other than the Consumption tier are mostly hosted on the `stv2` platform already. Existing instances on the `stv1` compute platform will continue to work normally until the retirement date, but those instances won’t receive the latest features available to the `stv2` platform. Support for `stv1` instances will be retired by 31 August 2024.

@@ -43,10 +43,11 @@ Support for API Management instances hosted on the `stv1` platform will be retir

**Migrate all your existing instances hosted on the `stv1` compute platform to the `stv2` compute platform by 31 August 2024.**

-If you have existing instances hosted on the `stv1` platform, follow our [migration guide](../migrate-stv1-to-stv2.md) to ensure a successful migration.
+If you have existing instances hosted on the `stv1` platform, follow our **[migration guide](../migrate-stv1-to-stv2.md)** to ensure a successful migration.

[!INCLUDE [api-management-migration-support](../../../includes/api-management-migration-support.md)]

## Related content

-See all [upcoming breaking changes and feature retirements](overview.md).
+* [Migrate from stv1 platform to stv2](../migrate-stv1-to-stv2.md)
+* See all [upcoming breaking changes and feature retirements](overview.md).

articles/azure-relay/includes/relay-create-hybrid-connection-portal.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
author: spelluru
ms.service: service-bus-relay
ms.topic: include
-ms.date: 08/10/2023
+ms.date: 01/04/2024
ms.author: spelluru
---
On the **Relay** page for your namespace, follow these steps to create a hybrid connection.
