Merge pull request #1134 from MicrosoftDocs/main

Saisang · web-flow · commit a85d81cd667c · 2024-10-30T13:33:29.000+08:00
10/30 11:00 AM IST Publish
diff --git a/articles/ai-services/content-safety/concepts/custom-categories.md b/articles/ai-services/content-safety/concepts/custom-categories.md
@@ -16,38 +16,6 @@ ms.author: pafarley
 
 Azure AI Content Safety lets you create and manage your own content moderation categories for enhanced moderation and filtering that matches your specific policies or use cases.
 
-## Custom categories Training Pipeline Overview
-![image](https://github.com/user-attachments/assets/2e097136-0e37-4b5e-ba59-cafcfd733d72)
-
-### Pipeline Components
-The training pipeline is designed to leverage a combination of universal data assets, user-provided inputs, and advanced GPT model fine-tuning techniques to produce high-quality models tailored to specific tasks.
-#### Data Assets
-Filtered Universal Data: This component gathers datasets from multiple domains to create a comprehensive and diverse dataset collection. The goal is to have a robust data foundation that provides a variety of contexts for model training.
-User Inputs
-Customer Task Metadata: Metadata provided by customers, which defines the specific requirements and context of the task they wish the model to perform.
-Customer Demonstrations: Sample demonstrations provided by customers that illustrate the expected output or behavior for the model. These demonstrations help optimize the model’s response based on real-world expectations.
-
-#### Optimized Customer Prompt
-Based on the customer metadata and demonstrations, an optimized prompt is generated. This prompt refines the inputs provided to the model, aligning it closely with customer needs and enhancing the model’s task performance.
-
-#### GPTX Synthetic Task-Specific Dataset
-Using the optimized prompt and filtered universal data, a synthetic, task-specific dataset is created. This dataset is tailored to the specific task requirements, enabling the model to understand and learn the desired behaviors and patterns.
-### Model Training and Fine-Tuning
-
-#### Model Options: The pipeline supports multiple language models (LM), including Zcode, SLM, or any other language model (LM) suitable for the task.
-Task-Specific Fine-Tuned Model: The selected language model is fine-tuned on the synthetic task-specific dataset to produce a model that is highly optimized for the specific task.
-User Outputs
-
-#### ONNX Model: The fine-tuned model is converted into an ONNX (Open Neural Network Exchange) model format, ensuring compatibility and efficiency for deployment.
-Deployment: The ONNX model is deployed, enabling users to make inference calls and access the model’s predictions. This deployment step ensures that the model is ready for production use in customer applications.
-Key Features of the Training Pipeline
-
-#### Task Specificity: The pipeline allows for the creation of models finely tuned to specific customer tasks, thanks to the integration of customer metadata and demonstrations.
-- Scalability and Flexibility: The pipeline supports multiple language models, providing flexibility in choosing the model architecture best suited to the task.
-- Efficiency in Deployment: The conversion to ONNX format ensures that the final model is lightweight and efficient, optimized for deployment environments.
-- Continuous Improvement: By using synthetic datasets generated from diverse universal data sources, the pipeline can continuously improve model quality and applicability across various domains.
-
-
 ## Types of customization
 
 There are multiple ways to define and use custom categories, which are detailed and compared in this section.
@@ -82,8 +50,6 @@ This implementation works on text content and image content.
 ## How it works
 
 ### [Custom categories (standard) API](#tab/standard)
-![image](https://github.com/user-attachments/assets/5c377ec4-379b-4b41-884c-13524ca126d0)
-
 
 The Azure AI Content Safety custom categories feature uses a multi-step process for creating, training, and using custom content classification models. Here's a look at the workflow:
 
diff --git a/articles/ai-services/content-safety/quickstart-custom-categories.md b/articles/ai-services/content-safety/quickstart-custom-categories.md
@@ -96,7 +96,7 @@ curl -X PUT "<your_endpoint>/contentsafety/text/categories/survival-advice?api-v
 Replace <your_api_key> and <your_endpoint> with your own values, and also **append the version number you obtained from the last step.** Allow enough time for model training: the end-to-end execution of custom category training can take from around five hours to ten hours. Plan your moderation pipeline accordingly. After you receive the response, store the operation ID (referred to as `id`) in a temporary location. This ID will be necessary for retrieving the build status using the **Get status** API in the next section.
 
 ```bash
-curl -X POST "<your_endpoint>/contentsafety/text/categories/survival-advice:build?api-version=2024-09-15-preview**&version={version}**" \
+curl -X POST "<your_endpoint>/contentsafety/text/categories/survival-advice:build?api-version=2024-09-15-preview&version={version}" \
      -H "Ocp-Apim-Subscription-Key: <your_api_key>" \
      -H "Content-Type: application/json"
 ```
diff --git a/articles/ai-services/speech-service/includes/language-support/tts-cnv.md b/articles/ai-services/speech-service/includes/language-support/tts-cnv.md
@@ -39,6 +39,7 @@ ms.author: eur
 | `es-CO` |Spanish (Colombia)| Custom neural voice Pro |
 | `es-ES` | Spanish (Spain) | Custom neural voice Pro<br/><br/>Custom neural voice lite (Preview)<br/><br/>Cross-lingual voice source and target<br/><br/>Multi-style voice |
 | `es-MX` | Spanish (Mexico) | Custom neural voice Pro<br/><br/>Custom neural voice lite (Preview)<br/><br/>Cross-lingual voice source and target<br/><br/>Multi-style voice |
+| `es-US` | Spanish (United States) | Custom neural voice Pro |
 | `fi-FI` | Finnish (Finland) | Custom neural voice Pro<br/><br/>Cross-lingual voice source and target<br/><br/>Multi-style voice |
 | `fr-BE` | French (Belgium) | Custom neural voice Pro<br/><br/>Cross-lingual voice target |
 | `fr-CA` | French (Canada) | Custom neural voice Pro<br/><br/>Cross-lingual voice source and target<br/><br/>Multi-style voice  |
@@ -64,7 +65,8 @@ ms.author: eur
 | `sk-SK` | Slovak (Slovakia) | Custom neural voice Pro<br/><br/>Multi-style voice |
 | `sl-SI` | Slovenian (Slovenia) | Custom neural voice Pro |
 | `sv-SE` | Swedish (Sweden) | Custom neural voice Pro<br/><br/>Cross-lingual voice source and target<br/><br/>Multi-style voice |
-| `ta-IN` | Tamil (India) | Custom neural voice Pro<br/><br/>Cross-lingual voice target |
+| `ta-IN` | Tamil (India) | Custom neural voice Pro<br/><br/>Cross-lingual voice source and target |
+| `ta-MY` | Tamil (Malaysia) | Custom neural voice Pro |
 | `te-IN` | Telugu (India) | Custom neural voice Pro |
 | `th-TH` | Thai (Thailand) | Custom neural voice Pro<br/><br/>Cross-lingual voice source and target<br/><br/>Multi-style voice|
 | `tr-TR` | Turkish (Türkiye) | Custom neural voice Pro<br/><br/>Cross-lingual voice source and target<br/><br/>Multi-style voice |
diff --git a/articles/ai-services/speech-service/includes/release-notes/release-notes-tts.md b/articles/ai-services/speech-service/includes/release-notes/release-notes-tts.md
@@ -55,6 +55,8 @@ Azure AI speech high definition (HD) voices are available in public preview. The
 - Custom neural voice Pro now supports the following new locales:
   - `en-NZ`: English (New Zealand)
   - `es-CL`: Spanish (Chile)
+  - `es-US`: Spanish (United States)
+  - `ta-MY`: Tamil (Malaysia)
   
   See the [language list for Custom neural voice](../../language-support.md?tabs=tts#custom-neural-voice) for the full list of supported locales.  
 
@@ -75,6 +77,7 @@ Azure AI speech high definition (HD) voices are available in public preview. The
   | `pt-PT`               | Portuguese (Portugal)     |
   | `sv-SE`               | Swedish (Sweden)          |
   | `tr-TR`               | Turkish (Türkiye)          |
+  | `ta-IN`               | Tamil (India)             |
   | `zh-HK`               | Chinese (Cantonese, Traditional)       |  
 
   See the [language list for Custom neural voice](../../language-support.md?tabs=tts#custom-neural-voice) for the full list of supported locales.  
diff --git a/articles/search/cognitive-search-skill-textsplit.md b/articles/search/cognitive-search-skill-textsplit.md
@@ -8,7 +8,7 @@ ms.service: azure-ai-search
 ms.custom:
   - ignite-2023
 ms.topic: reference
-ms.date: 10/01/2024
+ms.date: 10/29/2024
 ---
 
 # Text split cognitive skill
@@ -67,7 +67,7 @@ Parameters are case-sensitive.
     "textSplitMode": "pages", 
     "unit": "azureOpenAITokens", 
     "azureOpenAITokenizerParameters":{ 
-        "encoderModelName":"cl100k", 
+        "encoderModelName":"cl100k_base", 
         "allowedSpecialTokens": [ 
             "[START]", 
             "[END]" 
diff --git a/articles/search/vector-search-integrated-vectorization-ai-studio.md b/articles/search/vector-search-integrated-vectorization-ai-studio.md
@@ -8,7 +8,7 @@ ms.service: azure-ai-search
 ms.custom:
   - build-2024
 ms.topic: how-to
-ms.date: 05/08/2024
+ms.date: 10/29/2024
 ---
 
 # How to implement integrated vectorization using models from Azure AI Studio
@@ -161,6 +161,8 @@ You must add the `/v1/embed` path onto the end of the URL that you copied from y
 
 The URI and key are generated when you deploy the model from the catalog. For more information about these values, see [How to deploy Cohere Embed models with Azure AI Studio](/azure/ai-studio/how-to/deploy-models-cohere-embed).
 
+Image URIs aren't currently supported by this integration.
+
 ```json
 {
   "@odata.type": "#Microsoft.Skills.Custom.AmlSkill",