Commit a85d81c

Merge pull request #1134 from MicrosoftDocs/main
10/30 11:00 AM IST Publish
2 parents a68723c + 5933c41 commit a85d81c

6 files changed: 12 additions and 39 deletions

articles/ai-services/content-safety/concepts/custom-categories.md

Lines changed: 0 additions & 34 deletions
@@ -16,38 +16,6 @@ ms.author: pafarley

 Azure AI Content Safety lets you create and manage your own content moderation categories for enhanced moderation and filtering that matches your specific policies or use cases.

-## Custom categories Training Pipeline Overview
-![image](https://github.com/user-attachments/assets/2e097136-0e37-4b5e-ba59-cafcfd733d72)
-
-### Pipeline Components
-The training pipeline is designed to leverage a combination of universal data assets, user-provided inputs, and advanced GPT model fine-tuning techniques to produce high-quality models tailored to specific tasks.
-#### Data Assets
-Filtered Universal Data: This component gathers datasets from multiple domains to create a comprehensive and diverse dataset collection. The goal is to have a robust data foundation that provides a variety of contexts for model training.
-User Inputs
-Customer Task Metadata: Metadata provided by customers, which defines the specific requirements and context of the task they wish the model to perform.
-Customer Demonstrations: Sample demonstrations provided by customers that illustrate the expected output or behavior for the model. These demonstrations help optimize the model’s response based on real-world expectations.
-
-#### Optimized Customer Prompt
-Based on the customer metadata and demonstrations, an optimized prompt is generated. This prompt refines the inputs provided to the model, aligning it closely with customer needs and enhancing the model’s task performance.
-
-#### GPTX Synthetic Task-Specific Dataset
-Using the optimized prompt and filtered universal data, a synthetic, task-specific dataset is created. This dataset is tailored to the specific task requirements, enabling the model to understand and learn the desired behaviors and patterns.
-### Model Training and Fine-Tuning
-
-#### Model Options: The pipeline supports multiple language models (LM), including Zcode, SLM, or any other language model (LM) suitable for the task.
-Task-Specific Fine-Tuned Model: The selected language model is fine-tuned on the synthetic task-specific dataset to produce a model that is highly optimized for the specific task.
-User Outputs
-
-#### ONNX Model: The fine-tuned model is converted into an ONNX (Open Neural Network Exchange) model format, ensuring compatibility and efficiency for deployment.
-Deployment: The ONNX model is deployed, enabling users to make inference calls and access the model’s predictions. This deployment step ensures that the model is ready for production use in customer applications.
-Key Features of the Training Pipeline
-
-#### Task Specificity: The pipeline allows for the creation of models finely tuned to specific customer tasks, thanks to the integration of customer metadata and demonstrations.
-- Scalability and Flexibility: The pipeline supports multiple language models, providing flexibility in choosing the model architecture best suited to the task.
-- Efficiency in Deployment: The conversion to ONNX format ensures that the final model is lightweight and efficient, optimized for deployment environments.
-- Continuous Improvement: By using synthetic datasets generated from diverse universal data sources, the pipeline can continuously improve model quality and applicability across various domains.
-
-
 ## Types of customization

 There are multiple ways to define and use custom categories, which are detailed and compared in this section.
@@ -82,8 +50,6 @@ This implementation works on text content and image content.
 ## How it works

 ### [Custom categories (standard) API](#tab/standard)
-![image](https://github.com/user-attachments/assets/5c377ec4-379b-4b41-884c-13524ca126d0)
-

 The Azure AI Content Safety custom categories feature uses a multi-step process for creating, training, and using custom content classification models. Here's a look at the workflow:

articles/ai-services/content-safety/quickstart-custom-categories.md

Lines changed: 1 addition & 1 deletion
@@ -96,7 +96,7 @@ curl -X PUT "<your_endpoint>/contentsafety/text/categories/survival-advice?api-v
 Replace <your_api_key> and <your_endpoint> with your own values, and also **append the version number you obtained from the last step.** Allow enough time for model training: the end-to-end execution of custom category training can take from around five hours to ten hours. Plan your moderation pipeline accordingly. After you receive the response, store the operation ID (referred to as `id`) in a temporary location. This ID will be necessary for retrieving the build status using the **Get status** API in the next section.

 ```bash
-curl -X POST "<your_endpoint>/contentsafety/text/categories/survival-advice:build?api-version=2024-09-15-preview**&version={version}**" \
+curl -X POST "<your_endpoint>/contentsafety/text/categories/survival-advice:build?api-version=2024-09-15-preview&version={version}" \
 -H "Ocp-Apim-Subscription-Key: <your_api_key>" \
 -H "Content-Type: application/json"
 ```
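
Reviewer note: the change above removes Markdown emphasis markers (`**`) that had leaked into the query string of the build request. As a minimal sketch, the same URL can be assembled programmatically so that the two query parameters are always joined by a plain `&`. The helper name `build_category_url` and the example endpoint are hypothetical placeholders, not part of the quickstart; only the path and parameter names come from the diff.

```python
from urllib.parse import urlencode


def build_category_url(endpoint: str, category: str, api_version: str, version: int) -> str:
    """Return the ':build' URL for one version of a custom category (hypothetical helper)."""
    # Both query parameters are plain key=value pairs joined by '&';
    # formatting characters such as '**' must never reach the URL.
    query = urlencode({"api-version": api_version, "version": version})
    return f"{endpoint.rstrip('/')}/contentsafety/text/categories/{category}:build?{query}"


# Placeholder endpoint and version number, for illustration only.
url = build_category_url(
    "https://example.cognitiveservices.azure.com/",
    "survival-advice",
    "2024-09-15-preview",
    1,
)
print(url)
```

Building the query with `urlencode` rather than string concatenation keeps stray characters out of the parameter values, which is the class of error this hunk corrects.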

articles/ai-services/speech-service/includes/language-support/tts-cnv.md

Lines changed: 3 additions & 1 deletion
@@ -39,6 +39,7 @@ ms.author: eur
 | `es-CO` |Spanish (Colombia)| Custom neural voice Pro |
 | `es-ES` | Spanish (Spain) | Custom neural voice Pro<br/><br/>Custom neural voice lite (Preview)<br/><br/>Cross-lingual voice source and target<br/><br/>Multi-style voice |
 | `es-MX` | Spanish (Mexico) | Custom neural voice Pro<br/><br/>Custom neural voice lite (Preview)<br/><br/>Cross-lingual voice source and target<br/><br/>Multi-style voice |
+| `es-US` | Spanish (United States)| Custom neural voice Pro|
 | `fi-FI` | Finnish (Finland) | Custom neural voice Pro<br/><br/>Cross-lingual voice source and target<br/><br/>Multi-style voice |
 | `fr-BE` | French (Belgium) | Custom neural voice Pro<br/><br/>Cross-lingual voice target |
 | `fr-CA` | French (Canada) | Custom neural voice Pro<br/><br/>Cross-lingual voice source and target<br/><br/>Multi-style voice |
@@ -64,7 +65,8 @@ ms.author: eur
 | `sk-SK` | Slovak (Slovakia) | Custom neural voice Pro<br/><br/>Multi-style voice |
 | `sl-SI` | Slovenian (Slovenia) | Custom neural voice Pro |
 | `sv-SE` | Swedish (Sweden) | Custom neural voice Pro<br/><br/>Cross-lingual voice source and target<br/><br/>Multi-style voice |
-| `ta-IN` | Tamil (India) | Custom neural voice Pro<br/><br/>Cross-lingual voice target |
+| `ta-IN` | Tamil (India) | Custom neural voice Pro<br/><br/>Cross-lingual voice source and target |
+| `ta-MY` |Tamil (Malaysia)| Custom neural voice Pro|
 | `te-IN` | Telugu (India) | Custom neural voice Pro |
 | `th-TH` | Thai (Thailand) | Custom neural voice Pro<br/><br/>Cross-lingual voice source and target<br/><br/>Multi-style voice|
 | `tr-TR` | Turkish (Türkiye) | Custom neural voice Pro<br/><br/>Cross-lingual voice source and target<br/><br/>Multi-style voice |

articles/ai-services/speech-service/includes/release-notes/release-notes-tts.md

Lines changed: 3 additions & 0 deletions
@@ -55,6 +55,8 @@ Azure AI speech high definition (HD) voices are available in public preview. The
 - Custom neural voice Pro now supports the following new locales:
 - `en-NZ`: English (New Zealand)
 - `es-CL`: Spanish (Chile)
+- `es-US`: Spanish (United States)
+- `ta-MY`: Tamil (Malaysia)

 See the [language list for Custom neural voice](../../language-support.md?tabs=tts#custom-neural-voice) for the full list of supported locales.

@@ -75,6 +77,7 @@ Azure AI speech high definition (HD) voices are available in public preview. The
 | `pt-PT` | Portuguese (Portugal) |
 | `sv-SE` | Swedish (Sweden) |
 | `tr-TR` | Turkish (Türkiye) |
+| `ta-IN` | Tamil (India) |
 | `zh-HK` | Chinese (Cantonese, Traditional) |

 See the [language list for Custom neural voice](../../language-support.md?tabs=tts#custom-neural-voice) for the full list of supported locales.

articles/search/cognitive-search-skill-textsplit.md

Lines changed: 2 additions & 2 deletions
@@ -8,7 +8,7 @@ ms.service: azure-ai-search
 ms.custom:
 - ignite-2023
 ms.topic: reference
-ms.date: 10/01/2024
+ms.date: 10/29/2024
 ---

 # Text split cognitive skill
@@ -67,7 +67,7 @@ Parameters are case-sensitive.
 "textSplitMode": "pages",
 "unit": "azureOpenAITokens",
 "azureOpenAITokenizerParameters":{
-"encoderModelName":"cl100k",
+"encoderModelName":"cl100k_base",
 "allowedSpecialTokens": [
 "[START]",
 "[END]"
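
Reviewer note: the hunk above corrects the encoder name from `cl100k` to `cl100k_base`. A small pre-flight check, sketched below, could catch this kind of truncated name before a skillset is submitted. The helper `tokenizer_parameters` is hypothetical, and the set of accepted names is an assumption that mirrors encoding names from OpenAI's tiktoken library; only `cl100k_base` is confirmed by this diff.

```python
import json

# Assumption: this set mirrors tiktoken encoding names and may not match
# exactly what the skill accepts. Only "cl100k_base" appears in the diff.
KNOWN_ENCODERS = {"r50k_base", "p50k_base", "p50k_edit", "cl100k_base"}


def tokenizer_parameters(encoder_model_name, allowed_special_tokens):
    """Build the azureOpenAITokenizerParameters fragment, rejecting unknown encoder names."""
    if encoder_model_name not in KNOWN_ENCODERS:
        raise ValueError(f"unknown encoderModelName: {encoder_model_name!r}")
    return {
        "encoderModelName": encoder_model_name,
        "allowedSpecialTokens": list(allowed_special_tokens),
    }


# The corrected value from the diff passes the check.
params = tokenizer_parameters("cl100k_base", ["[START]", "[END]"])
print(json.dumps({"azureOpenAITokenizerParameters": params}, indent=2))

# The pre-fix value is rejected instead of being sent to the service.
try:
    tokenizer_parameters("cl100k", ["[START]", "[END]"])
except ValueError as err:
    print(err)
```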

articles/search/vector-search-integrated-vectorization-ai-studio.md

Lines changed: 3 additions & 1 deletion
@@ -8,7 +8,7 @@ ms.service: azure-ai-search
 ms.custom:
 - build-2024
 ms.topic: how-to
-ms.date: 05/08/2024
+ms.date: 10/29/2024
 ---

 # How to implement integrated vectorization using models from Azure AI Studio
@@ -161,6 +161,8 @@ You must add the `/v1/embed` path onto the end of the URL that you copied from y

 The URI and key are generated when you deploy the model from the catalog. For more information about these values, see [How to deploy Cohere Embed models with Azure AI Studio](/azure/ai-studio/how-to/deploy-models-cohere-embed).

+Note that image URIs are not supported by this integration at this time.
+
 ```json
 {
 "@odata.type": "#Microsoft.Skills.Custom.AmlSkill",
