Commit 2e2a5ec

Merge pull request #4649 from laujan/patch-1
Update overview.md
2 parents 8c6c8aa + ac9a60f commit 2e2a5ec

22 files changed: +150 -184 lines changed

articles/ai-services/translator/custom-translator/azure-ai-foundry/beginners-guide.md

Lines changed: 9 additions & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -1,7 +1,7 @@
11
---
2-
title: Azure AI Translator Custom translation for beginners
2+
title: Azure AI Translator custom translation for beginners
33
titleSuffix: Azure AI services
4-
description: User guide for understanding the end-to-end customized machine translation process.
4+
description: User guide for understanding the end-to-end customized machine translation process using Azure AI Foundry.
55
author: laujan
66
manager: nitinme
77
ms.service: azure-ai-translator
@@ -10,7 +10,7 @@ ms.date: 05/19/2025
1010
ms.topic: overview
1111
---
1212

13-
# Azure AI Translator Custom translation for beginners
13+
# Azure AI Translator custom translation for beginners
1414

1515
[Custom translation](overview.md) enables you to build a translation system that reflects your business, industry, and domain-specific terminology and style. Training and deploying a custom system is easy and doesn't require any programming skills. The customized translation system seamlessly integrates into your existing applications, workflows, and websites and is available on Azure through the same cloud-based [Microsoft Text Translation API](../../text-translation/reference/v4/translate-api.md) service that powers billions of translations every day.
1616

@@ -84,7 +84,7 @@ A BLEU score is a number between zero and 100. A score of zero indicates a low q
8484

8585
## What happens if I don't submit tuning or testing data?
8686

87-
Tuning and test sentences are optimally representative of what you plan to translate in the future. If you don't submit any tuning or testing data, Custom translation automatically excludes sentences from your training documents to use as tuning and test data.
87+
Tuning and test sentences are optimally representative of what you plan to translate in the future. If you don't submit any tuning or testing data, custom translation automatically excludes sentences from your training documents to use as tuning and test data.
8888

8989
| System-generated | Manual-selection |
9090
|---|---|
@@ -93,13 +93,13 @@ Tuning and test sentences are optimally representative of what you plan to trans
9393
| Easy to redo when you grow or shrink the domain. | Allows for more data and better domain coverage.|
9494
|Changes each training run.| Remains static over repeated training runs|
9595

96-
## How is training material processed by Custom translation?
96+
## How is training material processed by custom translation?
9797

98-
To prepare for training, documents undergo a series of processing and filtering steps. Knowledge of the filtering process can help with understanding the sentence count displayed as well as the steps you can take to prepare training documents for training with Custom translation. The filtering steps are as follows:
98+
To prepare for training, documents undergo a series of processing and filtering steps. Knowledge of the filtering process can help with understanding the sentence count displayed as well as the steps you can take to prepare training documents for training with custom translation. The filtering steps are as follows:
9999

100100
* ### Sentence alignment
101101

102-
If your document isn't in `XLIFF`, `XLSX`, `TMX`, or `ALIGN` format, Custom translation aligns the sentences of your source and target documents to each other, sentence-by-sentence. Translator doesn't perform document alignment—it follows your naming convention for the documents to find a matching document in the other language. Within the source text, Custom translation tries to find the corresponding sentence in the target language. It uses document markup like embedded HTML tags to help with the alignment.
102+
If your document isn't in `XLIFF`, `XLSX`, `TMX`, or `ALIGN` format, custom translation aligns the sentences of your source and target documents to each other, sentence-by-sentence. Translator doesn't perform document alignment—it follows your naming convention for the documents to find a matching document in the other language. Within the source text, custom translation tries to find the corresponding sentence in the target language. It uses document markup like embedded HTML tags to help with the alignment.
103103

104104
If you see a large discrepancy between the number of sentences in the source and target documents, your source document might not be parallel, or couldn't be aligned. Document pairs with a large difference (>10%) in the number of sentences on each side warrant a second look to make sure they're indeed parallel.
105105

@@ -139,11 +139,11 @@ To prepare for training, documents undergo a series of processing and filtering
139139

140140
* ### Invalid characters
141141

142-
Custom translation removes sentences that contain Unicode character U+FFFD. The character U+FFFD indicates a failed encoding conversion.
142+
custom translation removes sentences that contain Unicode character U+FFFD. The character U+FFFD indicates a failed encoding conversion.
143143

144144
* ### Invalid HTML tags
145145

146-
Custom translation removes valid tags during training. Invalid tags cause unpredictable results and should be manually removed.
146+
custom translation removes valid tags during training. Invalid tags cause unpredictable results and should be manually removed.
147147

148148
## What steps should I take before uploading data?
149149
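Reviewer note on the "Invalid characters" step in this file: the text says sentences containing U+FFFD are removed because that character signals a failed encoding conversion. A minimal Python sketch of such a pre-upload check follows. The function name `drop_invalid_pairs` is hypothetical, and dropping the whole source/target pair when either side is affected is an assumption; the service's own filter is not shown in this commit.

```python
# U+FFFD is the Unicode replacement character, typically produced when
# bytes could not be decoded with the expected encoding.
REPLACEMENT_CHAR = "\ufffd"


def drop_invalid_pairs(pairs):
    """Keep only (source, target) pairs where neither side contains U+FFFD.

    Assumption: a pair is dropped as a unit if either sentence is affected.
    """
    return [
        (src, tgt)
        for src, tgt in pairs
        if REPLACEMENT_CHAR not in src and REPLACEMENT_CHAR not in tgt
    ]


pairs = [("Hello", "Bonjour"), ("Caf\ufffd", "Café")]
clean = drop_invalid_pairs(pairs)
```

Running a check like this before upload shows how many sentences would be lost to encoding problems, which is usually a sign that the file should be re-exported as UTF-8 or UTF-16.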

articles/ai-services/translator/custom-translator/azure-ai-foundry/concepts/bleu-score.md

Lines changed: 6 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
---
2-
title: "BLEU score - Custom translation"
2+
title: Azure AI Foundry custom translation BLEU score
33
titleSuffix: Azure AI services
44
description: The BLEU score measures the differences between machine translation and human-created reference translations of the same source sentence.
55
author: laujan
@@ -9,34 +9,22 @@ ms.topic: conceptual
99
ms.date: 05/19/2025
1010
ms.author: lajanuar
1111
ms.custom: cogserv-non-critical-translator
12-
#Customer intent: As an Custom translation user, I want to understand how BLEU score works so that I understand system test outcome better.
12+
#Customer intent: As an custom translation user, I want to understand how BLEU score works so that I understand system test outcome better.
1313
---
1414

15-
# BLEU score
15+
# Azure AI Foundry custom translation BLEU score
1616

1717
[BLEU (Bilingual Evaluation Understudy)](https://en.wikipedia.org/wiki/BLEU) is a measurement of the difference between an automatic translation and human-created reference translations of the same source sentence.
1818

1919
## Scoring process
2020

21-
The BLEU algorithm compares consecutive phrases of the automatic translation
22-
with the consecutive phrases it finds in the reference translation, and counts
23-
the number of matches, in a weighted fashion. These matches are position
24-
independent. A higher match degree indicates a higher degree of similarity with
25-
the reference translation, and higher score. Intelligibility and grammatical correctness aren't taken into account.
21+
The BLEU algorithm compares consecutive phrases of the automatic translation with the consecutive phrases it finds in the reference translation, and counts the number of matches, in a weighted fashion. These matches are position independent. A higher match degree indicates a higher degree of similarity with the reference translation, and higher score. Intelligibility and grammatical correctness aren't taken into account.
2622

2723
## How BLEU works?
2824

29-
The BLEU score's strength is that it correlates well with human judgment. BLEU averages out
30-
individual sentence judgment errors over a test corpus, rather than attempting
31-
to devise the exact human judgment for every sentence.
25+
The BLEU score's strength is that it correlates well with human judgment. BLEU averages out individual sentence judgment errors over a test corpus, rather than attempting to devise the exact human judgment for every sentence.
3226

33-
A more extensive discussion of BLEU scores is [here](https://youtu.be/-UqDljMymMg).
34-
35-
BLEU results depend strongly on the breadth of your domain; consistency of
36-
test, training and tuning data; and how much data you have
37-
available for training. If your models are trained within a narrow domain, and
38-
your training data is consistent with your test data, you can expect a high
39-
BLEU score.
27+
For a more extensive discussion of BLEU scores, *see* [Microsoft Translator Hub - Discussion of BLEU Score](https://youtu.be/-UqDljMymMg). BLEU results depend strongly on the breadth of your domain; consistency of test, training and tuning data; and how much data you have available for training. If your models are trained within a narrow domain, and your training data is consistent with your test data, you can expect a high BLEU score.
4028

4129
>[!NOTE]
4230
>A comparison between BLEU scores is only justifiable when BLEU results are compared with the same Test set, the same language pair, and the same MT engine. A BLEU score from a different test set is bound to be different.
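Reviewer note on the scoring description in this file: the position-independent, weighted matching of consecutive phrases can be illustrated with a simplified sentence-level BLEU. This is a sketch under stated assumptions, not the scorer used by the service: it uses whitespace tokenization, a single reference, uniform weights over 1- to 4-grams, no smoothing, and the standard brevity penalty, and it scales the result to the 0 to 100 range used in the beginner's guide. The names `ngrams` and `bleu` are illustrative.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every run of n consecutive tokens, regardless of position."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU on a 0-100 scale (no smoothing)."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clipped matches: a candidate n-gram is credited at most as many
        # times as it occurs in the reference.
        matches = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matches == 0:
            return 0.0  # without smoothing, one empty order zeroes the score
        log_precisions.append(math.log(matches / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the n-gram precisions with uniform weights.
    return 100 * brevity_penalty * math.exp(sum(log_precisions) / max_n)


perfect = bleu("the cat sat on the mat", "the cat sat on the mat")
partial = bleu("the cat sat on a mat", "the cat sat on the mat")
```

Because the score only counts overlapping n-grams, a fluent translation that uses different wording from the reference can still score low, which matches the statement that intelligibility and grammatical correctness aren't taken into account.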

articles/ai-services/translator/custom-translator/azure-ai-foundry/concepts/customization.md

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
---
2-
title: Translation Customization - Custom translation
2+
title: Azure AI Foundry custom translations
33
titleSuffix: Azure AI services
44
description: Build your own machine translation system using your preferred terminology and style.
55
#services: cognitive-services
@@ -11,9 +11,9 @@ ms.date: 05/19/2025
1111
ms.author: lajanuar
1212
---
1313

14-
# Customize your text translations
14+
# Azure AI Foundry custom translations
1515

16-
Custom translation is a feature of the Azure AI Translator service. Custom translation allows users to customize Azure AI Translator's advanced neural machine translation when translating text using Translator (version 3 only).
16+
Custom translation is a feature of the Azure AI Translator service. Azure AI Foundry custom translation allows users to customize Azure AI Translator's advanced neural machine translation when translating text using Translator (version 3 only).
1717

1818
The feature can also be used to customize speech translation when used with [Azure AI Speech](../../../../speech-service/index.yml).
1919

@@ -36,4 +36,4 @@ More details about the various levels of customization based on available data c
3636
## Next steps
3737

3838
> [!div class="nextstepaction"]
39-
> [Set up a customized language system using Custom translation](../overview.md)
39+
> [Set up a customized language system using custom translation](../overview.md)

articles/ai-services/translator/custom-translator/azure-ai-foundry/concepts/data-filtering.md

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,24 +1,24 @@
11
---
2-
title: "Data Filtering - Custom translation"
2+
title: Azure AI Foundry custom translation data filtering
33
titleSuffix: Azure AI services
4-
description: Explaining how training documents for a custom system undergo a series of processing and filtering steps.
4+
description: How custom translation training documents for a custom system undergo a series of processing and filtering steps.
55
author: laujan
66
manager: nitinme
77
ms.service: azure-ai-translator
88
ms.date: 05/19/2025
99
ms.author: lajanuar
1010
ms.topic: conceptual
1111
ms.custom: cogserv-non-critical-translator
12-
#Customer intent: As a Custom translation, I want to understand how data is filtered before training a model.
12+
#Customer intent: As a custom translation, I want to understand how data is filtered before training a model.
1313
---
1414

15-
# Custom translation Data filtering
15+
# Azure AI Foundry custom translation data filtering
1616

17-
When you submit documents to be used for training, the documents undergo a series of processing and filtering steps. These steps are explained here. The knowledge of the filtering can help you understand the sentence count displayed in Custom translation and the steps you can take yourself to prepare the documents for training with Custom translation.
17+
When you submit documents to be used for training, the documents undergo a series of processing and filtering steps. These steps are explained here. The knowledge of the filtering can help you understand the sentence count displayed in custom translation and the steps you can take yourself to prepare the documents for training with custom translation.
1818

1919
## Sentence alignment
2020

21-
If your document isn't in XLIFF, `TMX`, or ALIGN format, Custom translation aligns the sentences of your source and target documents to each other, sentence by sentence. Custom translation doesn't perform document alignment – it follows your naming of the documents to find the matching document of the other language. Within the document, Custom translation tries to find the corresponding sentence in the other language. It uses document markup like embedded HTML tags to help with the alignment.
21+
If your document isn't in XLIFF, `TMX`, or ALIGN format, custom translation aligns the sentences of your source and target documents to each other, sentence by sentence. Custom translation doesn't perform document alignment – it follows your naming of the documents to find the matching document of the other language. Within the document, custom translation tries to find the corresponding sentence in the other language. It uses document markup like embedded HTML tags to help with the alignment.
2222

2323
If you see a large discrepancy between the number of sentences in the source and target documents, your documents might not be parallel. Document pairs with a large difference (>10%) in the number of sentences on each side warrant a second look to make sure they're indeed parallel. Custom translation shows a warning next to the document if the sentence count differs suspiciously.
2424
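Reviewer note on the sentence-count guidance in this file: the ">10%" rule of thumb can be checked locally before upload. The sketch below is only an illustration; the function names are hypothetical, and measuring the gap relative to the larger of the two counts is an assumption, since the text doesn't say how the service computes the difference.

```python
def sentence_count_gap(source_count, target_count):
    """Relative difference between the two counts, as a fraction of the larger."""
    larger = max(source_count, target_count)
    if larger == 0:
        return 0.0  # two empty documents have no gap to report
    return abs(source_count - target_count) / larger


def needs_second_look(source_count, target_count, threshold=0.10):
    """True when the gap exceeds the 10% rule of thumb from the text."""
    return sentence_count_gap(source_count, target_count) > threshold


flagged = needs_second_look(1000, 850)      # 15% gap
not_flagged = needs_second_look(1000, 950)  # 5% gap
```

A flagged pair isn't necessarily unusable; it's a prompt to open both documents and confirm that one is a translation of the other.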

articles/ai-services/translator/custom-translator/azure-ai-foundry/concepts/dictionaries.md

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
---
2-
title: "Dictionary - Custom translation"
2+
title: Azure AI Foundry custom translation dictionary
33
titleSuffix: Azure AI services
4-
description: How to create an aligned document specifying a list of phrases or sentences (and their translations) that you always want Azure AI Translator to translate in the same manner. Dictionaries can also be called glossaries or term bases.
4+
description: How to create an Azure AI Foundry custom translation dictionary specifying a list of phrases or sentences (and their translations) that you want Azure AI Translator to always translate in the same manner. Dictionaries can also be called glossaries or term bases.
55
author: laujan
66
manager: nitinme
77
ms.service: azure-ai-translator
@@ -12,9 +12,9 @@ ms.custom: cogserv-non-critical-translator
1212
#Customer intent: As a Custom Translator, I want to understand how to use a dictionary to build a custom translation model.
1313
---
1414

15-
# Dictionary
15+
# Azure AI Foundry custom translation dictionary
1616

17-
A custom translation dictionary is an aligned pair of documents that specifies a list of phrases or sentences and their corresponding translations. Use a dictionary in your training, when you want Custom Translator to translate any instances of the source phrase or sentence, using the translation you provide in the dictionary. Dictionaries are sometimes called glossaries or term bases. You can think of the dictionary as a brute force "copy and replace" for all the terms you list. Furthermore, Custom translation service builds and makes use of its own general purpose dictionaries to improve the quality of its translation. However, a customer provided dictionary takes precedent and is searched first to look up words or sentences.
17+
A custom translation dictionary is an aligned pair of documents that specifies a list of phrases or sentences and their corresponding translations. Use a dictionary in your training, when you want Custom Translator to translate any instances of the source phrase or sentence, using the translation you provide in the dictionary. Dictionaries are sometimes called glossaries or term bases. You can think of the dictionary as a brute force "copy and replace" for all the terms you list. Furthermore, custom translation service builds and makes use of its own general purpose dictionaries to improve the quality of its translation. However, a customer provided dictionary takes precedent and is searched first to look up words or sentences.
1818

1919
Dictionaries only work for projects in language pairs that have a fully supported Microsoft general neural network model behind them. [View the complete list of languages](../../../language-support.md).
2020

articles/ai-services/translator/custom-translator/azure-ai-foundry/concepts/document-formats-naming-convention.md

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,18 +1,18 @@
11
---
2-
title: "Document formats and naming conventions - Custom translation"
2+
title: Azure AI Foundry custom translation document formats and naming conventions
33
titleSuffix: Azure AI services
4-
description: This article is a guide to document formats, naming conventions, and how to avoid naming conflicts for Custom translation.
4+
description: This article is a guide to document formats, naming conventions, and how to avoid naming conflicts for Azure AI Foundry custom translation.
55
author: laujan
66
manager: nitinme
77
ms.service: azure-ai-translator
88
ms.date: 05/19/2025
99
ms.author: lajanuar
1010
ms.topic: conceptual
1111
ms.custom: cogserv-non-critical-translator
12-
#Customer intent: As a Custom translation user, I want to understand how to format and name my documents.
12+
#Customer intent: As a custom translation user, I want to understand how to format and name my documents.
1313
---
1414

15-
# Custom translation document formats and naming convention
15+
# Azure AI Foundry custom translation document formats and naming conventions
1616

1717
Any file used for custom translation must be at least **four** characters in length.
1818

@@ -28,16 +28,16 @@ This table includes all supported file formats that you can use to build your tr
2828
| Adobe Acrobat | `.PDF` | Adobe Acrobat portable document |
2929
| `HTML` | `.HTML`, `.HTM` | HyperText Markup Language document |
3030
| Text file | `.TXT` | UTF-16 or UTF-8 encoded text files. The file name must not contain Japanese characters. |
31-
| Aligned text file | `.ALIGN` | The extension `.ALIGN` is a special extension that you can use if you know that the sentences in the document pair are perfectly aligned. If you provide a `.ALIGN` file, Custom translation doesn't align the sentences for you. |
31+
| Aligned text file | `.ALIGN` | The extension `.ALIGN` is a special extension that you can use if you know that the sentences in the document pair are perfectly aligned. If you provide a `.ALIGN` file, custom translation doesn't align the sentences for you. |
3232
| Excel file | `.XLSX` | Excel file (2013 or later). First line/ row of the spreadsheet should be language code. |
3333

3434
## Dictionary formats
3535

36-
For dictionaries, Custom translation supports all file formats that are supported for training sets. If you're using an Excel dictionary, the first line/ row of the spreadsheet should be language codes.
36+
For dictionaries, custom translation supports all file formats that are supported for training sets. If you're using an Excel dictionary, the first line/ row of the spreadsheet should be language codes.
3737

3838
## ZIP file formats
3939

40-
Documents can be grouped into a single zip file and uploaded. The Custom translation supports zip file formats (`ZIP`, `GZ`, and `TGZ`).
40+
Documents can be grouped into a single zip file and uploaded. The custom translation supports zip file formats (`ZIP`, `GZ`, and `TGZ`).
4141

4242
Each document in the zip file with the extension TXT, HTML, HTM, PDF, DOCX, ALIGN must follow this naming convention:
4343

@@ -52,4 +52,4 @@ Translation Memory files (`TMX`, `XLF`, `XLIFF`, `LCL`, `XLSX`) aren't required
5252
## Next steps
5353

5454
> [!div class="nextstepaction"]
55-
> [Learn about managing Custom translation projects](workspace-and-project.md#what-is-a-custom-translation-project)
55+
> [Learn about managing custom translation projects](workspace-and-project.md#what-is-an-azure-ai-foundry-custom-translation-project)
