Commit 57a4f5f

authored
Merge pull request #189668 from laujan/patch-56
Update overview.md
2 parents 61367e9 + 981b8a8 commit 57a4f5f

2 files changed: 10 additions and 8 deletions

articles/cognitive-services/Translator/custom-translator/overview.md

Lines changed: 2 additions & 2 deletions
@@ -6,7 +6,7 @@ author: laujan
 manager: nitinme
 ms.service: cognitive-services
 ms.subservice: translator-text
-ms.date: 12/09/2019
+ms.date: 02/25/2022
 ms.author: lajanuar
 ms.topic: overview
 #Customer intent: As a custom translator user, I want to understand what is Custom Translator, so that I can start using it.
@@ -17,7 +17,7 @@ Custom Translator is a feature of the Microsoft Translator service, which enable
 
 Translation systems built with [Custom Translator](https://portal.customtranslator.azure.ai) are available through the same cloud-based, secure, high performance, highly scalable Microsoft Translator [Text API V3](../reference/v3-0-translate.md?tabs=curl), that powers billions of translations every day.
 
-Custom Translator supports more than three dozen languages, and maps directly to the languages available for NMT. For a complete list, see [Microsoft Translator Languages](../language-support.md).
+The platform enables users to build and publish custom translation systems to and from English. Custom Translator supports more than three dozen languages that map directly to the languages available for NMT. For a complete list, *see* [Translator language support](../language-support.md).
 
 This documentation contains the following article types:
 

articles/cognitive-services/Translator/custom-translator/v2-preview/beginners-guide.md

Lines changed: 8 additions & 6 deletions
@@ -6,19 +6,21 @@ author: laujan
 manager: nitinme
 ms.service: cognitive-services
 ms.subservice: translator-text
-ms.date: 01/20/2022
+ms.date: 02/25/2022
 ms.author: moelghaz
 ms.topic: overview
 ---
 # Custom Translator for beginners | Preview
 
-[Custom Translator](../overview.md) enables you to a build translation system that reflects your business, industry, and domain-specific terminology and style. Training and deploying a custom system is easy and does not require any programming skills. The customized translation system seamlessly integrates into your existing applications, workflows, and websites and is available on Azure through the same cloud-based [Microsoft Text Translator API](../../reference/v3-0-translate.md?tabs=curl) service that powers billions of translations every day.
+[Custom Translator](../overview.md) enables you to a build translation system that reflects your business, industry, and domain-specific terminology and style. Training and deploying a custom system is easy and doesn't require any programming skills. The customized translation system seamlessly integrates into your existing applications, workflows, and websites and is available on Azure through the same cloud-based [Microsoft Text Translator API](../../reference/v3-0-translate.md?tabs=curl) service that powers billions of translations every day.
+
+The platform enables users to build and publish custom translation systems to and from English. The Custom Translator supports more than three dozen languages that map directly to the languages available for NMT. For a complete list, *see* [Translator language support](../../language-support.md).
 
 ## Is a custom translation model the right choice for me?
 
 A well-trained custom translation model provides more accurate domain-specific translations. This is because it relies on previously translated in-domain documents to learn preferred translations. Translator uses these terms and phrases in context to produce fluent translations in the target language while respecting context-dependent grammar.
 
-Training a full custom translation model requires a substantial amount of data. If you do not have at least 10,000 sentences of previously trained documents, you will not be able to train a full-language translation model. However, you can either train a dictionary-only model or use the high-quality, out-of-the-box translations available with the Text Translator API.
+Training a full custom translation model requires a substantial amount of data. If you don't have at least 10,000 sentences of previously trained documents, you won't be able to train a full-language translation model. However, you can either train a dictionary-only model or use the high-quality, out-of-the-box translations available with the Text Translator API.
 
 :::image type="content" source="media/how-to/for-beginners.png" alt-text="Screenshot illustrating the difference between custom and general models.":::

@@ -65,7 +67,7 @@ Finding in-domain quality data is often a challenging task that varies based on
 | Bilingual training documents | Teaches the system your terminology and style. | **Be liberal**. Any in-domain human translation is better than machine translation. Add and remove documents as you go and try to improve the [BLEU score](../what-is-bleu-score.md?WT.mc_id=aiml-43548-heboelma). |
 | Tuning documents | Trains the Neural Machine Translation parameters. | **Be strict**. Compose them to be optimally representative of what you are going to translation in the future. |
 | Test documents | Calculate the [BLEU score](../what-is-bleu-score.md?WT.mc_id=aiml-43548-heboelma).| **Be strict**. Compose test documents to be optimally representative of what you plan to translate in the future. |
-| Phrase dictionary | Forces the given translation 100% of the time. | **Be restrictive**. A phrase dictionary is case-sensitive and any word or phrase listed is translated in the way you specify. In many cases, it is better to not use a phrase dictionary and let the system learn. |
+| Phrase dictionary | Forces the given translation 100% of the time. | **Be restrictive**. A phrase dictionary is case-sensitive and any word or phrase listed is translated in the way you specify. In many cases, it's better to not use a phrase dictionary and let the system learn. |
 | Sentence dictionary | Forces the given translation 100% of the time. | **Be strict**. A sentence dictionary is case-insensitive and good for common in domain short sentences. For a sentence dictionary match to occur, the entire submitted sentence must match the source dictionary entry. If only a portion of the sentence matches, the entry won't match. |
 
 ## What is a BLEU score?
@@ -99,7 +101,7 @@ When you submit documents for training a custom translation system, the document
 
 * ### Extracting tuning and testing data
 
-Tuning and testing data is optional. If you don't provide it, the system will remove an appropriate percentage from your training documents to use for tuning and testing. The removal happens dynamically as part of the training process. Since this step occurs as part of training, your uploaded documents are not affected. You can see the final used sentence counts for each category of data—training, tuning, testing, and dictionary—on the Model details page after training has succeeded.
+Tuning and testing data is optional. If you don't provide it, the system will remove an appropriate percentage from your training documents to use for tuning and testing. The removal happens dynamically as part of the training process. Since this step occurs as part of training, your uploaded documents aren't affected. You can see the final used sentence counts for each category of data—training, tuning, testing, and dictionary—on the Model details page after training has succeeded.
 
 * ### Length filter
 
@@ -140,7 +142,7 @@ When you submit documents for training a custom translation system, the document
 * Remove sentences with invalid encoding.
 * Remove Unicode control characters.
 * If feasible, align sentences (source-to-target).
-* Remove source and target sentences that do not match the source and target languages.
+* Remove source and target sentences that don't match the source and target languages.
 * When source and target sentences have mixed languages, ensure that untranslated words are intentional, for example, names of organizations and products.
 * Correct grammatical and typographical errors to prevent teaching these errors to your model.
 * Though our training process handles source and target lines containing multiple sentences, it's better to have one source sentence mapped to one target sentence.
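
Note on the cleanup list in the last hunk: the first two bullets (invalid encoding, Unicode control characters) lend themselves to a quick local pre-check before upload. The following is a minimal Python sketch and not part of this commit or of Custom Translator itself. The function names (`decode_or_none`, `strip_control_chars`, `clean_pairs`) and the input shape (pairs of raw UTF-8 byte strings) are assumptions made for illustration; the service applies its own filtering during training.

```python
import unicodedata
from typing import Iterable, List, Optional, Tuple


def decode_or_none(raw: bytes) -> Optional[str]:
    """Return the UTF-8 decoded text, or None if the bytes are not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


def strip_control_chars(text: str) -> str:
    """Drop characters in Unicode category Cc (includes NUL, tab, and newline)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cc")


def clean_pairs(pairs: Iterable[Tuple[bytes, bytes]]) -> List[Tuple[str, str]]:
    """Keep only (source, target) pairs that decode cleanly and are non-empty.

    Hypothetical helper: a pair is dropped if either side has invalid encoding
    or is empty after control characters and surrounding whitespace are removed.
    """
    cleaned = []
    for src_raw, tgt_raw in pairs:
        src, tgt = decode_or_none(src_raw), decode_or_none(tgt_raw)
        if src is None or tgt is None:
            continue  # invalid encoding on either side
        src = strip_control_chars(src).strip()
        tgt = strip_control_chars(tgt).strip()
        if src and tgt:
            cleaned.append((src, tgt))
    return cleaned


sample = [
    (b"Hello\x00 world", b"Hallo Welt\n"),  # control characters on both sides
    (b"\xff\xfe", b"x"),                    # source is not valid UTF-8
    (b"  ", b"y"),                          # source is empty after stripping
]
result = clean_pairs(sample)
```

Dropping the whole pair, rather than only the bad side, keeps source and target aligned one-to-one, which matches the last bullet's advice. Language mismatch and grammar checks are left out because they need a language-identification step that this sketch does not attempt.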
