articles/cognitive-services/Translator/custom-translator/overview.md
2 additions & 2 deletions
Original file line number
Diff line number
Diff line change
@@ -6,7 +6,7 @@ author: laujan
6
6
manager: nitinme
7
7
ms.service: cognitive-services
8
8
ms.subservice: translator-text
9
-
ms.date: 12/09/2019
9
+
ms.date: 02/25/2022
10
10
ms.author: lajanuar
11
11
ms.topic: overview
12
12
#Customer intent: As a custom translator user, I want to understand what is Custom Translator, so that I can start using it.
@@ -17,7 +17,7 @@ Custom Translator is a feature of the Microsoft Translator service, which enable
17
17
18
18
Translation systems built with [Custom Translator](https://portal.customtranslator.azure.ai) are available through the same cloud-based, secure, high performance, highly scalable Microsoft Translator [Text API V3](../reference/v3-0-translate.md?tabs=curl), that powers billions of translations every day.
19
19
20
-
Custom Translator supports more than three dozen languages, and maps directly to the languages available for NMT. For a complete list, see [Microsoft Translator Languages](../language-support.md).
20
+
The platform enables users to build and publish custom translation systems to and from English. Custom Translator supports more than three dozen languages that map directly to the languages available for NMT. For a complete list, *see* [Translator language support](../language-support.md).
21
21
22
22
This documentation contains the following article types:
articles/cognitive-services/Translator/custom-translator/v2-preview/beginners-guide.md
8 additions & 6 deletions
Original file line number
Diff line number
Diff line change
@@ -6,19 +6,21 @@ author: laujan
6
6
manager: nitinme
7
7
ms.service: cognitive-services
8
8
ms.subservice: translator-text
9
-
ms.date: 01/20/2022
9
+
ms.date: 02/25/2022
10
10
ms.author: moelghaz
11
11
ms.topic: overview
12
12
---
13
13
# Custom Translator for beginners | Preview
14
14
15
-
[Custom Translator](../overview.md) enables you to build a translation system that reflects your business, industry, and domain-specific terminology and style. Training and deploying a custom system is easy and does not require any programming skills. The customized translation system seamlessly integrates into your existing applications, workflows, and websites and is available on Azure through the same cloud-based [Microsoft Text Translator API](../../reference/v3-0-translate.md?tabs=curl) service that powers billions of translations every day.
15
+
[Custom Translator](../overview.md) enables you to build a translation system that reflects your business, industry, and domain-specific terminology and style. Training and deploying a custom system is easy and doesn't require any programming skills. The customized translation system seamlessly integrates into your existing applications, workflows, and websites and is available on Azure through the same cloud-based [Microsoft Text Translator API](../../reference/v3-0-translate.md?tabs=curl) service that powers billions of translations every day.
16
+
17
+
The platform enables users to build and publish custom translation systems to and from English. The Custom Translator supports more than three dozen languages that map directly to the languages available for NMT. For a complete list, *see* [Translator language support](../../language-support.md).
16
18
17
19
## Is a custom translation model the right choice for me?
18
20
19
21
A well-trained custom translation model provides more accurate domain-specific translations. This is because it relies on previously translated in-domain documents to learn preferred translations. Translator uses these terms and phrases in context to produce fluent translations in the target language while respecting context-dependent grammar.
20
22
21
-
Training a full custom translation model requires a substantial amount of data. If you do not have at least 10,000 sentences of previously trained documents, you will not be able to train a full-language translation model. However, you can either train a dictionary-only model or use the high-quality, out-of-the-box translations available with the Text Translator API.
23
+
Training a full custom translation model requires a substantial amount of data. If you don't have at least 10,000 sentences of previously trained documents, you won't be able to train a full-language translation model. However, you can either train a dictionary-only model or use the high-quality, out-of-the-box translations available with the Text Translator API.
22
24
23
25
:::image type="content" source="media/how-to/for-beginners.png" alt-text="Screenshot illustrating the difference between custom and general models.":::
24
26
@@ -65,7 +67,7 @@ Finding in-domain quality data is often a challenging task that varies based on
65
67
| Bilingual training documents | Teaches the system your terminology and style. |**Be liberal**. Any in-domain human translation is better than machine translation. Add and remove documents as you go and try to improve the [BLEU score](../what-is-bleu-score.md?WT.mc_id=aiml-43548-heboelma). |
66
68
| Tuning documents | Trains the Neural Machine Translation parameters. |**Be strict**. Compose them to be optimally representative of what you are going to translate in the future. |
67
69
| Test documents | Calculate the [BLEU score](../what-is-bleu-score.md?WT.mc_id=aiml-43548-heboelma).|**Be strict**. Compose test documents to be optimally representative of what you plan to translate in the future. |
68
-
| Phrase dictionary | Forces the given translation 100% of the time. |**Be restrictive**. A phrase dictionary is case-sensitive and any word or phrase listed is translated in the way you specify. In many cases, it is better to not use a phrase dictionary and let the system learn. |
70
+
| Phrase dictionary | Forces the given translation 100% of the time. |**Be restrictive**. A phrase dictionary is case-sensitive and any word or phrase listed is translated in the way you specify. In many cases, it's better to not use a phrase dictionary and let the system learn. |
69
71
| Sentence dictionary | Forces the given translation 100% of the time. |**Be strict**. A sentence dictionary is case-insensitive and good for common in-domain short sentences. For a sentence dictionary match to occur, the entire submitted sentence must match the source dictionary entry. If only a portion of the sentence matches, the entry won't match. |
70
72
71
73
## What is a BLEU score?
@@ -99,7 +101,7 @@ When you submit documents for training a custom translation system, the document
99
101
100
102
*### Extracting tuning and testing data
101
103
102
-
Tuning and testing data is optional. If you don't provide it, the system will remove an appropriate percentage from your training documents to use for tuning and testing. The removal happens dynamically as part of the training process. Since this step occurs as part of training, your uploaded documents are not affected. You can see the final used sentence counts for each category of data—training, tuning, testing, and dictionary—on the Model details page after training has succeeded.
104
+
Tuning and testing data is optional. If you don't provide it, the system will remove an appropriate percentage from your training documents to use for tuning and testing. The removal happens dynamically as part of the training process. Since this step occurs as part of training, your uploaded documents aren't affected. You can see the final used sentence counts for each category of data—training, tuning, testing, and dictionary—on the Model details page after training has succeeded.
103
105
104
106
*### Length filter
105
107
@@ -140,7 +142,7 @@ When you submit documents for training a custom translation system, the document
140
142
* Remove sentences with invalid encoding.
141
143
* Remove Unicode control characters.
142
144
* If feasible, align sentences (source-to-target).
143
-
* Remove source and target sentences that do not match the source and target languages.
145
+
* Remove source and target sentences that don't match the source and target languages.
144
146
* When source and target sentences have mixed languages, ensure that untranslated words are intentional, for example, names of organizations and products.
145
147
* Correct grammatical and typographical errors to prevent teaching these errors to your model.
146
148
* Though our training process handles source and target lines containing multiple sentences, it's better to have one source sentence mapped to one target sentence.
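
A few of the cleaning steps listed above (dropping sentences with invalid encoding, removing Unicode control characters, and keeping one aligned source/target pair per entry) can be sketched locally before upload. This is a minimal illustration in Python, not Custom Translator's own pipeline: the function names `strip_control_chars` and `clean_pairs` are hypothetical, the input is assumed to be raw UTF-8 bytes, and control characters are assumed to mean Unicode category `Cc`.

```python
import unicodedata


def strip_control_chars(text: str) -> str:
    """Replace Unicode control characters (category Cc) with spaces,
    then collapse runs of whitespace into single spaces."""
    spaced = "".join(
        " " if unicodedata.category(ch) == "Cc" else ch for ch in text
    )
    return " ".join(spaced.split())


def clean_pairs(raw_pairs):
    """Yield cleaned (source, target) sentence pairs.

    raw_pairs: iterable of (bytes, bytes) tuples, one aligned pair each.
    A pair is dropped if either side is not valid UTF-8 or is empty
    after cleaning, so source and target stay aligned.
    """
    for src_bytes, tgt_bytes in raw_pairs:
        try:
            src = src_bytes.decode("utf-8")
            tgt = tgt_bytes.decode("utf-8")
        except UnicodeDecodeError:
            continue  # invalid encoding: drop the whole pair
        src = strip_control_chars(src)
        tgt = strip_control_chars(tgt)
        if src and tgt:
            yield src, tgt


if __name__ == "__main__":
    sample = [
        (b"Hello\x00 world", "Hallo Welt".encode("utf-8")),
        (b"\xff\xfe bad bytes", b"ok"),
    ]
    for source, target in clean_pairs(sample):
        print(source, "|", target)
```

Language identification and grammar correction are left out of this sketch because they need external tools or human review.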