`articles/ai-services/computer-vision/includes/identity-sensitive-attributes.md`
> [!CAUTION]
> Microsoft has retired or limited facial recognition capabilities that can be used to try to infer emotional states and identity attributes that, if misused, can subject people to stereotyping, discrimination, or unfair denial of services. The retired capabilities are emotion and gender. The limited capabilities are age, smile, facial hair, hair, and makeup. Email [Azure Face API](mailto:[email protected]) if you have a responsible use case that would benefit from the use of any of the limited capabilities. Read more about this decision in the [Responsible AI investments and safeguards for facial recognition](https://azure.microsoft.com/blog/responsible-ai-investments-and-safeguards-for-facial-recognition/) blog post.
description: Learn about how to make use of multilingual projects in conversational language understanding.
#services: cognitive-services
author: jboback
manager: nitinme
# Multilingual projects
Conversational language understanding makes it easy for you to extend your project to several languages at once. When you enable multiple languages in projects, you can add language-specific utterances and synonyms to your project. You can get multilingual predictions for your intents and entities.
## Multilingual intent and learned entity components
When you enable multiple languages in a project, you can train the project primarily in one language and immediately get predictions in other languages.
For example, you can train your project entirely with English utterances and query it in French, German, Mandarin, Japanese, Korean, and others. Conversational language understanding makes it easy for you to scale your projects to multiple languages by using multilingual technology to train your models.
Whenever you identify that a particular language isn't performing as well as other languages, you can add utterances for that language in your project. In the [tag utterances](../how-to/tag-utterances.md) page in Language Studio, you can select the language of the utterance you're adding. When you introduce examples for that language to the model, it's introduced to more of the syntax of that language and learns to predict it better.
You aren't expected to add the same number of utterances for every language. You should build most of your project in one language and only add a few utterances in languages that you observe aren't performing well. If you create a project that's primarily in English and start testing it in French, German, and Spanish, you might observe that German doesn't perform as well as the other two languages. In that case, consider adding 5% of your original English examples in German, train a new model, and test in German again. You should see better results for German queries. The more utterances you add, the more likely the results are to improve.
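As a concrete sketch of the 5% suggestion above, you could randomly sample the English utterances to re-author in German. The utterance list, the seed, and the helper name here are hypothetical, not part of any Language service SDK:

```python
import random

def sample_for_translation(utterances, fraction=0.05, seed=42):
    """Randomly pick a fraction of utterances to re-author in another language."""
    k = max(1, round(len(utterances) * fraction))  # always pick at least one
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(utterances, k)

# Hypothetical English training utterances
english_utterances = [f"utterance {i}" for i in range(100)]
german_candidates = sample_for_translation(english_utterances)
print(len(german_candidates))  # prints 5
```

You would then translate or re-author the sampled utterances in German and add them to the project before retraining.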
When you add data in another language, you shouldn't expect it to negatively affect other languages.
## List and prebuilt components in multiple languages
Projects with multiple languages enabled allow you to specify synonyms *per language* for every list key. Depending on the language you query your project with, you only get matches for the list component with synonyms of that language. When you query your project, you can specify the language in the request body:
```json
{
  "query": "{query}",
  "language": "{language code}"
}
```
If you don't provide a language, it falls back to the default language of your project. For a list of different language codes, see [Language support](../language-support.md).
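As a minimal sketch of that fallback behavior, the helper below builds the request body shown above and substitutes a default language code when none is given. The helper and the default value are hypothetical; the real service applies the fallback server-side:

```python
import json

def build_query_body(query, language=None, default_language="en"):
    """Build the request body, falling back to a default language code."""
    return json.dumps({
        "query": query,
        # The service falls back to the project's default language when no
        # language is sent; this client-side default just illustrates that.
        "language": language or default_language,
    })

print(build_query_body("Book a flight to Seattle", language="fr"))
print(build_query_body("Book a flight to Seattle"))  # "language" is "en"
```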
Prebuilt components are similar, in that you should expect to get predictions only for prebuilt components that are available in specific languages. The request's language again determines which components the service attempts to predict. For information on the language support of each prebuilt component, see [Supported prebuilt entity components](../prebuilt-component-reference.md).
`articles/ai-services/language-service/conversational-language-understanding/includes/balance-training-data.md`
## Balance training data
When it comes to training data, try to keep your schema well balanced. Including large quantities of one intent and very few of another results in a model that's biased toward particular intents.
To address this scenario, you might need to downsample your training set or add to it. To downsample, you can:
* Randomly remove a certain percentage of the training data.
* Analyze the dataset and remove overrepresented duplicate entries, which is a more systematic approach.
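The two options above can be sketched like this. The sample data and the fraction are placeholders; in practice, you'd run something similar over your exported training utterances:

```python
import random

def downsample_random(utterances, keep_fraction=0.8, seed=0):
    """Option 1: randomly drop a percentage of the training data."""
    rng = random.Random(seed)
    k = round(len(utterances) * keep_fraction)
    return rng.sample(utterances, k)

def downsample_duplicates(utterances):
    """Option 2: remove overrepresented duplicate entries, keeping first occurrences."""
    seen = set()
    deduped = []
    for u in utterances:
        key = u.strip().lower()  # treat casing/whitespace variants as duplicates
        if key not in seen:
            seen.add(key)
            deduped.append(u)
    return deduped

data = ["book a flight", "Book a flight", "cancel my booking", "book a flight"]
print(downsample_duplicates(data))  # prints ['book a flight', 'cancel my booking']
```

Note that the duplicate check here intentionally ignores casing; if casing diversity matters for your model, compare exact strings instead.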
To add to the training set, in Language Studio, on the **Data labeling** tab, select **Suggest utterances**. Conversational Language Understanding sends a call to [Azure OpenAI](../../../openai/overview.md) to generate similar utterances.
:::image type="content" source="../media/suggest-utterances.png" alt-text="Screenshot that shows an utterance suggestion in Language Studio." lightbox="../media/suggest-utterances.png":::
You should also look for unintended "patterns" in the training set. For example, look to see if the training set for a particular intent is all lowercase or starts with a particular phrase. In such cases, the model you train might learn these unintended biases in the training set instead of being able to generalize.
We recommend that you introduce casing and punctuation diversity in the training set. If your model is expected to handle variations, be sure to have a training set that also reflects that diversity. For example, include some utterances in proper casing and some in all lowercase.
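As one way to create that diversity, the helper below generates simple casing variants of an utterance. It's an illustration only, not a feature of Language Studio or any SDK:

```python
def casing_variants(utterance):
    """Return distinct casing variants of an utterance for training diversity."""
    variants = {
        utterance,               # original casing
        utterance.lower(),       # all lowercase
        utterance.capitalize(),  # first letter upper, rest lower
    }
    return sorted(variants)

for variant in casing_variants("Book a flight to Seattle"):
    print(variant)
```

Adding a few such variants per intent, rather than transforming every utterance, is usually enough to keep the model from learning casing as a signal.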
`articles/ai-services/language-service/conversational-language-understanding/includes/label-data-best-practices.md`
## Clearly label utterances
* Ensure that the concepts that your entities refer to are well defined and separable. Check if you can easily determine the differences reliably. If you can't, this lack of distinction might indicate that the learned component will also have difficulty.
* If there's a similarity between entities, ensure that there's some aspect of your data that provides a signal for the difference between them.
For example, if you built a model to book flights, a user might use an utterance like "I want a flight from Boston to Seattle." The *origin city* and *destination city* for such utterances would be expected to be similar. A signal to differentiate *origin city* might be that the word *from* often precedes it.
* Ensure that you label all instances of each entity in both your training and testing data. One approach is to use the search function to find all instances of a word or phrase in your data to check if they're correctly labeled.
* Label test data for entities that have no [learned component](../concepts/entity-components.md#learned-component) and also for the entities that do. This practice helps to ensure that your evaluation metrics are accurate.