`articles/cognitive-services/LUIS/luis-concept-best-practices.md` (32 additions, 2 deletions)
title: Best practices for building your LUIS app
description: Learn the best practices to get the best results from your LUIS app's model.
ms.topic: conceptual
ms.date: 05/17/2020
ms.author: diberry
---
# Best practices for building a language understanding (LUIS) app
The following list includes best practices for LUIS apps:
|Do|Don't|
|--|--|
|[Plan your schema](#do-plan-your-schema)|[Build and publish without a plan](#dont-publish-too-quickly)|
|[Define distinct intents](#do-define-distinct-intents)<br>[Add features to intents](#do-add-features-to-intents)<br>[Use machine learned entities](#do-use-machine-learned-entities)|[Add many example utterances to intents](#dont-add-many-example-utterances-to-intents)<br>[Use few or simple entities](#dont-use-few-or-simple-entities)|
|[Find a sweet spot between too generic and too specific for each intent](#do-find-sweet-spot-for-intents)|[Use LUIS as a training platform](#dont-use-luis-as-a-training-platform)|
|[Build your app iteratively with versions](#do-build-your-app-iteratively-with-versions)<br>[Build entities for model decomposition](#do-build-for-model-decomposition)|[Add many example utterances of the same format, ignoring other formats](#dont-add-many-example-utterances-of-the-same-format-ignoring-other-formats)|
|[Add patterns in later iterations](#do-add-patterns-in-later-iterations)|[Mix the definition of intents and entities](#dont-mix-the-definition-of-intents-and-entities)|
|[Balance your utterances across all intents](#balance-your-utterances-across-all-intents) except the None intent.<br>[Add example utterances to None intent](#do-add-example-utterances-to-none-intent)|[Create phrase lists with all possible values](#dont-create-phrase-lists-with-all-the-possible-values)|
|[Leverage the suggest feature for active learning](#do-leverage-the-suggest-feature-for-active-learning)|[Add too many patterns](#dont-add-many-patterns)|
|[Monitor the performance of your app with batch testing](#do-monitor-the-performance-of-your-app)|[Train and publish with every single example utterance added](#dont-train-and-publish-with-every-single-example-utterance)|
## Do plan your schema

Before you start building your app's schema, identify what you plan to use this app for and where. The more thorough and specific your planning, the better your app becomes.

* Research targeted users.
* Define end-to-end personas to represent your app: voice, avatar, issue handling (proactive, reactive).
* Identify user interactions (text, speech), which channels they come through, and whether to hand off to existing solutions or create a new solution for this app.
* Map the end-to-end user journey.
* What should you expect this app to do and not do? What are the priorities of what it should do?
* What are the main use cases?
* Collect data - [learn](data-collection.md) about collecting and preparing data.

## Do define distinct intents
Make sure the vocabulary for each intent is just for that intent and not overlapping with a different intent. For example, if you want to have an app that handles travel arrangements such as airline flights and hotels, you can choose to have these subject areas as separate intents or the same intent with entities for specific data inside the utterance.
Features describe concepts for an intent. A feature can be a phrase list of words…
## Do find sweet spot for intents
Use prediction data from LUIS to determine whether your intents overlap. Overlapping intents confuse LUIS: the top-scoring intent ends up too close to another intent. Because LUIS does not take the exact same path through the data for each training run, an overlapping intent has a chance of ranking first or second across trainings. You want the utterance's score for each intent to be farther apart so this flip-flop doesn't happen. Good distinction between intents should produce the expected top intent every time.
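One practical way to check for overlap is to compare the gap between the top two intent scores across your test utterances. The sketch below assumes a trimmed, hypothetical prediction result (the real LUIS response JSON has more fields); `find_overlapping_intents` and the gap threshold are illustrative, not part of the LUIS API.

```python
# Flag a prediction whose top two intent scores are too close together,
# a symptom of overlapping intents. The response shape below is a
# hypothetical, trimmed version of a prediction result.

def find_overlapping_intents(prediction, min_gap=0.15):
    """Return the top two intents and whether their scores overlap."""
    ranked = sorted(prediction["intents"].items(),
                    key=lambda kv: kv[1]["score"], reverse=True)
    (top, top_info), (runner_up, runner_info) = ranked[0], ranked[1]
    gap = top_info["score"] - runner_info["score"]
    return {
        "topIntent": top,
        "runnerUp": runner_up,
        "gap": round(gap, 3),
        "overlapping": gap < min_gap,
    }

sample = {
    "intents": {
        "BookFlight": {"score": 0.61},
        "BookHotel": {"score": 0.58},   # too close to the top intent
        "None": {"score": 0.02},
    }
}

print(find_overlapping_intents(sample))
```

Intents that repeatedly show a small gap across many utterances are candidates for merging, or for clearer, non-overlapping example utterances.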
## Do use machine learned entities
Machine learned entities are tailored to your app and require labeling to be successful. If you are not using machine learned entities, you might be using the wrong tool.
Machine learned entities can use other entities as features. These other entities can be custom entities such as regular expression entities or list entities, or you can use prebuilt entities as features.
Learn about [effective machine learned entities](luis-concept-entity-types.md#effective-machine-learned-entities).
<a name="do-build-the-app-iteratively"></a>
## Do build your app iteratively with versions
Monitor the prediction accuracy using a [batch test](luis-concept-batch-test.md).
Keep a separate set of utterances that aren't used as [example utterances](luis-concept-utterance.md) or endpoint utterances. Keep improving the app as measured against your test set. Adapt the test set to reflect real user utterances. Use this test set to evaluate each iteration or version of the app.
## Don't publish too quickly
Publishing your app too quickly, without [proper planning](#do-plan-your-schema), may lead to several issues such as:
* Your app will not work in your actual scenario at an acceptable level of performance.
* The schema (intents and entities) might not be appropriate, and if you have developed client app logic that follows the schema, you may need to rewrite it from scratch, causing unexpected delays and extra cost for the project.
* Utterances you add to the model might bias it toward the example utterance set in ways that are hard to debug and identify. They also make removing ambiguity difficult after you have committed to a certain schema.
## Don't add many example utterances to intents
After the app is published, only add utterances from active learning in the development lifecycle process. If utterances are too similar, add a pattern.
`articles/cognitive-services/LUIS/luis-concept-entity-types.md` (19 additions, 2 deletions)
title: Entity types - LUIS
description: An entity extracts data from a user utterance at prediction runtime. An _optional_, secondary purpose is to boost the prediction of the intent or other entities by using the entity as a feature.
ms.topic: conceptual
ms.date: 05/17/2020
---
# Extract data with entities
An entity extracts data from a user utterance at prediction runtime. An _optional_, secondary purpose is to boost the prediction of the intent or other entities by using the entity as a feature.
* [Machine-learned entity](reference-entity-machine-learned-entity.md) - this is the primary entity. You should design your schema with this entity type before using other entities.
* Non-machine-learned entity used as a required [feature](luis-concept-feature.md) - for exact text matches, pattern matches, or detection by prebuilt entities
* [Pattern.any](#patternany-entity) - to extract free-form text such as book titles from a [Pattern](reference-entity-pattern-any.md)
A machine-learned entity triggers based on the context learned through example utterances.
[**Machine-learned entities**](tutorial-machine-learned-entity.md) are the top-level extractors. Subentities are child entities of machine-learned entities.
## Effective machine learned entities
To build machine learned entities effectively:
* Label consistently across the intents, including utterances you provide in the **None** intent that contain this entity. Otherwise the model will not be able to determine the sequences effectively.
* If you have a machine learned entity with subentities, make sure that the different orders and variants of the entity and subentities are represented in the labeled utterances. Labeled example utterances should include all valid forms, including utterances where entities appear, are absent, and are reordered within the utterance.
* Avoid overfitting the entities to a very fixed set. **Overfitting** happens when the model doesn't generalize well, and is a common problem in machine-learning models; it means the app would not work adequately on new data. Vary the labeled example utterances so the app can generalize beyond the limited examples you provide, and vary the different subentities with enough change for the model to learn the concept instead of just the examples shown.
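The bullets above can be spot-checked mechanically. The sketch below counts the distinct order-and-presence signatures in a labeled utterance set; the utterances, subentity names, and values are all hypothetical.

```python
# Count the distinct subentity order/presence patterns in a labeled set.
# Few distinct patterns suggests the examples are too uniform and the
# model may overfit; the data below is purely illustrative.

labeled = [
    "book 2 tickets to Cairo for Friday",   # quantity, destination, date
    "for Friday, book a ticket to Cairo",   # date first, no quantity
    "book tickets to Cairo",                # destination only
    "book 2 tickets",                       # quantity only
]

subentity_values = {"quantity": "2", "destination": "Cairo", "date": "Friday"}

def presence_patterns(utterances, values):
    """Return the set of (subentity, ...) order signatures that occur."""
    patterns = set()
    for text in utterances:
        found = [(text.find(v), name) for name, v in values.items() if v in text]
        patterns.add(tuple(name for _, name in sorted(found)))
    return patterns

print(presence_patterns(labeled, subentity_values))
```

More distinct signatures means more variety for the model to learn from; a set where every utterance produces the same signature is a warning sign.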
<a name="composite-entity"></a>
<a name="list-entity"></a>
<a name="patternany-entity"></a>
Choose the entity based on how the data should be extracted and how it should be…
|[**Prebuilt**](luis-reference-prebuilt-entities.md)|Already trained to extract specific kind of data such as URL or email. Some of these prebuilt entities are defined in the open-source [Recognizers-Text](https://github.com/Microsoft/Recognizers-Text) project. If your specific culture or entity isn't currently supported, contribute to the project.|
|[**Regular Expression**](reference-entity-regular-expression.md)|Uses regular expression for **exact text match**.|
## Extraction versus resolution
Entities extract data as it appears in the utterance. Entities do not change or resolve the data, and the entity doesn't tell you whether the extracted text is a valid value for the entity.
There are ways to bring resolution into the extraction, but be aware that this limits the app's ability to stay immune to variations and mistakes.
List entities and regular expression (text-matching) entities can be used as [required features](luis-concept-feature.md#required-features) for a subentity, and that acts as a filter on the extraction. Use this carefully so as not to hinder the app's ability to predict.
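To make the distinction concrete, here is a minimal sketch: the entity returns the raw span exactly as typed, and any resolution to a typed value happens in the client. The utterance, entity name, and alias table are assumptions for illustration.

```python
# Extraction vs. resolution: the entity returns the text as it appears
# (typo included); turning it into a typed value is a separate,
# client-side step. All names and values here are hypothetical.

from datetime import date, timedelta

# What an entity extraction might hand back: the raw span, unvalidated.
extracted = {"entity": "departureDate", "text": "tomorow"}

def resolve_date(text, today=date(2020, 5, 17)):
    """Client-side resolution of a few known phrases to a date."""
    aliases = {"today": 0, "tomorrow": 1, "tomorow": 1}  # tolerate a typo
    days = aliases.get(text.lower())
    return today + timedelta(days=days) if days is not None else None

print(resolve_date(extracted["text"]))  # 2020-05-18
```

Keeping resolution in the client keeps the extraction tolerant of spelling variations while still producing typed values downstream.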
## Extracting contextually related data
An utterance may contain two or more occurrences of an entity where the meaning of the data is based on context within the utterance. An example is an utterance for booking a flight that has two geographical locations, origin and destination.
`articles/cognitive-services/LUIS/luis-concept-feature.md` (13 additions, 1 deletion)
title: Features - LUIS
description: Add features to a language model to provide hints about how to recognize input that you want to label or classify.
ms.topic: conceptual
ms.date: 05/14/2020
---
# Machine-learning (ML) features
For example, if a shipping address entity contained a street address subentity,…
* Country (subentity)
* Postal code (subentity)
## Nested subentities with features
A machine learned subentity indicates a concept is present to the parent entity, whether that parent is another subentity or the top entity. The value of the subentity acts as a feature to its parent.
A subentity can have both a phrase list and a model (another entity) as features.
When the subentity has a phrase list, the phrase list boosts the vocabulary of the concept but doesn't add any information to the JSON response of the prediction.
When the subentity has a feature of another entity, the JSON response includes the extracted data of that other entity.
## Required features
A required feature has to be found in order for the model to be returned from the prediction endpoint. Use a required feature when you know your incoming data must match the feature.
If the utterance text doesn't match the required feature, it will not be extracted.
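The filtering behavior can be sketched as follows: a candidate span only survives when the non-machine-learned match also fires. The flight-number entity and its regular expression are hypothetical examples, not part of any LUIS schema.

```python
# A required feature acts as a filter: candidate spans the model
# proposes are only returned when the exact-match rule also fires.
# The flight-number pattern below is a hypothetical example.

import re

FLIGHT_NUMBER = re.compile(r"\b[A-Z]{2}\d{3,4}\b")

def extract_flight_numbers(candidate_spans):
    """Keep only spans that satisfy the required (regex) feature."""
    return [span for span in candidate_spans if FLIGHT_NUMBER.search(span)]

# Spans a machine-learned model might propose from an utterance:
print(extract_flight_numbers(["MS985", "my flight", "AA1234"]))
# ['MS985', 'AA1234']
```

This is why a required feature should only be used when incoming data is guaranteed to match the rule; anything that doesn't match is silently dropped.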
**A required feature uses a non-machine learned entity**:
`articles/cognitive-services/LUIS/luis-glossary.md` (5 additions, 1 deletion)
A (machine learned) model is a function that makes a prediction on input data.
You add values to your [list](#list-entity) entities. Each of those values can have a list of one or more synonyms. Only the normalized value is returned in the response.
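The synonym-to-normalized-value behavior can be sketched as a simple lookup; the city names and synonyms below are hypothetical.

```python
# List entities match exact text (including synonyms) and return only
# the normalized value. The values and synonyms below are illustrative.

city_list = {
    "Seattle": ["sea-tac", "sea", "seattle"],
    "Cairo": ["cai", "cairo"],
}

# Map every synonym (and the value itself) to its normalized form.
normalize = {alias.lower(): value
             for value, aliases in city_list.items()
             for alias in aliases + [value]}

def match_list_entity(token):
    """Return the normalized value for a matched synonym, else None."""
    return normalize.get(token.lower())

print(match_list_entity("sea-tac"))  # Seattle
```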
## Overfitting
Overfitting happens when the model fixates on the specific examples and is not able to generalize well.
## Owner
Each app has one owner who is the person that created the app. The owner manages permissions to the application in the Azure portal.
LUIS quota is the limitation of the Azure subscription tier. The LUIS quota can…
## Schema
Your schema includes your intents and entities along with the subentities. The schema is initially planned for then iterated over time. The schema doesn't include app settings, features, or example utterances.
## Sentiment Analysis
Sentiment analysis provides positive or negative values of the utterances provided by [Text Analytics](../text-analytics/overview.md).
`articles/cognitive-services/LUIS/luis-how-plan-your-app.md` (25 additions, 1 deletion)
title: Plan your app - LUIS
description: Outline relevant app intents and entities, and then create your application plans in Language Understanding Intelligent Services (LUIS).
ms.topic: conceptual
ms.date: 05/14/2020
---
# Plan your LUIS app schema with subject domain and data extraction
When determining which entities to use in your app, keep in mind that there are…
> [!TIP]
> LUIS offers [prebuilt entities](luis-prebuilt-entities.md) for common, conversational user scenarios. Consider using prebuilt entities as a starting point for your application development.
## Resolution with intent or entity?
In many cases, especially in natural conversation, users provide an utterance that can contain more than one function or intent. To address this, a general rule of thumb is that the output can be represented by both intents and entities. This representation should be mappable to your client application's actions, and it doesn't need to be limited to the intents.
50
+
51
+
**Int-ent-ties** is the concept that actions (usually understood as intents) could also be captured as entities and relied on in this form in the output JSON, where you can map them to a specific action. _Negation_ is a common case that relies on both intent and entity for full extraction.
52
+
53
+
Consider the following two utterances, which are close in word choice but have different results:
|Utterance|
|--|
|`Please schedule my flight from Cairo to Seattle`|
|`Cancel my flight from Cairo to Seattle`|
Instead of having two separate intents, create a single intent with a `FlightAction` machine learning entity. The machine learning entity should extract the details of the action for both a scheduling and a canceling request, as well as the origin and destination locations.
The `FlightAction` entity would be structured in the following pseudo-schema of machine learning entity and subentities:
* FlightAction
  * Action
  * Origin
  * Destination

To help the extraction, add features to the subentities. Choose features based on the vocabulary you expect to see in user utterances and the values you want returned in the prediction response.
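A client app could then read the extracted pieces and dispatch on the `Action` subentity. The dictionary below is a hypothetical, trimmed prediction result, not the exact LUIS response JSON, and `to_client_action` is an illustrative helper.

```python
# Map a FlightAction entity (with Action, Origin, and Destination
# subentities) to a client-side action. The response dict is a
# hypothetical, trimmed prediction result.

prediction = {
    "topIntent": "FlightRequest",
    "entities": {
        "FlightAction": [{
            "Action": ["cancel"],
            "Origin": ["Cairo"],
            "Destination": ["Seattle"],
        }],
    },
}

def to_client_action(prediction):
    """Flatten the first FlightAction match into a dispatchable action."""
    flight = prediction["entities"]["FlightAction"][0]
    return {
        "action": flight["Action"][0],        # e.g. schedule or cancel
        "origin": flight["Origin"][0],
        "destination": flight["Destination"][0],
    }

print(to_client_action(prediction))
```

With this shape, both `Please schedule my flight from Cairo to Seattle` and `Cancel my flight from Cairo to Seattle` resolve through one intent, and the client branches on the extracted action rather than the intent name.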