Commit 03f0d1c

Merge pull request #177327 from magrefaat/patch-54
Update train-model.md
2 parents 6122318 + dfca87a commit 03f0d1c

File tree

1 file changed

+15
-17
lines changed
  • articles/cognitive-services/language-service/custom-classification/how-to

articles/cognitive-services/language-service/custom-classification/how-to/train-model.md

Lines changed: 15 additions & 17 deletions
@@ -28,23 +28,6 @@ Before you train your model you need:
 
 See the [application development lifecycle](../overview.md#application-development-lifecycle) for more information.
 
-## Training a model
-
-The time to train a model varies on the dataset, and may take up to several hours. You can only train one model at a time, and you cannot create or train other models if one is already training in the same project.
-
-
-
-As you train your model, keep in mind:
-
-* [View the model's evaluation details](../how-to/view-model-evaluation.md) After model training, model evaluation is done against the [test set](../how-to/train-model.md#data-splits), which was not introduced to the model during training. By viewing the evaluation, you can get a sense of how the model performs in real-life scenarios.
-
-* [Examine data distribution](../how-to/improve-model.md#examine-data-distribution-from-language-studio) Make sure that all classes are well represented and that you have a balanced data distribution to make sure that all your classes are adequately represented. If a certain class is tagged far less frequent than the others, this class is likely under-represented and most occurrences probably won't be recognized properly by the model at runtime. In this case, consider adding more files that belong to this class to your dataset.
-
-* [Improve performance (optional)](../how-to/improve-model.md) Other than revising [tagged data](tag-data.md) based on error analysis, you may want to increase the number of tags for under-performing entity types, or improve the diversity of your tagged data. This will help your model learn to give correct predictions, over potential linguistic phenomena that cause failure.
-
-<!-- * Define your own test set: If you are using a random split option and the resulting test set was not comprehensive enough, consider defining your own test to include a variety of data layouts and balanced tagged classes.
--->
-
 ## Data splits
 
 Before starting the training process, files in your dataset are divided into three groups at random:
@@ -65,6 +48,21 @@ Before starting the training process, files in your dataset are divided into thr
 
 4. Select the **Train** button at the bottom of the page.
 
+The time to train a model varies on the dataset, and may take up to several hours. You can only train one model at a time, and you cannot create or train other models if one is already training in the same project.
+
+
+After training has completed successfully, keep in mind:
+
+* [View the model's evaluation details](../how-to/view-model-evaluation.md) After model training, model evaluation is done against the [test set](../how-to/train-model.md#data-splits), which was not introduced to the model during training. By viewing the evaluation, you can get a sense of how the model performs in real-life scenarios.
+
+* [Examine data distribution](../how-to/improve-model.md#examine-data-distribution-from-language-studio) Make sure that all classes are well represented and that you have a balanced data distribution to make sure that all your classes are adequately represented. If a certain class is tagged far less frequent than the others, this class is likely under-represented and most occurrences probably won't be recognized properly by the model at runtime. In this case, consider adding more files that belong to this class to your dataset.
+
+* [Improve performance (optional)](../how-to/improve-model.md) Other than revising [tagged data](tag-data.md) based on error analysis, you may want to increase the number of tags for under-performing entity types, or improve the diversity of your tagged data. This will help your model learn to give correct predictions, over potential linguistic phenomena that cause failure.
+
+<!-- * Define your own test set: If you are using a random split option and the resulting test set was not comprehensive enough, consider defining your own test to include a variety of data layouts and balanced tagged classes.
+-->
+
+
 ## Next steps
 
 After training is completed, you will be able to [use the model evaluation metrics](../how-to/view-model-evaluation.md) to optionally [improve your model](../how-to/improve-model.md). Once you're satisfied with your model, you can deploy it, making it available to use for [classifying text](call-api.md).
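
Reviewer note on the "Examine data distribution" guidance moved by this commit: the check it describes can be approximated offline before training. The following is a minimal sketch, not part of the Language service API or Language Studio. The function name `find_underrepresented`, the `ratio` threshold of 0.5, and the sample labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def find_underrepresented(labels, ratio=0.5):
    """Return classes whose file count is below `ratio` times the largest class.

    `labels` holds one class name per tagged file. The 0.5 default is an
    arbitrary illustrative threshold, not a documented service limit.
    """
    counts = Counter(labels)
    if not counts:
        return {}
    largest = max(counts.values())
    # Keep only the classes that fall under the chosen fraction of the largest class.
    return {cls: n for cls, n in counts.items() if n < ratio * largest}


# Hypothetical dataset: one class label per tagged file.
labels = ["sports"] * 40 + ["politics"] * 35 + ["weather"] * 5
print(find_underrepresented(labels))  # {'weather': 5}
```

Any class the sketch reports is a candidate for adding more files, as the moved bullet recommends.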
