Commit c1f4db6

Merge pull request #49853 from lootle1/MR86
Technical Review 1037456: Measure and optimize model performance with…
2 parents 9772b2b + 14c4878 commit c1f4db6

13 files changed: +193 -185 lines
Lines changed: 13 additions & 13 deletions
@@ -1,13 +1,13 @@
-### YamlMime:ModuleUnit
-uid: learn.machinelearning.optimize-model-performance-roc-auc-dropout.introduction
-title: Introduction
-metadata:
-  title: Introduction
-  description: Introduction to the ROC AUC module.
-  ms.date: 07/20/2024
-  author: s-polly
-  ms.author: scottpolly
-  ms.topic: unit
-durationInMinutes: 2
-content: |
-  [!include[](includes/1-introduction.md)]
+### YamlMime:ModuleUnit
+uid: learn.machinelearning.optimize-model-performance-roc-auc-dropout.introduction
+title: Introduction
+metadata:
+  title: Introduction
+  description: Introduction to the ROC AUC module.
+  ms.date: 04/03/2025
+  author: s-polly
+  ms.author: scottpolly
+  ms.topic: unit
+durationInMinutes: 2
+content: |
+  [!include[](includes/1-introduction.md)]
Lines changed: 13 additions & 13 deletions
@@ -1,13 +1,13 @@
-### YamlMime:ModuleUnit
-uid: learn.machinelearning.optimize-model-performance-roc-auc-dropout.receiver-operator-characteristic-curve
-title: Analyze classification with receiver operator characteristic curves
-metadata:
-  title: Analyze classification with receiver operator characteristic curves
-  description: Conceptual unit introducing ROC curves in machine learning
-  ms.date: 07/20/2024
-  author: s-polly
-  ms.author: scottpolly
-  ms.topic: unit
-durationInMinutes: 4
-content: |
-  [!include[](includes/2-receiver-operator-characteristic-curve.md)]
+### YamlMime:ModuleUnit
+uid: learn.machinelearning.optimize-model-performance-roc-auc-dropout.receiver-operator-characteristic-curve
+title: Analyze classification with receiver operator characteristic curves
+metadata:
+  title: Analyze Classification with Receiver Operator Characteristic Curves
+  description: Conceptual unit introducing ROC curves in machine learning
+  ms.date: 04/03/2025
+  author: s-polly
+  ms.author: scottpolly
+  ms.topic: unit
+durationInMinutes: 4
+content: |
+  [!include[](includes/2-receiver-operator-characteristic-curve.md)]
Lines changed: 14 additions & 14 deletions
@@ -1,14 +1,14 @@
-### YamlMime:ModuleUnit
-uid: learn.machinelearning.optimize-model-performance-roc-auc-dropout.exercise-evaluate-roc-curves
-title: Exercise - Evaluate ROC curves
-metadata:
-  title: Exercise - Evaluate ROC curves
-  description: Exercise about good and bad ROC curves in machine learning
-  ms.date: 07/20/2024
-  author: s-polly
-  ms.author: scottpolly
-  ms.topic: unit
-durationInMinutes: 8
-sandbox: true
-notebook: notebooks/9-3-evaluate-roc-curves.ipynb
-
+### YamlMime:ModuleUnit
+uid: learn.machinelearning.optimize-model-performance-roc-auc-dropout.exercise-evaluate-roc-curves
+title: Exercise - Evaluate ROC curves
+metadata:
+  title: Exercise - Evaluate ROC Curves
+  description: Exercise about good and bad ROC curves in machine learning
+  ms.date: 04/03/2025
+  author: s-polly
+  ms.author: scottpolly
+  ms.topic: unit
+durationInMinutes: 8
+sandbox: true
+notebook: notebooks/9-3-evaluate-roc-curves.ipynb
+
Lines changed: 14 additions & 14 deletions
@@ -1,14 +1,14 @@
-### YamlMime:ModuleUnit
-uid: learn.machinelearning.optimize-model-performance-roc-auc-dropout.comparing-optimizing-curves
-title: Compare and optimize ROC curves
-metadata:
-  title: Compare and optimize ROC curves
-  description: Conceptual unit about comparing and optimizing machine learning models ROC curves
-  ms.date: 07/20/2024
-  author: s-polly
-  ms.author: scottpolly
-  ms.topic: unit
-durationInMinutes: 4
-content: |
-  [!include[](includes/4-compare-optimize-curves.md)]
-
+### YamlMime:ModuleUnit
+uid: learn.machinelearning.optimize-model-performance-roc-auc-dropout.comparing-optimizing-curves
+title: Compare and optimize ROC curves
+metadata:
+  title: Compare and Optimize ROC Curves
+  description: Conceptual unit about comparing and optimizing machine learning models' ROC curves
+  ms.date: 04/03/2025
+  author: s-polly
+  ms.author: scottpolly
+  ms.topic: unit
+durationInMinutes: 4
+content: |
+  [!include[](includes/4-compare-optimize-curves.md)]
+
Lines changed: 14 additions & 14 deletions
@@ -1,14 +1,14 @@
-### YamlMime:ModuleUnit
-uid: learn.machinelearning.optimize-model-performance-roc-auc-dropout.exercise-tune-auc-curves
-title: Exercise - Tune the area under the curve
-metadata:
-  title: Exercise - Tune the area under the curve
-  description: Exercise unit about tuning under the curve in machine learning
-  ms.date: 07/20/2024
-  author: s-polly
-  ms.author: scottpolly
-  ms.topic: unit
-durationInMinutes: 12
-sandbox: true
-notebook: notebooks/9-5-tune-auc-curves.ipynb
-
+### YamlMime:ModuleUnit
+uid: learn.machinelearning.optimize-model-performance-roc-auc-dropout.exercise-tune-auc-curves
+title: Exercise - Tune the area under the curve
+metadata:
+  title: Exercise - Tune the Area Under the Curve
+  description: Exercise unit about tuning the area under the curve in machine learning
+  ms.date: 04/03/2025
+  author: s-polly
+  ms.author: scottpolly
+  ms.topic: unit
+durationInMinutes: 12
+sandbox: true
+notebook: notebooks/9-5-tune-auc-curves.ipynb
+
Lines changed: 48 additions & 48 deletions
@@ -1,48 +1,48 @@
-### YamlMime:ModuleUnit
-uid: learn.machinelearning.optimize-model-performance-roc-auc-dropout.knowledge-check
-title: Module assessment
-metadata:
-  title: Module assessment
-  description: Multiple-choice questions
-  ms.date: 07/20/2024
-  author: s-polly
-  ms.author: scottpolly
-  ms.topic: unit
-durationInMinutes: 3
-quiz:
-  title: Check your knowledge
-  questions:
-  - content: 'What do TPR and FPR mean?'
-    choices:
-    - content: "TPR is the number of correct responses. FPR is the number of incorrect responses."
-      isCorrect: false
-      explanation: "Incorrect."
-    - content: "TPR is the proportion of answers that were provided correctly as 'true'. FPR is the proportion of answers that were provided incorrectly as 'true'."
-      isCorrect: true
-      explanation: "Correct."
-    - content: "TPR is the proportion of answers that were provided correctly as 'true'. FPR is the proportion of answers that were provided incorrectly as 'false'."
-      isCorrect: false
-      explanation: "Incorrect."
-  - content: 'What are on the X and Y axes in an ROC plot?'
-    choices:
-    - content: 'X-axis: FP rate, Y-axis: TP rate'
-      isCorrect: true
-      explanation: "Correct."
-    - content: "X-axis: Number of FPs, Y-axis: Number of TPs"
-      isCorrect: false
-      explanation: "Incorrect."
-    - content: "X-axis: Number of TPs, Y-axis: Number of FPs"
-      isCorrect: false
-      explanation: "Incorrect."
-  - content: 'What does area under the curve for an ROC plot tell us?'
-    choices:
-    - content: "How well the model works at its optimum decision threshold"
-      isCorrect: false
-      explanation: "Incorrect. This information might be obtainable from the ROC plot but we can't get this information from the AUC."
-    - content: "Which is the optimum decision threshold?"
-      isCorrect: false
-      explanation: "Incorrect. AUC is a summary metric that is too simplified to provide this information."
-    - content: "It gives a summary of how well a model works across various thresholds."
-      isCorrect: true
-      explanation: "Correct."
-
+### YamlMime:ModuleUnit
+uid: learn.machinelearning.optimize-model-performance-roc-auc-dropout.knowledge-check
+title: Module Assessment
+metadata:
+  title: Module Assessment
+  description: Multiple-choice questions
+  ms.date: 04/03/2025
+  author: s-polly
+  ms.author: scottpolly
+  ms.topic: unit
+durationInMinutes: 3
+quiz:
+  title: Check your knowledge
+  questions:
+  - content: 'What do TPR and FPR mean?'
+    choices:
+    - content: "TPR is the number of correct responses. FPR is the number of incorrect responses."
+      isCorrect: false
+      explanation: "Incorrect."
+    - content: "TPR is the proportion of answers that were provided correctly as 'true.' FPR is the proportion of answers that were provided incorrectly as 'true.'"
+      isCorrect: true
+      explanation: "Correct."
+    - content: "TPR is the proportion of answers that were provided correctly as 'true.' FPR is the proportion of answers that were provided incorrectly as 'false.'"
+      isCorrect: false
+      explanation: "Incorrect."
+  - content: 'What are on the X and Y axes in an ROC plot?'
+    choices:
+    - content: 'X-axis: FP rate, Y-axis: TP rate'
+      isCorrect: true
+      explanation: "Correct."
+    - content: "X-axis: Number of FPs, Y-axis: Number of TPs"
+      isCorrect: false
+      explanation: "Incorrect."
+    - content: "X-axis: Number of TPs, Y-axis: Number of FPs"
+      isCorrect: false
+      explanation: "Incorrect."
+  - content: 'What does area under the curve for an ROC plot tell us?'
+    choices:
+    - content: "How well the model works at its optimum decision threshold"
+      isCorrect: false
+      explanation: "Incorrect. This information might be obtainable from the ROC plot, but we can't get this information from the AUC."
+    - content: "Which is the optimum decision threshold?"
+      isCorrect: false
+      explanation: "Incorrect. AUC is a summary metric that is too simplified to provide this information."
+    - content: "It gives a summary of how well a model works across various thresholds."
+      isCorrect: true
+      explanation: "Correct."
+
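The last quiz question above states that AUC summarizes how well a model works across thresholds. As a hypothetical illustration (not part of this commit; the function name, point values, and trapezoidal approach are invented for the sketch), the area can be estimated with the trapezoidal rule over a few (FPR, TPR) points:

```python
# Hypothetical sketch: AUC as the trapezoidal area under a handful of
# (false positive rate, true positive rate) points. Not part of the commit.

def auc(points):
    """Trapezoidal area under an ROC curve given (fpr, tpr) points."""
    pts = sorted(points)  # order points by false positive rate
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2  # trapezoid between neighbours
    return area

# Endpoints (0,0) and (1,1) plus two invented measured thresholds.
roc_points = [(0.0, 0.0), (0.25, 0.75), (0.5, 1.0), (1.0, 1.0)]
print(auc(roc_points))  # one number summarizing all thresholds at once
```

A single AUC value collapses the whole curve, which is exactly why it can't reveal the optimum threshold, as the quiz explanations note.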
Lines changed: 13 additions & 13 deletions
@@ -1,13 +1,13 @@
-### YamlMime:ModuleUnit
-uid: learn.machinelearning.optimize-model-performance-roc-auc-dropout.summary
-title: Summary
-metadata:
-  title: Summary
-  description: An overview of the content covered in the module.
-  ms.date: 07/20/2024
-  author: s-polly
-  ms.author: scottpolly
-  ms.topic: unit
-durationInMinutes: 3
-content: |
-  [!include[](includes/9-summary.md)]
+### YamlMime:ModuleUnit
+uid: learn.machinelearning.optimize-model-performance-roc-auc-dropout.summary
+title: Summary
+metadata:
+  title: Summary
+  description: An overview of the content covered in the module.
+  ms.date: 04/03/2025
+  author: s-polly
+  ms.author: scottpolly
+  ms.topic: unit
+durationInMinutes: 3
+content: |
+  [!include[](includes/9-summary.md)]

learn-pr/azure/optimize-model-performance-roc-auc/includes/1-introduction.md

Lines changed: 3 additions & 3 deletions
@@ -1,8 +1,8 @@
-We can assess our classification models in terms of the kinds of mistakes that they make, such as false negatives and false positives. This can give insight into the kinds of mistakes a model makes, but doesn't necessarily provide deep information on how the model could perform if slight adjustments were made to its decision criteria. Here, we'll discuss receiver operator characteristic (ROC) curves, which build on the idea of a confusion matrix but provide us with deeper information that lets us improve our models to a greater degree.
+We can assess our classification models in terms of the kinds of mistakes that they make, such as false negatives and false positives. This can give insight into the kinds of mistakes a model makes, but it doesn't necessarily provide deep information on how the model could perform if slight adjustments were made to its decision criteria. Here, we'll discuss receiver operator characteristic curves. ROC curves build on the idea of a confusion matrix but provide us with deeper information that lets us improve our models to a greater degree.
 
-## Scenario:
+## Scenario
 
-Throughout this module, well be using the following example scenario to explain and practice working with ROC curves.
+Throughout this module, we'll be using the following example scenario to explain and practice working with ROC curves.
 
 Your avalanche-rescue charity has successfully built a machine learning model that can estimate whether an object detected by lightweight sensors is a hiker or a natural object, such as a tree or a rock. This lets you keep track of how many people are on the mountain, so you know whether a rescue team is needed when an avalanche strikes. The model does reasonably well, though you wonder if there's room for improvement. Internally, the model must make a binary decision as to whether an object is a hiker or not, but this is based on probabilities. Can this decision-making process be tweaked to improve its performance?

learn-pr/azure/optimize-model-performance-roc-auc/includes/2-receiver-operator-characteristic-curve.md

Lines changed: 4 additions & 4 deletions
@@ -2,7 +2,7 @@ Classification models must assign a sample to a category. For example, it must u
 
 We can improve classification models in many ways. For example, we can ensure our data are balanced, clean, and scaled. We can also alter our model architecture and use hyperparameters to squeeze as much performance as we possibly can out of our data and architecture. Eventually, we find no better way to improve performance on our test (or hold-out) set and declare our model ready.
 
-Model tuning to this point can be complex, but we can use a final simple step to further improve how well our model works. To understand this, though, we need to go back to basics.
+Model tuning to this point can be complex, but we can use a final step to further improve how well our model works. To understand this, though, we need to go back to basics.
 
 ## Probabilities and categories
 
@@ -23,17 +23,17 @@ We can calculate some handy characteristics from the confusion matrix. Two popul
 
 Looking at true positive and false positive rates can help us understand a model's performance.
 
-Consider our hiker example. Ideally, the true positive rate is very high, and the false positive rate is very low, because this means that the model identifies hikers well and doesn't identify trees as hikers very often. Yet, if the true positive rate is very high, but the false positive rate is also very high, then the model is biased; it's identifying almost everything it encounters as hiker. Similarly, we don't want a model with a low true positive rate, because then when the model encounters a hiker, it'll label them as a tree.
+Consider our hiker example. Ideally, the true positive rate is very high, and the false positive rate is very low. This means that the model identifies hikers well and doesn't identify trees as hikers very often. Yet, if the true positive rate is very high, but the false positive rate is also very high, then the model is biased; it's identifying almost everything it encounters as a hiker. Similarly, we don't want a model with a low true positive rate, because then when the model encounters a hiker, it'll label them as a tree.
 
 ## ROC curves
 
-Receiver operator characteristic (ROC) curves are a graph where we plot true positive rate versus false positive rate.
+Receiver operator characteristic curves are a graph where we plot true positive rate versus false positive rate.
 
 ROC curves can be confusing for beginners for two main reasons. The first reason is that beginners know that a model only has one value for true positive and true negative rates, so an ROC plot must look like this:
 
 ![Receiver operator characteristic curve graph with one plot point.](../media/roc-graph.png)
 
-If you're also thinking this, you're right. A trained model only produces one point. However, remember that our models have a threshold—normally 50%—that's used to decide whether the true (hiker) or false (tree) label should be used. If we change this threshold to 30% and recalculate true positive and false positive rates, we get another point:
+If you're also thinking this, you're right. A trained model only produces one point. However, remember that our models have a threshold (normally 50%) that's used to decide whether the true (hiker) or false (tree) label should be used. If we change this threshold to 30% and recalculate true positive and false positive rates, we get another point:
 
 ![Receiver operator characteristic curve graph with two plot points.](../media/roc-graph-2.png)
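The thresholding idea in the changed passage above can be sketched in a few lines. This is a hypothetical illustration only (not part of the commit; the function name, toy probabilities, and labels are invented): each decision threshold yields one (FPR, TPR) point, and sweeping the threshold traces out the ROC curve.

```python
# Hypothetical sketch: moving the decision threshold produces the extra
# ROC points described in the text. Not part of the commit; toy data.

def roc_point(probs, labels, threshold):
    """Return (false positive rate, true positive rate) at a threshold.

    probs  - model-estimated probability that each object is a hiker
    labels - 1 if the object really is a hiker, 0 if it is a tree or rock
    """
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    tn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 0)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # hikers correctly labelled
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # trees mislabelled as hikers
    return fpr, tpr

# Invented example data: predicted hiker probabilities and true labels.
probs  = [0.9, 0.8, 0.7, 0.6, 0.4, 0.35, 0.2, 0.1]
labels = [1,   1,   1,   0,   1,   0,    0,   0]

# Each threshold gives one plot point; together they form the ROC curve.
for t in (0.3, 0.5, 0.7):
    fpr, tpr = roc_point(probs, labels, t)
    print(f"threshold={t:.1f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Lowering the threshold from 50% to 30%, as the include file describes, moves the point up and to the right: more hikers are caught, but more trees are mislabelled too.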