
Commit 1bc75ba: Line edits
1 parent: aa1f3ba

7 files changed: +21 lines, -13 lines


learn-pr/azure/optimize-model-performance-roc-auc/8-knowledge-check.yml

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ quiz:
 choices:
 - content: "How well the model works at its optimum decision threshold"
   isCorrect: false
-  explanation: "Incorrect. This information might be obtainable from the ROC plot but we can't get this information from the AUC."
+  explanation: "Incorrect. This information might be obtainable from the ROC plot, but we can't get this information from the AUC."
 - content: "Which is the optimum decision threshold?"
   isCorrect: false
   explanation: "Incorrect. AUC is a summary metric that is too simplified to provide this information."

learn-pr/azure/optimize-model-performance-roc-auc/includes/1-introduction.md

Lines changed: 3 additions & 3 deletions
@@ -1,8 +1,8 @@
-We can assess our classification models in terms of the kinds of mistakes that they make, such as false negatives and false positives. This can give insight into the kinds of mistakes a model makes, but doesn't necessarily provide deep information on how the model could perform if slight adjustments were made to its decision criteria. Here, we'll discuss receiver operator characteristic (ROC) curves, which build on the idea of a confusion matrix but provide us with deeper information that lets us improve our models to a greater degree.
+We can assess our classification models in terms of the kinds of mistakes that they make, such as false negatives and false positives. This can give insight into the kinds of mistakes a model makes, but it doesn't necessarily provide deep information on how the model could perform if slight adjustments were made to its decision criteria. Here, we'll discuss receiver operator characteristic curves. ROC curves build on the idea of a confusion matrix but provide us with deeper information that lets us improve our models to a greater degree.
 
-## Scenario:
+## Scenario
 
-Throughout this module, well be using the following example scenario to explain and practice working with ROC curves.
+Throughout this module, we'll be using the following example scenario to explain and practice working with ROC curves.
 
 Your avalanche-rescue charity has successfully built a machine learning model that can estimate whether an object detected by lightweight sensors is a hiker or a natural object, such as a tree or a rock. This lets you keep track of how many people are on the mountain, so you know whether a rescue team is needed when an avalanche strikes. The model does reasonably well, though you wonder if there's room for improvement. Internally, the model must make a binary decision as to whether an object is a hiker or not, but this is based on probabilities. Can this decision-making process be tweaked to improve its performance?
 
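The scenario's "binary decision based on probabilities" can be sketched with a tiny hypothetical helper; the function name and the probabilities below are invented for illustration and aren't part of the module's code:

```python
def classify(prob_hiker, threshold=0.5):
    """Map a model's probability to a binary label at a decision threshold."""
    return "hiker" if prob_hiker >= threshold else "tree"

# The same probability flips labels as the threshold moves past it
print(classify(0.65))        # hiker at the default 50% threshold
print(classify(0.65, 0.7))   # tree once the threshold is raised to 70%
```

Tweaking the decision-making process, as the scenario asks, amounts to choosing this threshold well.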

learn-pr/azure/optimize-model-performance-roc-auc/includes/2-receiver-operator-characteristic-curve.md

Lines changed: 3 additions & 3 deletions
@@ -23,17 +23,17 @@ We can calculate some handy characteristics from the confusion matrix. Two popul
 
 Looking at true positive and false positive rates can help us understand a model's performance.
 
-Consider our hiker example. Ideally, the true positive rate is very high, and the false positive rate is very low, because this means that the model identifies hikers well and doesn't identify trees as hikers very often. Yet, if the true positive rate is very high, but the false positive rate is also very high, then the model is biased; it's identifying almost everything it encounters as hiker. Similarly, we don't want a model with a low true positive rate, because then when the model encounters a hiker, it'll label them as a tree.
+Consider our hiker example. Ideally, the true positive rate is very high, and the false positive rate is very low. This means that the model identifies hikers well and doesn't identify trees as hikers very often. Yet, if the true positive rate is very high, but the false positive rate is also very high, then the model is biased; it's identifying almost everything it encounters as hiker. Similarly, we don't want a model with a low true positive rate, because then when the model encounters a hiker, it'll label them as a tree.
 
 ## ROC curves
 
-Receiver operator characteristic (ROC) curves are a graph where we plot true positive rate versus false positive rate.
+Receiver operator characteristic curves are a graph where we plot true positive rate versus false positive rate.
 
 ROC curves can be confusing for beginners for two main reasons. The first reason is that beginners know that a model only has one value for true positive and true negative rates, so an ROC plot must look like this:
 
 ![Receiver operator characteristic curve graph with one plot point.](../media/roc-graph.png)
 
-If you're also thinking this, you're right. A trained model only produces one point. However, remember that our models have a thresholdnormally 50%that's used to decide whether the true (hiker) or false (tree) label should be used. If we change this threshold to 30% and recalculate true positive and false positive rates, we get another point:
+If you're also thinking this, you're right. A trained model only produces one point. However, remember that our models have a threshold-normally 50%-that's used to decide whether the true (hiker) or false (tree) label should be used. If we change this threshold to 30% and recalculate true positive and false positive rates, we get another point:
 
 ![Receiver operator characteristic curve graph with two plot points.](../media/roc-graph-2.png)
 
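The threshold-sweeping idea in this file can be made concrete with a minimal Python sketch. The labels and probabilities below are toy values invented for illustration; each threshold recalculation yields one (FPR, TPR) point on the ROC plot:

```python
import numpy as np

def tpr_fpr(y_true, y_prob, threshold):
    """True and false positive rates when predicting positive at or above a threshold."""
    y_pred = y_prob >= threshold                # predict "hiker" at or above the threshold
    tp = np.sum(y_pred & (y_true == 1))
    fp = np.sum(y_pred & (y_true == 0))
    fn = np.sum(~y_pred & (y_true == 1))
    tn = np.sum(~y_pred & (y_true == 0))
    return tp / (tp + fn), fp / (fp + tn)

# Toy data, invented for illustration: 1 = hiker, 0 = tree
y_true = np.array([1, 1, 1, 0, 0, 0])
y_prob = np.array([0.9, 0.6, 0.4, 0.7, 0.3, 0.1])

# Each threshold yields one (FPR, TPR) point; sweeping many traces the ROC curve
for t in (0.3, 0.5, 0.7):
    tpr, fpr = tpr_fpr(y_true, y_prob, t)
    print(f"threshold={t}: TPR={tpr:.2f}, FPR={fpr:.2f}")
```

Lowering the threshold raises both rates together, which is exactly the trade-off the curve visualizes.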

learn-pr/azure/optimize-model-performance-roc-auc/includes/4-compare-optimize-curves.md

Lines changed: 3 additions & 3 deletions
@@ -1,20 +1,20 @@
-Receiver operator characteristic (ROC) curves let us compare models to one another and tune our selected model. Let's discuss how and why these are done.
+ROC curves let us compare models to one another and tune our selected model. Let's discuss how and why these are done.
 
 ## Tuning a model
 
 The most obvious use for an ROC curve is to choose a decision threshold that gives the best performance. Recall that our models provide us with probabilities, such as a 65% chance that the sample is a hiker. The decision threshold is the point above which a sample is assigned true (hiker) or below which it's assigned `false` (tree). If our decision threshold was 50%, then 65% would be assigned to "true" (hiker). If our decision threshold was 70%, however, a probability of 65% would be too small, and be assigned to "false" (tree).
 
 We've seen in the previous exercise that when we construct an ROC curve, we're just changing the decision threshold and assessing how well the model works. When we do this, we can find the threshold that gives the optimal results.
 
-Usually there isn't a single threshold that gives both the best true positive rate (TPR) and the lower false positive rate (FPR). This means that the optimal threshold depends on what you're trying to achieve. For example, in our scenario, it's very important to have a high true positive rate, because if a hiker isn't identified and an avalanche occurs, the team won't know to rescue them. There's a trade-off, though: if the false positive rate is too high, then the rescue team may repeatedly be sent out to rescue people who simply don't exist. In other situations, the false positive rate is considered more important. For example, science has a low tolerance for false-positive results. If the false-positive rate of scientific experiments was higher, there would be an endless flurry of contradictory claims, and it would be impossible to make sense of what's real.
+Usually there isn't a single threshold that gives both the best true positive rate (TPR) and the lower false positive rate (FPR). This means that the optimal threshold depends on what you're trying to achieve. For example, in our scenario, it's very important to have a high true positive rate. This is because if a hiker isn't identified and an avalanche occurs, the team won't know to rescue them. There's a trade-off, though: if the false positive rate is too high, then the rescue team may repeatedly be sent out to rescue people who simply don't exist. In other situations, the false positive rate is considered more important. For example, science has a low tolerance for false-positive results. If the false-positive rate of scientific experiments was higher, there would be an endless flurry of contradictory claims, and it would be impossible to make sense of what's real.
 
 ## Comparing models with AUC
 
 You can use ROC curves to compare models to each other, just like you can with cost functions. An ROC curve for a model shows how well it will work for a variety of decision thresholds. At the end of the day, what's most important in a model is how it will perform in the real world, where there's only one decision threshold. Why then would we want to compare models using thresholds we'll never use? There are two answers for this.
 
 Firstly, comparing ROC curves in particular ways is like performing a statistical test that tells us not just that one model did better on this particular test set, but whether it's likely to continue to perform better in the future. This is out of the scope of this learning material, but it's worth keeping in mind.
 
-Secondly, the ROC curve shows, to some degree, how reliant the model is on having the perfect threshold. For example, if our model only works well when we have a decision threshold of 0.9, but terribly above or below this value, it's not a good design. We'd probably prefer to work with a model that works reasonably well for various thresholds, knowing that if the real-world data we come across is slightly different to our test set, our model's performance won't necessarily collapse.
+Secondly, the ROC curve shows, to some degree, how reliant the model is on having the perfect threshold. For example, if our model only works well when we have a decision threshold of 0.9, but terribly above or below this value, it's not a good design. We'd probably prefer to work with a model that works reasonably well for various thresholds. We'd know that if the real-world data we come across is slightly different to our test set, our model's performance won't necessarily collapse.
 
 ### How to compare ROCs?
 
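Comparing models by area under the curve can be sketched with NumPy alone; the labels and the two models' scores below are invented for illustration (in practice, scikit-learn's `roc_curve` and `roc_auc_score` do this job):

```python
import numpy as np

def roc_points(y_true, y_prob):
    """(FPR, TPR) pairs, sweeping every observed score as the decision threshold."""
    pts = [(0.0, 0.0)]
    pos = np.sum(y_true == 1)
    neg = np.sum(y_true == 0)
    for t in np.sort(np.unique(y_prob))[::-1]:   # high threshold to low
        pred = y_prob >= t
        pts.append((np.sum(pred & (y_true == 0)) / neg,
                    np.sum(pred & (y_true == 1)) / pos))
    return np.array(pts)

def auc(pts):
    """Trapezoidal area under the ROC curve."""
    fpr, tpr = pts[:, 0], pts[:, 1]
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2))

# Toy scores from two hypothetical models on the same six samples (1 = hiker)
y_true  = np.array([1, 1, 1, 0, 0, 0])
model_a = np.array([0.9, 0.8, 0.4, 0.6, 0.3, 0.1])
model_b = np.array([0.7, 0.6, 0.5, 0.4, 0.3, 0.2])

# Model B ranks every hiker above every tree, so its AUC is higher
print(auc(roc_points(y_true, model_a)))   # about 0.889
print(auc(roc_points(y_true, model_b)))   # 1.0
```

A model whose curve stays high across many thresholds, and hence has a larger AUC, is the one less reliant on hitting the perfect threshold.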

Lines changed: 3 additions & 2 deletions
@@ -1,4 +1,5 @@
-We've covered receiver operator characteristic (ROC) curves in some depth. We learned they graph how often we mistakenly assign a true label against how often we correctly assign a true label. Each point on the graph represents one threshold that was applied.
+We've covered ROC curves in some depth. We learned they graph how often we mistakenly assign a true label against how often we correctly assign a true label. Each point on the graph represents one threshold that was applied.
+
+We learned how we can use ROC curves to tune our decision threshold in the final model. We also saw how AUC can give us an idea as to how reliant our model is to having the perfect decision threshold. It's also a handy measure to compare two models to one another.
 
-We learned how we can use ROC curves to tune our decision threshold in the final model. We also saw how area-under the curve (AUC) can give us an idea as to how reliant our model is to having the perfect decision threshold. It's also a handy measure to compare two models to one another.
 Congratulations on getting so far! As always, now that you have a new technique under your belt, the best you can do for your learning is practice using it on data you care about. By doing so, you'll gain experience and understand nuances that we haven't had time or space to cover here. Good luck!

learn-pr/azure/optimize-model-performance-roc-auc/index.yml

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ metadata:
   - TBD
   - ce-skilling-ai-copilot
   title: Measure and optimize model performance with ROC and AUC
-  summary: Receiver operator characteristic curves are a powerful way to assess and fine-tune trained classification models. We introduce and explain the utility of these curves through learning content and practical exercises.
+  summary: Receiver operator characteristic (ROC) curves are a powerful way to assess and fine-tune trained classification models. We introduce and explain the utility of these curves through learning content and practical exercises.
   abstract: |
     In this module, you will:
     - Understand how to create ROC curves.

learn-pr/azure/optimize-model-performance-roc-auc/notebooks/9-3-evaluate-roc-curves.ipynb

Lines changed: 7 additions & 0 deletions
@@ -382,6 +382,13 @@
     "If we continued this approach for all thresholds, we'd achieve a diagonal line."
    ]
   },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
