@@ -69,8 +69,8 @@ tumor images?

The trick is to split the data into a **training set** and **test set** ({numref}`fig:06-training-test`)
and use only the **training set** when building the classifier.
- Then, to evaluate the performance of the classifier, we first set aside the true labels from the **test set**,
- and then use the classifier to predict the labels in the **test set**. If our predictions match the true
+ Then, to evaluate the performance of the classifier, we first set aside the labels from the **test set**,
+ and then use the classifier to predict the labels in the **test set**. If our predictions match the actual
labels for the observations in the **test set**, then we have some
confidence that our classifier might also accurately predict the class
labels for new observations without known class labels.
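As a rough illustration of the split described above, a minimal sketch using `scikit-learn`'s `train_test_split` might look like the following; the toy data frame, its column names, and the 75%/25% proportion are assumptions for illustration only, not taken from this excerpt.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Tiny made-up data set standing in for the breast cancer data.
cancer = pd.DataFrame({
    "Smoothness": [0.10, 0.20, 0.15, 0.30, 0.25, 0.12, 0.28, 0.22],
    "Concavity":  [0.05, 0.40, 0.10, 0.35, 0.30, 0.08, 0.33, 0.20],
    "Class":      ["Benign", "Malignant", "Benign", "Malignant",
                   "Malignant", "Benign", "Malignant", "Benign"],
})

# Hold out 25% of the rows as the test set, stratifying on the label so the
# benign/malignant proportions stay similar in both sets; only the training
# set is then used to build the classifier.
cancer_train, cancer_test = train_test_split(
    cancer, test_size=0.25, stratify=cancer["Class"], random_state=1
)
```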
@@ -102,12 +102,12 @@ Splitting the data into training and testing sets.
```{index} accuracy
```

- How exactly can we assess how well our predictions match the true labels for
+ How exactly can we assess how well our predictions match the actual labels for
the observations in the test set? One way we can do this is to calculate the
prediction **accuracy**. This is the fraction of examples for which the
classifier made the correct prediction. To calculate this, we divide the number
of correct predictions by the number of predictions made.
- The process for assessing if our predictions match the true labels in the
+ The process for assessing if our predictions match the actual labels in the
test set is illustrated in {numref}`fig:06-ML-paradigm-test`.

$$\mathrm{accuracy} = \frac{\mathrm{number \; of \; correct \; predictions}}{\mathrm{total \; number \; of \; predictions}}$$
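As a small worked example of this formula (the label values below are made up for illustration):

```python
# Compare predicted labels against the actual labels, then divide the number
# of correct predictions by the total number of predictions made.
actual = ["Malignant", "Benign", "Benign", "Malignant", "Benign"]
predicted = ["Malignant", "Benign", "Malignant", "Malignant", "Benign"]

n_correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = n_correct / len(actual)
print(accuracy)  # 4 correct predictions out of 5 -> 0.8
```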
@@ -139,10 +139,10 @@ a test set of 65 observations.
* -
  - Predicted Malignant
  - Predicted Benign
- * - **Truly Malignant**
+ * - **Actually Malignant**
  - 1
  - 3
- * - **Truly Benign**
+ * - **Actually Benign**
  - 4
  - 57
```
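Working through this table by hand, using the standard definitions of accuracy, precision, and recall (with malignant treated as the positive label, as the surrounding discussion does), the three quantities follow directly from the four cell counts; the snippet below is just that arithmetic.

```python
# Cell counts from the confusion matrix above.
tp = 1   # actually malignant, predicted malignant
fn = 3   # actually malignant, predicted benign
fp = 4   # actually benign, predicted malignant
tn = 57  # actually benign, predicted benign

accuracy = (tp + tn) / (tp + fn + fp + tn)  # 58 / 65 ≈ 0.89
precision = tp / (tp + fp)                  # 1 / 5  = 0.20
recall = tp / (tp + fn)                     # 1 / 4  = 0.25
print(accuracy, precision, recall)
```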
@@ -607,7 +607,7 @@ knn_fit
Now that we have a $K$-nearest neighbors classifier object, we can use it to
predict the class labels for our test set. We will use the `assign` method to
augment the original test data with a column of predictions, creating the
- `cancer_test_predictions` data frame. The `Class` variable contains the true
+ `cancer_test_predictions` data frame. The `Class` variable contains the actual
diagnoses, while the `predicted` variable contains the predicted diagnoses from the
classifier. Note that below we print out just the `ID`, `Class`, and `predicted`
variables in the output data frame.
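A rough sketch of this `assign` pattern on a tiny made-up data set is shown below; the predictor column names, the data values, and the classifier settings are illustrative assumptions, with only the `cancer_test_predictions`, `ID`, `Class`, and `predicted` names taken from the text.

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

# Made-up training and test data standing in for the breast cancer data.
cancer_train = pd.DataFrame({
    "Smoothness": [0.10, 0.30, 0.12, 0.28],
    "Concavity":  [0.05, 0.35, 0.08, 0.33],
    "Class":      ["Benign", "Malignant", "Benign", "Malignant"],
})
cancer_test = pd.DataFrame({
    "ID":         [101, 102],
    "Smoothness": [0.11, 0.29],
    "Concavity":  [0.06, 0.34],
    "Class":      ["Benign", "Malignant"],
})

# Fit a K-nearest neighbors classifier on the training predictors and labels.
knn_fit = KNeighborsClassifier(n_neighbors=3).fit(
    cancer_train[["Smoothness", "Concavity"]], cancer_train["Class"]
)

# Augment the test data with a column of predictions, then print the ID,
# actual, and predicted labels side by side.
cancer_test_predictions = cancer_test.assign(
    predicted=knn_fit.predict(cancer_test[["Smoothness", "Concavity"]])
)
print(cancer_test_predictions[["ID", "Class", "predicted"]])
```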
@@ -640,9 +640,9 @@ correct_preds.shape[0] / cancer_test_predictions.shape[0]

The `scikit-learn` package also provides a more convenient way to do this using
the `score` method. To use the `score` method, we need to specify two arguments:
- predictors and true labels. We pass the same test data
+ predictors and the actual labels. We pass the same test data
for the predictors that we originally passed into `predict` when making predictions,
- and we provide the true labels via the `cancer_test["Class"]` series.
+ and we provide the actual labels via the `cancer_test["Class"]` series.

```{code-cell} ipython3
cancer_acc_1 = knn_fit.score(
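    # Illustrative sketch only (not the original cell): following the prose
    # above, the two arguments are the test-set predictors and the actual
    # labels, e.g.
    #   knn_fit.score(cancer_test[["Smoothness", "Concavity"]], cancer_test["Class"])
    # where the predictor column names are assumptions for illustration.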
@@ -664,9 +664,9 @@ The output shows that the estimated accuracy of the classifier on the test data
was {glue:text}`cancer_acc_1`%.
We can also look at the *confusion matrix* for the classifier
using the `crosstab` function from `pandas`. A confusion matrix shows how many
- observations of each (true) label were classified as each (predicted) label.
+ observations of each (actual) label were classified as each (predicted) label.
The `crosstab` function
- takes two arguments: the true labels first, then the predicted labels second.
+ takes two arguments: the actual labels first, then the predicted labels second.

```{code-cell} ipython3
pd.crosstab(
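    # Illustrative sketch only (not the original cell): per the prose above,
    # the actual labels go first and the predicted labels second, e.g.
    #   pd.crosstab(cancer_test_predictions["Class"], cancer_test_predictions["predicted"])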
@@ -703,8 +703,8 @@ glue("confu_recall_0", "{:0.0f}".format(100*c11/(c11+c10)))
The confusion matrix shows {glue:text}`confu11` observations were correctly predicted
as malignant, and {glue:text}`confu00` were correctly predicted as benign.
It also shows that the classifier made some mistakes; in particular,
- it classified {glue:text}`confu10` observations as benign when they were truly malignant,
- and {glue:text}`confu01` observations as malignant when they were truly benign.
+ it classified {glue:text}`confu10` observations as benign when they were actually malignant,
+ and {glue:text}`confu01` observations as malignant when they were actually benign.
Using our formulas from earlier, we see that the accuracy agrees with what Python reported,
and can also compute the precision and recall of the classifier:

@@ -758,11 +758,11 @@ of the time, a classifier with 99% accuracy is not terribly impressive (just alw
And beyond just accuracy, we need to consider the precision and recall: as mentioned
earlier, the *kind* of mistake the classifier makes is
important in many applications as well. In the previous example with 99% benign observations, it might be very bad for the
- classifier to predict "benign" when the true class is "malignant" (a false negative), as this
+ classifier to predict "benign" when the actual class is "malignant" (a false negative), as this
might result in a patient not receiving appropriate medical attention. In other
words, in this context, we need the classifier to have a *high recall*. On the
other hand, it might be less bad for the classifier to guess "malignant" when
- the true class is "benign" (a false positive), as the patient will then likely see a doctor who
+ the actual class is "benign" (a false positive), as the patient will then likely see a doctor who
can provide an expert diagnosis. In other words, we are fine with sacrificing
some precision in the interest of achieving high recall. This is why it is
important not only to look at accuracy, but also the confusion matrix.
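To make the 99% scenario concrete: a classifier that always guesses "benign" on data with 99% benign observations reaches 99% accuracy while catching none of the malignant cases. The counts below are made up to restate that arithmetic.

```python
# 990 benign and 10 malignant observations (99% benign), with a classifier
# that always predicts "benign".
n_benign, n_malignant = 990, 10

tp = 0            # malignant cases flagged as malignant (never happens here)
fn = n_malignant  # malignant cases missed
tn = n_benign     # benign cases labelled benign
fp = 0            # benign cases wrongly flagged as malignant

accuracy = (tp + tn) / (n_benign + n_malignant)  # 0.99
recall = tp / (tp + fn)                          # 0.0: no malignant case is ever caught
print(accuracy, recall)
```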