Commit 8775121

more writing improvements in classfcn2 prec/rec
1 parent c99c18f commit 8775121

File tree

1 file changed (+5, -4 lines)


source/classification2.Rmd

Lines changed: 5 additions & 4 deletions
@@ -75,7 +75,7 @@ and the classifier will be asked to decide whether the tumor is benign or
 malignant. The key word here is *new*: our classifier is "good" if it provides
 accurate predictions on data *not seen during training*, as this implies that
 it has actually learned about the relationship between the predictor variables and response variable,
-as opposed to simply memorizing and regurgitating individual training data examples.
+as opposed to simply memorizing the labels of individual training data examples.
 But then, how can we evaluate our classifier without visiting the hospital to collect more
 tumor images?

@@ -142,9 +142,10 @@ $$\mathrm{accuracy} = \frac{\mathrm{number \; of \; correct \; predictions}}{\

 But we can also see that the classifier only identified 1 out of 4 total malignant
 tumors; in other words, it misclassified 75% of the malignant cases present in the
-data set! Since we are particularly interested in identifying malignant cases
-in this data analysis context, this classifier would likely be unacceptable
-even with an accuracy of 89%.
+data set! In this example, misclassifying a malignant tumor is a potentially
+disastrous error, since it may lead to a patient who requires treatment not receiving it.
+Since we are particularly interested in identifying malignant cases, this
+classifier would likely be unacceptable even with an accuracy of 89%.

 Focusing more on one label than the other is
 common in classification problems. In such cases, we typically refer to the label we are more
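The arithmetic behind the revised passage can be sketched as follows. This is not part of the commit; aside from the 1-of-4 malignant figure quoted in the text, the counts (100 tumors, 89 correct predictions) are hypothetical, chosen only so that accuracy comes out to 89%:

```python
def accuracy(correct, total):
    # Fraction of all predictions that are correct.
    return correct / total

def recall(true_pos, false_neg):
    # Fraction of actual positive (malignant) cases the classifier catches.
    return true_pos / (true_pos + false_neg)

# From the passage: the classifier identified only 1 of 4 malignant tumors,
# i.e. it missed 75% of the malignant cases.
print(recall(true_pos=1, false_neg=3))    # 0.25

# Hypothetical totals: 100 tumors with 89 predicted correctly gives 89%
# accuracy, even though most malignant cases were misclassified.
print(accuracy(correct=89, total=100))    # 0.89
```

This is the asymmetry the passage is pointing at: when one class is rare, overall accuracy can look strong while performance on the class we care about is poor.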
