diff --git a/public/img/machine-learning/metrics-01.svg b/public/img/machine-learning/metrics-01.svg
index eb53c44..0bc527f 100644
--- a/public/img/machine-learning/metrics-01.svg
+++ b/public/img/machine-learning/metrics-01.svg
@@ -1,16 +1,7040 @@
(Illustrated figure "PERFORMANCE METRICS - Classification"; its text content follows.)

Confusion Matrix

                   predicted SPAM    predicted NOT_SPAM
actual SPAM        True Positive     False Negative
actual NOT_SPAM    False Positive    True Negative

- True Positive: the model correctly classifies a SPAM e-mail as SPAM.
- False Negative: the model incorrectly classifies a SPAM e-mail as NOT_SPAM (a.k.a. type-II error, miss).
- False Positive: the model incorrectly classifies a NOT_SPAM e-mail as SPAM (a.k.a. type-I error, false alarm).
- True Negative: the model correctly classifies a NOT_SPAM e-mail as NOT_SPAM.

The confusion matrix is useful to calculate precision, recall, and accuracy.

Accuracy

Accuracy = (TP + TN) / (TP + FN + FP + TN)

The number of times our model is right overall. Accuracy has no value when:

- Classes are imbalanced. Example: Class A has 990 samples, Class B has 10 samples, and the model always predicts A; accuracy is 99%.
- Errors are not equally important. In the spam/not-spam problem, False Negatives are more tolerable than False Positives.

Possible solutions:

- Per-class accuracy: 1. Calculate the accuracy for each class. 2. Take the average of the C individual accuracy measures.
- Cost-sensitive accuracy: 1. Assign costs to FN and FP. 2. Take the weighted sum (with costs), instead of the classic sum, in the accuracy equation.

Neither is appropriate for a multiclass classification problem where many classes have very few examples: the accuracy values obtained from them will not be statistically reliable. Use Cohen's Kappa statistic instead!

F1-Score

The harmonic mean of precision and recall:

F1 = 2 * Precision * Recall / (Precision + Recall)

It can be parametrized with a positive real β:

Fβ = (1 + β²) * Precision * Recall / (β² * Precision + Recall)

Recall is considered β times more important than precision.

Cohen's Kappa Statistic

Applies to multiclass and imbalanced problems. It measures how much better your classification model is performing compared to a classifier that randomly guesses a class according to the frequency of each class:

κ = (observed agreement − expected agreement) / (1 − expected agreement)

observed agreement = (TP + TN) / (TP + FN + FP + TN)
expected agreement = [(TP + FN) * (TP + FP) + (FP + TN) * (FN + TN)] / (TP + FN + FP + TN)²

κ is always <= 1:
- [0, 0.6]: the model has a problem
- (0.6, 0.8]: the model is good
- (0.8, 1): the model is very good!

ROC Curve and AUC

The Receiver Operating Characteristic (ROC) curve is a probability curve obtained by plotting the True Positive Rate against the False Positive Rate:

True Positive Rate (a.k.a. recall) = TP / (TP + FN)
False Positive Rate = FP / (FP + TN)

Instructions:
1. Discretize the range of the scores.
2. Use each discrete value as the prediction threshold. Example: if the threshold is 0.7, apply the model to each example and get its score; if the score >= 0.7 the prediction is positive, otherwise it is negative.
3. The greater the Area Under the Curve (AUC), the better the classifier (e.g., AUC = 1: perfect classifier).

(The figure shows four example ROC curves with AUC 0.4, 0.5, 0.6, and 0.85.)

N.B. If AUC > 0.5, the model is better than a random classifier. In practice, try to get a True Positive Rate close to 1 while keeping a False Positive Rate close to 0.

The AUC measures the degree of separability of the two classes. The figure plots the probability distributions of Class 0 and Class 1 with a 0.5 threshold (above 0.5 a test point is predicted as positive, otherwise as negative):

- AUC = 1.0: ideal scenario; the model separates the classes without mistakes.
- AUC ∼ 0.7: the model makes some mistakes, increasing the number of False Positives and False Negatives.
- AUC ∼ 0.5: the two distributions largely overlap; the model cannot separate the classes (no better than random).
- AUC = 0.0: the model inverted the classes; the negative examples are predicted as positive and vice versa.

Precision & Recall

Precision = TP / (TP + FP): the number of positive-class predictions that actually belong to the positive class.
Recall (a.k.a. sensitivity, True Positive Rate) = TP / (TP + FN): the number of positive-class predictions made out of all positive examples in the dataset.

Use cases:
- YouTube recommendation. High precision: maximize the chance that every recommended video will be liked by the user. Low recall: there may be videos the model does not recommend that the user would like.
- Cancer detection. High recall: maximize the chance of detecting cancer should the patient have it. Low precision: the model may detect cancer even if the patient does not have it; let a doctor double-check it, better safe than sorry!

illustrated-machine-learning.github.io
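As a runnable companion to the figure's definitions, here is a minimal from-scratch sketch of the confusion-matrix counts and the metrics built on them (accuracy, precision, recall, Fβ). The labels and predictions are invented for illustration; 1 stands for SPAM (the positive class) and 0 for NOT_SPAM.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FN, FP, TN) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fn, fp, tn

def accuracy(tp, fn, fp, tn):
    return (tp + tn) / (tp + fn + fp + tn)

def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f_beta(prec, rec, beta=1.0):
    """F1 when beta == 1; recall weighted beta times more otherwise."""
    return (1 + beta**2) * prec * rec / (beta**2 * prec + rec)

# Toy data: 4 actual SPAM, 6 actual NOT_SPAM.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
print(tp, fn, fp, tn)            # 3 1 1 5
print(accuracy(tp, fn, fp, tn))  # 0.8
```

With these counts precision and recall are both 0.75, so F1 is 0.75 as well.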
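Cohen's Kappa as defined in the figure (observed vs. expected agreement, both from the 2x2 counts) fits in a few lines; the counts below are illustrative, not from any real dataset.

```python
def cohens_kappa(tp, fn, fp, tn):
    """Kappa for a binary confusion matrix."""
    n = tp + fn + fp + tn
    p_o = (tp + tn) / n  # observed agreement (i.e., plain accuracy)
    # Expected agreement of a classifier that guesses each class
    # according to its marginal frequency.
    p_e = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n**2
    return (p_o - p_e) / (1 - p_e)

k = cohens_kappa(tp=3, fn=1, fp=1, tn=5)
print(round(k, 3))  # 0.583
```

Note how an accuracy of 0.8 shrinks to a kappa of about 0.58 once chance agreement is discounted, which by the figure's scale lands at the low end of "good".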
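The figure's ROC instructions (discretize the scores, use each value as a prediction threshold, measure the area under the curve) can be sketched as follows. The scores, labels, and threshold grid are made up for illustration, and the AUC uses a simple trapezoidal rule over the collected (FPR, TPR) points.

```python
def roc_points(y_true, scores, thresholds):
    """One (FPR, TPR) point per threshold, sorted by FPR."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = []
    for thr in thresholds:
        preds = [1 if s >= thr else 0 for s in scores]
        tp = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 1)
        fp = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 1)
        points.append((fp / neg, tp / pos))
    return sorted(points)

def auc(points):
    """Trapezoidal area under the sorted ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
pts = roc_points(y_true, scores, thresholds=[1.1, 0.7, 0.5, 0.35, 0.0])
print(round(auc(pts), 3))  # 0.889
```

Including one threshold above every score and one at 0 guarantees the curve starts at (0, 0) and ends at (1, 1), so the trapezoids cover the full FPR range.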
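The figure's two accuracy fixes can be sketched as below for the binary case. The cost-weighted denominator is one plausible reading of "take the weighted sum (with costs) instead of the classic sum"; the figure does not pin down the exact equation, so treat it as an assumption.

```python
def per_class_accuracy(tp, fn, fp, tn):
    """Average the accuracy computed on each class separately."""
    acc_pos = tp / (tp + fn)  # accuracy restricted to actual positives
    acc_neg = tn / (tn + fp)  # accuracy restricted to actual negatives
    return (acc_pos + acc_neg) / 2

def cost_sensitive_accuracy(tp, fn, fp, tn, cost_fn=1.0, cost_fp=1.0):
    """Assumed form: weight the FN and FP counts by their costs."""
    return (tp + tn) / (tp + tn + cost_fn * fn + cost_fp * fp)

# Imbalanced example from the figure: class A has 990 samples, class B
# has 10, and the model always predicts A (A taken as the positive class).
tp, fn, fp, tn = 990, 0, 10, 0
print((tp + tn) / (tp + fn + fp + tn))     # plain accuracy: 0.99
print(per_class_accuracy(tp, fn, fp, tn))  # drops to 0.5
```

Per-class accuracy exposes the always-predict-A model immediately: it is perfect on class A and useless on class B, averaging to 0.5.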