changing seed for cv example in section 6.6.2

leem44 · leem44 · commit b95128709961 · 2022-02-24T17:11:00.000-08:00
diff --git a/classification2.Rmd b/classification2.Rmd
@@ -653,6 +653,11 @@ automatically. We set the `strata` argument to the categorical label variable
 (here, `Class`) to ensure that the training and validation subsets contain the
 right proportions of each category of observation.
 
+```{r 06-vfold-seed, echo = FALSE, warning = FALSE, message = FALSE}
+# hidden seed
+set.seed(14)
+```
+
 ```{r 06-vfold}
 cancer_vfold <- vfold_cv(cancer_train, v = 5, strata = Class)
 cancer_vfold
@@ -689,9 +694,9 @@ of the classifier's validation accuracy across the folds. You will find results
 related to the accuracy in the row with `accuracy` listed under the `.metric` column. 
 You should consider the mean (`mean`) to be the estimated accuracy, while the standard 
 error (`std_err`) is a measure of how uncertain we are in the mean value. A detailed treatment of this
-is beyond the scope of this chapter; but roughly, if your estimated mean is 0.88 and standard
-error is 0.02, you can expect the *true* average accuracy of the 
-classifier to be somewhere roughly between 86% and 90% (although it may
+is beyond the scope of this chapter; but roughly, if your estimated mean is `r round(filter(collect_metrics(knn_fit), .metric == "accuracy")$mean,2)` and standard
+error is `r round(filter(collect_metrics(knn_fit), .metric == "accuracy")$std_err,2)`, you can expect the *true* average accuracy of the 
+classifier to be somewhere between `r (round(filter(collect_metrics(knn_fit), .metric == "accuracy")$mean,2) - round(filter(collect_metrics(knn_fit), .metric == "accuracy")$std_err,2))*100`% and `r (round(filter(collect_metrics(knn_fit), .metric == "accuracy")$mean,2) + round(filter(collect_metrics(knn_fit), .metric == "accuracy")$std_err,2))*100`% (although it may
 fall outside this range). You may ignore the other columns in the metrics data frame,
 as they do not provide any additional insight.
 You can also ignore the entire second row with `roc_auc` in the `.metric` column,