@@ -712,10 +712,10 @@ accuracy estimate will be (lower standard error). However, we are limited
by computational power: the
more folds we choose, the more computation it takes, and hence the more time
it takes to run the analysis. So when you do cross-validation, you need to
- consider the size of the data, and the speed of the algorithm (e.g., $K$-nearest
- neighbor) and the speed of your computer. In practice, this is a
- trial-and-error process, but typically $C$ is chosen to be either 5 or 10. Here we use 10-fold cross-validation rather
- than 5-fold and we see we get a lower standard error:
+ consider the size of the data, the speed of the algorithm (e.g., $K$-nearest
+ neighbors), and the speed of your computer. In practice, this is a
+ trial-and-error process, but typically $C$ is chosen to be either 5 or 10. Here
+ we will try 10-fold cross-validation to see if we get a lower standard error:
``` {r 06-10-fold}
cancer_vfold <- vfold_cv(cancer_train, v = 10, strata = Class)
@@ -728,19 +728,19 @@ vfold_metrics <- workflow() |>
vfold_metrics
```
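
As a quick sketch (not part of the original text), the standard error can be pulled out of the `collect_metrics()` output above as a single number for a direct comparison with the 5-fold result; this assumes the tidyverse and tidymodels packages loaded earlier in the chapter are attached.

``` {r 06-10-fold-stderr-sketch}
# extract the standard error of the accuracy estimate from vfold_metrics
vfold_metrics |>
  filter(.metric == "accuracy") |>
  pull(std_err)
```
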
-
- Increasing the number of folds will usually result in a lower standard error, though this is
- not always the case. Due to random noise, sometimes we might get a higher value. In this example,
- the standard error decreased slightly, but not by a lot.
+ In this case, using 10-fold instead of 5-fold cross-validation did reduce the standard error, although
+ by only an insignificant amount. In fact, due to the randomness in how the data are split, sometimes
+ you might even end up with a *higher* standard error when increasing the number of folds!
+ We can make the reduction in standard error more dramatic by increasing the number of folds
+ by a large amount. In the following code we show the result when $C = 50$;
+ picking such a large number of folds often takes a long time to run in practice,
+ so we usually stick to 5 or 10.
``` {r 06-50-fold-seed, echo = FALSE, warning = FALSE, message = FALSE}
# hidden seed
set.seed(1)
```
- We can see
- how the standard error decreases by a more meaningful amount when we use 50-fold cross-validation rather
- than 5-fold or 10-fold:
``` {r 06-50-fold}
cancer_vfold_50 <- vfold_cv(cancer_train, v = 50, strata = Class)
@@ -753,8 +753,6 @@ vfold_metrics_50 <- workflow() |>
vfold_metrics_50
```
- In practice, we usually have a lot of data and setting $C$ to such a large number often takes a long time to run, so we usually stick to 5 or 10 folds.
-
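To make the fold-count trade-off concrete, the following sketch (not part of the original text) refits the classifier for a few values of $C$ and collects the accuracy standard error for each. It assumes `knn_spec` and `cancer_recipe` are the model specification and recipe created earlier in the chapter (adjust the names to match your own objects), and it takes noticeably longer to run as the number of folds grows.

``` {r 06-fold-comparison-sketch, warning = FALSE, message = FALSE}
library(tidymodels)  # attaches dplyr, purrr, rsample, tune, workflows, ...

set.seed(1)

# refit under C = 5, 10, and 50 folds and keep the accuracy row for each
fold_comparison <- map_dfr(c(5, 10, 50), function(v) {
  folds <- vfold_cv(cancer_train, v = v, strata = Class)
  workflow() |>
    add_recipe(cancer_recipe) |>   # assumed: recipe defined earlier
    add_model(knn_spec) |>         # assumed: K-NN spec defined earlier
    fit_resamples(resamples = folds) |>
    collect_metrics() |>
    filter(.metric == "accuracy") |>
    mutate(folds = v)
})

fold_comparison
```

The pattern in this comparison matches the discussion above: more folds generally means a smaller standard error, at the cost of more computation.
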
### Parameter value selection
Using 5- and 10-fold cross-validation, we have estimated that the prediction