clustering index

trevorcampbell · trevorcampbell · commit 948ef88c49b3 · 2023-11-16T12:38:48.000-08:00
diff --git a/source/clustering.Rmd b/source/clustering.Rmd
@@ -164,7 +164,7 @@ library(tidyverse)
 set.seed(1)
 ```
 
-Now we can load and preview the `penguins` data.
+Now we can load and preview the `penguins` data.\index{read function!read\_csv}
 
 ```{r message = FALSE, warning = FALSE}
 penguins <- read_csv("data/penguins.csv")
@@ -639,7 +639,7 @@ in the fourth iteration; both the centers and labels will remain the same from t
 
 ### Random restarts
 
-Unlike the classification and regression models we studied in previous chapters, K-means \index{K-means!restart, nstart} can get "stuck" in a bad solution.
+Unlike the classification and regression models we studied in previous chapters, K-means \index{K-means!restart} can get "stuck" in a bad solution.
 For example, Figure \@ref(fig:10-toy-kmeans-bad-init) illustrates an unlucky random initialization by K-means.
 
 ```{r 10-toy-kmeans-bad-init, echo = FALSE, warning = FALSE, message = FALSE, fig.height = 3.25, fig.width = 3.75, fig.pos = "H", out.extra="", fig.align = "center", fig.cap = "Random initialization of labels."}
@@ -910,7 +910,7 @@ set.seed(1)
 
 We can perform K-means clustering in R using a `tidymodels` workflow similar
 to those in the earlier classification and regression chapters.
-We will begin by loading the `tidyclust`\index{tidyclust} library, which contains the necessary
+We will begin by loading the `tidyclust`\index{K-means}\index{tidyclust} library, which contains the necessary
 functionality.
 ```{r, echo = TRUE, warning = FALSE, message = FALSE}
 library(tidyclust)
@@ -993,18 +993,6 @@ clustered_data <- kmeans_fit |>
 clustered_data
 ```
 
-<!--
-If for some reason we need access to just the cluster assignments,
-we can extract those from the fit as a data frame using
-the `extract_cluster_assignment` function. Note that in this case,
-the cluster assignments variable is named `.cluster`, while the `augment`
-function earlier creates a variable named `.pred_cluster`.
-
-```{r 10-kmeans-extract-clusterasgn}
-extract_cluster_assignment(kmeans_fit)
-```
--->
-
 Now that we have the cluster assignments included in the `clustered_data` tidy data frame, we can
 visualize them as shown in Figure \@ref(fig:10-plot-clusters-2).
 Note that we are plotting the *un-standardized* data here; if we for some reason wanted to