Commit 948ef88

clustering index
1 parent 6a7057a commit 948ef88

1 file changed: +3 -15 lines changed

source/clustering.Rmd

Lines changed: 3 additions & 15 deletions
@@ -164,7 +164,7 @@ library(tidyverse)
 set.seed(1)
 ```
 
-Now we can load and preview the `penguins` data.
+Now we can load and preview the `penguins` data.\index{read function!read\_csv}
 
 ```{r message = FALSE, warning = FALSE}
 penguins <- read_csv("data/penguins.csv")
@@ -639,7 +639,7 @@ in the fourth iteration; both the centers and labels will remain the same from t
 
 ### Random restarts
 
-Unlike the classification and regression models we studied in previous chapters, K-means \index{K-means!restart, nstart} can get "stuck" in a bad solution.
+Unlike the classification and regression models we studied in previous chapters, K-means \index{K-means!restart} can get "stuck" in a bad solution.
 For example, Figure \@ref(fig:10-toy-kmeans-bad-init) illustrates an unlucky random initialization by K-means.
 
 ```{r 10-toy-kmeans-bad-init, echo = FALSE, warning = FALSE, message = FALSE, fig.height = 3.25, fig.width = 3.75, fig.pos = "H", out.extra="", fig.align = "center", fig.cap = "Random initialization of labels."}
@@ -910,7 +910,7 @@ set.seed(1)
 
 We can perform K-means clustering in R using a `tidymodels` workflow similar
 to those in the earlier classification and regression chapters.
-We will begin by loading the `tidyclust`\index{tidyclust} library, which contains the necessary
+We will begin by loading the `tidyclust`\index{K-means}\index{tidyclust} library, which contains the necessary
 functionality.
 ```{r, echo = TRUE, warning = FALSE, message = FALSE}
 library(tidyclust)
@@ -993,18 +993,6 @@ clustered_data <- kmeans_fit |>
 clustered_data
 ```
 
-<!--
-If for some reason we need access to just the cluster assignments,
-we can extract those from the fit as a data frame using
-the `extract_cluster_assignment` function. Note that in this case,
-the cluster assignments variable is named `.cluster`, while the `augment`
-function earlier creates a variable named `.pred_cluster`.
-
-```{r 10-kmeans-extract-clusterasgn}
-extract_cluster_assignment(kmeans_fit)
-```
--->
-
 Now that we have the cluster assignments included in the `clustered_data` tidy data frame, we can
 visualize them as shown in Figure \@ref(fig:10-plot-clusters-2).
 Note that we are plotting the *un-standardized* data here; if we for some reason wanted to
