clustering.Rmd
Lines changed: 2 additions & 2 deletions
@@ -107,7 +107,7 @@ collected by [Dr. Kristen Gorman](https://www.uaf.edu/cfos/people/faculty/detail
 the [Palmer Station, Antarctica Long Term Ecological Research Site](https://pal.lternet.edu/), and includes
 measurements for adult penguins found near there [@palmerpenguins]. We have
 modified the data set for use in this chapter. Here we will focus on using two
-variables---penguin bill and flipper length, both in millimeters---to determine whether
+variables—penguin bill and flipper length, both in millimeters—to determine whether
 there are distinct types of penguins in our data.
 Understanding this might help us with species discovery and classification in a data-driven
 way.
@@ -332,7 +332,7 @@ base <- base +
 base
 ```
 
-The larger the value of $S^2$, the more spread-out the cluster is, since large $S^2$ means that points are far from the cluster center.
+The larger the value of $S^2$, the more spreadout the cluster is, since large $S^2$ means that points are far from the cluster center.
 Note, however, that "large" is relative to *both* the scale of the variables for clustering *and* the number of points in the cluster. A cluster where points are very close to the center might still have a large $S^2$ if there are many data points in the cluster.
 
 After we have calculated the WSSD for all the clusters,
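Reviewer note: the claim in the hunk above, that $S^2$ grows with the number of points as well as with their spread, can be checked with a small sketch. It assumes $S^2$ is the sum of squared straight-line distances from each point to the cluster center, as the context lines describe. The toy values, the column names, and the helper `wssd` are made up for illustration and are not taken from the chapter.

```r
# Hypothetical toy cluster (made-up values, not the penguin data).
cluster <- data.frame(
  bill_length = c(0, 1, 2),
  flipper_length = c(0, 2, 4)
)

# Sum of squared straight-line distances from each point to the cluster center.
# `wssd` is an illustrative helper name, not a function from the chapter.
wssd <- function(points) {
  center <- colMeans(points)                  # per-variable mean = cluster center
  sum(sweep(as.matrix(points), 2, center)^2)  # squared deviations, summed
}

s2_small <- wssd(cluster)

# Same center and same spread, but twice as many points.
s2_large <- wssd(rbind(cluster, cluster))

c(s2_small, s2_large)
```

Stacking the cluster on itself leaves the center and the typical distance to it unchanged, yet the second value is twice the first, which is the point the unchanged "Note, however" sentence makes.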
@@ -789,7 +789,7 @@ mean of the sample is \$`r round(estimates$sample_mean, 2)`.
 Remember, in practice, we usually only have this one sample from the population. So
 this sample and estimate are the only data we can work with.
 
-We now perform steps (1) - (5) listed above to generate a single bootstrap
+We now perform steps 1–5 listed above to generate a single bootstrap
 sample in R and calculate a point estimate from that bootstrap sample. We will
 use the `rep_sample_n` function as we did when we were
 creating our sampling distribution. But critically, note that we now
@@ -1173,4 +1173,4 @@ found in Chapter \@ref(move-to-your-own-machine).
 ## Additional resources
 
 - Chapters 7 to 10 of [*Modern Dive*](https://moderndive.com/) provide a great next step in learning about inference. In particular, Chapters 7 and 8 cover sampling and bootstrapping using `tidyverse` and `infer` in a slightly more in-depth manner than the present chapter. Chapters 9 and 10 take the next step beyond the scope of this chapter and begin to provide some of the initial mathematical underpinnings of inference and more advanced applications of the concept of inference in testing hypotheses and performing regression. This material offers a great starting point for getting more into the technical side of statistics.
-- Chapters 4 to 7 of [*OpenIntro Statistics - Fourth Edition*](https://www.openintro.org/) provide a good next step after *Modern Dive*. Although it is still certainly an introductory text, things get a bit more mathematical here. Depending on your background, you may actually want to start going through Chapters 1 to 3 first, where you will learn some fundamental concepts in probability theory. Although it may seem like a diversion, probability theory is *the language of statistics*; if you have a solid grasp of probability, more advanced statistics will come naturally to you!
+- Chapters 4 to 7 of [*OpenIntro Statistics*](https://www.openintro.org/) provide a good next step after *Modern Dive*. Although it is still certainly an introductory text, things get a bit more mathematical here. Depending on your background, you may actually want to start going through Chapters 1 to 3 first, where you will learn some fundamental concepts in probability theory. Although it may seem like a diversion, probability theory is *the language of statistics*; if you have a solid grasp of probability, more advanced statistics will come naturally to you!