
Commit 2e4d013

Merge pull request #2 from jl5000/master
Update 20-uml.Rmd
2 parents 6b78152 + f009b9a commit 2e4d013


20-uml.Rmd

Lines changed: 8 additions & 8 deletions
@@ -43,7 +43,7 @@ where
 
 > Challenge:
 >
-> - To learn about k-means, let's use the `iris` with the sepal and
+> - To learn about k-means, let's use the `iris` dataset with the sepal and
 > petal length variables only (to facilitate visualisation). Create
 > such a data matrix and name it `x`
 
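A minimal sketch of how the challenge data matrix could be built (illustrative, not part of this commit), selecting the two length variables from the built-in `iris` data:

```r
## Sepal and petal lengths only, kept as a numeric matrix for clustering
x <- as.matrix(iris[, c("Sepal.Length", "Petal.Length")])
head(x)
```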
@@ -63,7 +63,7 @@ cl <- kmeans(x, 3, nstart = 10)
 > - The actual results of the algorithms, i.e. the cluster membership
 > can be accessed in the `clusters` element of the clustering result
 > output. Use it to colour the inferred clusters to generate a figure
-> like shown below.
+> like that shown below.
 
 ```{r solkmplot, echo=FALSE, fig.cap = "k-means algorithm on sepal and petal lengths"}
 plot(x, col = cl$cluster)
@@ -139,7 +139,7 @@ a global minimum.
 
 > Challenge:
 >
-> Repeat kmeans on our `x` data multiple times, setting the number of
+> Repeat k-means on our `x` data multiple times, setting the number of
 > iterations to 1 or greater and check whether you repeatedly obtain
 > the same results. Try the same with random data of identical
 > dimensions.
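One possible way to run this repeated-runs check (illustrative, not part of this commit), assuming the `x` matrix from the earlier challenge; comparing the total within-cluster sum of squares across runs shows whether each run reached the same solution:

```r
set.seed(1)  ## seed chosen arbitrarily, only to make the illustration reproducible
## Several k-means runs, each with a single random start
res <- replicate(5, kmeans(x, centers = 3, nstart = 1)$tot.withinss)
res  ## differing values indicate convergence to different local optima
```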
@@ -203,13 +203,13 @@ plot(ks, tot_within_ss, type = "b")
 
 ### How does hierarchical clustering work
 
-**Initialisation**: Starts by assigning each of the n point its own cluster
+**Initialisation**: Starts by assigning each of the n points its own cluster
 
 **Iteration**
 
 1. Find the two nearest clusters, and join them together, leading to
    n-1 clusters
-2. Continue merging cluster process until all are grouped into a
+2. Continue the cluster merging process until all are grouped into a
    single cluster
 
 **Termination:** All observations are grouped within a single cluster.
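A minimal sketch of the agglomerative procedure described in this hunk (illustrative, not part of this commit), using base R's `dist` and `hclust` on the `x` matrix from the k-means challenge:

```r
d <- dist(x)        ## pairwise Euclidean distances between observations
hcl <- hclust(d)    ## repeatedly merge the two nearest clusters
plot(hcl)           ## dendrogram of the successive merges
cutree(hcl, k = 3)  ## cut the tree to recover, e.g., 3 clusters
```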
@@ -323,7 +323,7 @@ as well as supervised methods, as we will see in the next chapter.
 
 A typical way to pre-process the data prior to learning is to scale
 the data, or apply principal component analysis (next section). Scaling
-assures that all data columns have mean 0 and standard deviate 1.
+assures that all data columns have a mean of 0 and standard deviation of 1.
 
 In R, scaling is done with the `scale` function.
 
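An illustrative check of the scaling step (not part of this commit), assuming the same `x` matrix; after `scale`, each column has a mean of 0 and a standard deviation of 1:

```r
x_scaled <- scale(x)
round(colMeans(x_scaled), 10)  ## approximately 0 for every column
apply(x_scaled, 2, sd)         ## 1 for every column
```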
@@ -348,11 +348,11 @@ plot(hcl2, main = "scaled data")
 ## Principal component analysis (PCA)
 
 **Dimensionality reduction** techniques are widely used and versatile
-techniques that can be used o
+techniques that can be used to:
 
 - find structure in features
 - pre-processing for other ML algorithms, and
-- as an aid in visualisation.
+- aid in visualisation.
 
 The basic principle of dimensionality reduction techniques is to
 transform the data into a new space that summarise properties of the
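As a pointer to the PCA section introduced here, a small base R sketch with `prcomp` (illustrative, not part of this commit; using all four iris measurements is an assumption for the example):

```r
pca <- prcomp(iris[, 1:4], scale. = TRUE)  ## PCA on the scaled measurements
summary(pca)                               ## variance explained by each component
plot(pca$x[, 1:2], col = as.integer(iris$Species))  ## data in the new (PC) space
```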

0 commit comments