Commit cc80767

convert arrange/slice to slice_min in classification and regression
1 parent 9124b80 commit cc80767

2 files changed: +12 -7 lines changed

source/classification1.Rmd

Lines changed: 10 additions & 4 deletions
@@ -460,6 +460,14 @@ the $K=5$ neighbors that are nearest to our new point.
 You will see in the `mutate` \index{mutate} step below, we compute the straight-line
 distance using the formula above: we square the differences between the two observations' perimeter
 and concavity coordinates, add the squared differences, and then take the square root.
+In order to find the $K=5$ nearest neighbors, we will use the `slice_min` function.
+
+> **Note:** Recall that in Chapter \@ref(intro), we used `arrange` followed by `slice` to
+> obtain the ten rows with the *largest* values of a variable. We could have instead used
+> the `slice_max` function for this purpose. The `slice_min` and `slice_max` functions
+> achieve the same goal as `arrange` followed by `slice`, but are slightly more efficient
+> because they are specialized for this purpose. In general, it is good to use more specialized
+> functions when they are available!
 
 ```{r 05-multiknn-1, echo = FALSE, fig.height = 3.5, fig.width = 4.5, fig.pos = "H", out.extra="", fig.cap="Scatter plot of concavity versus perimeter with new observation represented as a red diamond."}
 perim_concav <- bind_rows(cancer,
@@ -499,8 +507,7 @@ cancer |>
   select(ID, Perimeter, Concavity, Class) |>
   mutate(dist_from_new = sqrt((Perimeter - new_obs_Perimeter)^2 +
                               (Concavity - new_obs_Concavity)^2)) |>
-  arrange(dist_from_new) |>
-  slice(1:5) # take the first 5 rows
+  slice_min(dist_from_new, n = 5) # take the 5 rows of minimum distance
 ```
 
 In Table \@ref(tab:05-multiknn-mathtable) we show in mathematical detail how
@@ -590,8 +597,7 @@ cancer |>
   mutate(dist_from_new = sqrt((Perimeter - new_obs_Perimeter)^2 +
                               (Concavity - new_obs_Concavity)^2 +
                               (Symmetry - new_obs_Symmetry)^2)) |>
-  arrange(dist_from_new) |>
-  slice(1:5) # take the first 5 rows
+  slice_min(dist_from_new, n = 5) # take the 5 rows of minimum distance
 ```
 
 Based on $K=5$ nearest neighbors with these three predictors, we would classify
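
The replacement made in the two `cancer` hunks of this file can be checked on a small example. The following is a minimal sketch and not part of the commit: the `toy` data frame, its values, and the choice of 3 neighbors are hypothetical stand-ins for `cancer` and $K=5$, chosen so that no two distances are tied.

```r
library(dplyr)

# Hypothetical toy data standing in for the `cancer` data frame
toy <- tibble(
  ID = 1:6,
  Perimeter = c(0.2, 1.5, -0.3, 2.1, 0.9, -1.2),
  Concavity = c(3.1, 2.0, 4.2, 0.5, 3.6, 1.1)
)

# Hypothetical coordinates of the new observation
new_obs_Perimeter <- 0
new_obs_Concavity <- 3.5

# Straight-line distance from each row to the new observation
with_dist <- toy |>
  mutate(dist_from_new = sqrt((Perimeter - new_obs_Perimeter)^2 +
                              (Concavity - new_obs_Concavity)^2))

# Old approach: sort by distance, then keep the first 3 rows
old_way <- with_dist |>
  arrange(dist_from_new) |>
  slice(1:3)

# New approach: ask directly for the 3 rows of minimum distance
new_way <- with_dist |>
  slice_min(dist_from_new, n = 3)

# With no tied distances, both approaches select the same observations
setequal(old_way$ID, new_way$ID)
```

Because the toy distances are all distinct, the two pipelines pick the same rows; tied values are the one case where they can differ.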

source/regression1.Rmd

Lines changed: 2 additions & 3 deletions
@@ -233,13 +233,12 @@ sale price might be.
 For the example shown in Figure \@ref(fig:07-small-eda-regr),
 we find and label the 5 nearest neighbors to our observation
 of a house that is 2,000 square feet.
-\index{mutate}\index{slice}\index{arrange}\index{abs}
+\index{mutate}\index{slice\_min}\index{abs}
 
 ```{r 07-find-k3}
 nearest_neighbors <- small_sacramento |>
   mutate(diff = abs(2000 - sqft)) |>
-  arrange(diff) |>
-  slice(1:5) #subset the first 5 rows
+  slice_min(diff, n = 5)
 
 nearest_neighbors
 ```
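
One behavioral difference is worth keeping in mind when reading this change: `slice_min` keeps ties by default (`with_ties = TRUE`), so it can return more than `n` rows, whereas `arrange` followed by `slice` always returns exactly the requested number. The sketch below illustrates this under stated assumptions: the `toy_houses` data frame and its `sqft` values are hypothetical and are not taken from `small_sacramento`.

```r
library(dplyr)

# Hypothetical square footages; two pairs are equally far from 2000
toy_houses <- tibble(sqft = c(1900, 2100, 1950, 2300, 2050)) |>
  mutate(diff = abs(2000 - sqft))

# Old approach: always exactly 3 rows
old_way <- toy_houses |>
  arrange(diff) |>
  slice(1:3)

# New approach with defaults: tied rows at the cutoff are all kept
new_default <- toy_houses |>
  slice_min(diff, n = 3)

# New approach with ties disabled: exactly 3 rows again
new_no_ties <- toy_houses |>
  slice_min(diff, n = 3, with_ties = FALSE)

nrow(old_way)      # 3
nrow(new_default)  # 4, because two rows tie at diff == 100
nrow(new_no_ties)  # 3
```

If exactly $K$ neighbors are required even when differences are tied, passing `with_ties = FALSE` is one way to recover the row count of the old `arrange` and `slice` pipeline.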
