Commit 11256f4

cls1 index
1 parent 948ef88 commit 11256f4

1 file changed: +4 -3 lines changed

source/classification1.Rmd

Lines changed: 4 additions & 3 deletions
@@ -1295,7 +1295,7 @@ upsampled_plot
 
 ### Missing data
 
-One of the most common issues in real data sets in the wild is *missing data*,
+One of the most common issues in real data sets in the wild is *missing data*,\index{missing data}
 i.e., observations where the values of some of the variables were not recorded.
 Unfortunately, as common as it is, handling missing data properly is very
 challenging and generally relies on expert knowledge about the data, setting,
@@ -1329,7 +1329,7 @@ data. So how can we perform K-nearest neighbors classification in the presence
 of missing data? Well, since there are not too many observations with missing
 entries, one option is to simply remove those observations prior to building
 the K-nearest neighbors classifier. We can accomplish this by using the
-`drop_na` function from `tidyverse` prior to working with the data.
+`drop_na` function from `tidyverse` prior to working with the data.\label{missing data!drop\_na}
 
 ```{r 05-naomit}
 no_missing_cancer <- missing_cancer |> drop_na()
@@ -1342,7 +1342,8 @@ possible approach is to *impute* the missing entries, i.e., fill in synthetic
 values based on the other observations in the data set. One reasonable choice
 is to perform *mean imputation*, where missing entries are filled in using the
 mean of the present entries in each variable. To perform mean imputation, we
-add the `step_impute_mean` step to the `tidymodels` preprocessing recipe.
+add the `step_impute_mean` \index{recipe!step\_impute\_mean}\index{missing data!mean imputation}
+step to the `tidymodels` preprocessing recipe.
 ```{r 05-impute, results=FALSE, message=FALSE, echo=TRUE}
 impute_missing_recipe <- recipe(Class ~ ., data = missing_cancer) |>
 step_impute_mean(all_predictors()) |>
