Commit f9649b1

fixxing spacing between words, pages etc
1 parent 31ee1eb commit f9649b1

1 file changed: +7 -6 lines changed

classification2.Rmd

Lines changed: 7 additions & 6 deletions
@@ -120,7 +120,7 @@ in the analysis, would we not get a different result each time?
 The trick is that in R—and other programming languages—randomness
 is not actually random! Instead, R uses a *random number generator* that
 produces a sequence of numbers that
-are completely determined by a \index{seed} \index{random seed|see{seed}}
+are completely determined by a\index{seed} \index{random seed|see{seed}}
 *seed value*. Once you set the seed value
 using the \index{seed!set.seed} `set.seed` function, everything after that point may *look* random,
 but is actually totally reproducible. As long as you pick the same seed
@@ -157,6 +157,7 @@ set.seed(1)
 random_numbers <- sample(0:9, 10, replace=TRUE)
 random_numbers

+set.seed(1)
 random_numbers <- sample(0:9, 10, replace=TRUE)
 random_numbers
 ```
@@ -170,6 +171,7 @@ set.seed(4235)
 random_numbers <- sample(0:9, 10, replace=TRUE)
 random_numbers

+set.seed(4235)
 random_numbers <- sample(0:9, 10, replace=TRUE)
 random_numbers
 ```
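
Note on the two `set.seed` hunks above: each adds a `set.seed` call before the second `sample` call in its chunk. The sketch below uses base R only, and the variable names are illustrative rather than taken from `classification2.Rmd`. It shows the behaviour the surrounding text describes: resetting the same seed reproduces the same draw, while drawing again without resetting continues the generator's sequence.

```r
# Illustrative sketch; variable names are hypothetical.
set.seed(1)
first_draw <- sample(0:9, 10, replace = TRUE)

# No reset here, so the generator simply continues its sequence.
continued_draw <- sample(0:9, 10, replace = TRUE)

# Resetting to the same seed restarts the sequence from the same point.
set.seed(1)
second_draw <- sample(0:9, 10, replace = TRUE)

identical(first_draw, second_draw)  # TRUE
```
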
@@ -323,7 +325,7 @@ our test data does not influence any aspect of our model training. Once we have
 created the standardization preprocessor, we can then apply it separately to both the
 training and test data sets.

-Fortunately, the `recipe` framework from `tidymodels` helps us handle \index{recipe}\index{recipe!step\_scale}\index{recipe!step\_center}
+Fortunately, the `recipe` framework from `tidymodels` helps us handle\index{recipe}\index{recipe!step\_scale}\index{recipe!step\_center}
 this properly. Below we construct and prepare the recipe using only the training
 data (due to `data = cancer_train` in the first line).
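
Note on the hunk above: the context lines describe building the standardization preprocessor from the training data only and then applying it to both sets. The sketch below shows that idea in base R rather than with the `recipe` framework the text uses. The toy data frame, its column names, and the split sizes are all made up for illustration.

```r
# Hypothetical toy data; values and column names are illustrative only.
set.seed(1)
toy <- data.frame(
  area = rnorm(20, mean = 650, sd = 350),
  smoothness = rnorm(20, mean = 0.1, sd = 0.01)
)
train_rows <- sample(nrow(toy), 15)
toy_train <- toy[train_rows, ]
toy_test <- toy[-train_rows, ]

# Compute centering and scaling statistics from the training rows only.
train_means <- colMeans(toy_train)
train_sds <- apply(toy_train, 2, sd)

# Apply the same training-derived statistics to both data sets,
# so the test rows never influence the preprocessing.
scaled_train <- scale(toy_train, center = train_means, scale = train_sds)
scaled_test <- scale(toy_test, center = train_means, scale = train_sds)

colMeans(scaled_train)
colMeans(scaled_test)
```

The scaled training columns have mean zero up to floating-point error. The scaled test columns generally do not, because they reuse the training statistics.
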

@@ -411,7 +413,6 @@ the table of predicted labels and correct labels, using the `conf_mat` function:
 ```{r 06-confusionmat}
 confusion <- cancer_test_predictions |>
   conf_mat(truth = Class, estimate = .pred_class)
-
 confusion
 ```
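
Note on the hunk above: the chunk tabulates predicted labels against correct labels. The sketch below does the same kind of cross-tabulation with base R's `table` instead of `conf_mat`. The label vectors are invented for illustration and are not results from the cancer data.

```r
# Hypothetical labels, made up for illustration.
truth <- factor(c("M", "B", "B", "M", "B", "M", "B", "B"), levels = c("M", "B"))
predicted <- factor(c("M", "B", "M", "M", "B", "B", "B", "B"), levels = c("M", "B"))

# In this table, rows are predictions and columns are the true labels.
confusion_table <- table(Prediction = predicted, Truth = truth)
confusion_table

# Accuracy is the share of observations on the diagonal.
accuracy <- sum(diag(confusion_table)) / sum(confusion_table)
accuracy  # 0.75
```
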

@@ -497,7 +498,7 @@ for the application.
 ## Tuning the classifier

 The vast majority of predictive models in statistics and machine learning have
-*parameters*. A *parameter* \index{parameter}\index{tuning parameter|see{parameter}}
+*parameters*. A *parameter*\index{parameter}\index{tuning parameter|see{parameter}}
 is a number you have to pick in advance that determines
 some aspect of how the model behaves. For example, in the $K$-nearest neighbors
 classification algorithm, $K$ is a parameter that we have to pick
@@ -663,7 +664,7 @@ cancer_vfold <- vfold_cv(cancer_train, v = 5, strata = Class)
 cancer_vfold
 ```

-Then, when we create our data analysis workflow, we use the `fit_resamples` function \index{cross-validation!fit\_resamples}\index{tidymodels!fit\_resamples}
+Then, when we create our data analysis workflow, we use the `fit_resamples` function\index{cross-validation!fit\_resamples}\index{tidymodels!fit\_resamples}
 instead of the `fit` function for training. This runs cross-validation on each
 train/validation split.

@@ -689,7 +690,7 @@ knn_fit <- workflow() |>
 knn_fit
 ```

-The `collect_metrics` \index{tidymodels!collect\_metrics}\index{cross-validation!collect\_metrics} function is used to aggregate the *mean* and *standard error*
+The `collect_metrics`\index{tidymodels!collect\_metrics}\index{cross-validation!collect\_metrics} function is used to aggregate the *mean* and *standard error*
 of the classifier's validation accuracy across the folds. You will find results
 related to the accuracy in the row with `accuracy` listed under the `.metric` column.
 You should consider the mean (`mean`) to be the estimated accuracy, while the standard
0 commit comments

Comments
 (0)