
Commit 1661d0e (1 parent: 6765dd3)

fixed one more fig in classification2

1 file changed: classification2.Rmd (17 additions, 17 deletions)
@@ -863,23 +863,6 @@ regardless of what the new observation looks like. In general, if the model
 *isn't influenced enough* by the training data, it is said to **underfit** the
 data.
 
-**Overfitting:** \index{overfitting!classification} In contrast, when we decrease the number of neighbors, each
-individual data point has a stronger and stronger vote regarding nearby points.
-Since the data themselves are noisy, this causes a more "jagged" boundary
-corresponding to a *less simple* model. If you take this case to the extreme,
-setting $K = 1$, then the classifier is essentially just matching each new
-observation to its closest neighbor in the training data set. This is just as
-problematic as the large $K$ case, because the classifier becomes unreliable on
-new data: if we had a different training set, the predictions would be
-completely different. In general, if the model *is influenced too much* by the
-training data, it is said to **overfit** the data.
-
-Both overfitting and underfitting are problematic and will lead to a model
-that does not generalize well to new data. When fitting a model, we need to strike
-a balance between the two. You can see these two effects in Figure
-\@ref(fig:06-decision-grid-K), which shows how the classifier changes as
-we set the number of neighbors $K$ to 1, 7, 20, and 300.
-
 ```{r 06-decision-grid-K, echo = FALSE, message = FALSE, fig.height = 10, fig.width = 10, fig.pos = "H", out.extra="", fig.cap = "Effect of K in overfitting and underfitting."}
 ks <- c(1, 7, 20, 300)
 plots <- list()
@@ -935,6 +918,23 @@ p_grid <- plot_grid(plotlist = p_no_legend, ncol = 2)
 plot_grid(p_grid, legend, ncol = 1, rel_heights = c(1, 0.2))
 ```
 
+**Overfitting:** \index{overfitting!classification} In contrast, when we decrease the number of neighbors, each
+individual data point has a stronger and stronger vote regarding nearby points.
+Since the data themselves are noisy, this causes a more "jagged" boundary
+corresponding to a *less simple* model. If you take this case to the extreme,
+setting $K = 1$, then the classifier is essentially just matching each new
+observation to its closest neighbor in the training data set. This is just as
+problematic as the large $K$ case, because the classifier becomes unreliable on
+new data: if we had a different training set, the predictions would be
+completely different. In general, if the model *is influenced too much* by the
+training data, it is said to **overfit** the data.
+
+Both overfitting and underfitting are problematic and will lead to a model
+that does not generalize well to new data. When fitting a model, we need to strike
+a balance between the two. You can see these two effects in Figure
+\@ref(fig:06-decision-grid-K), which shows how the classifier changes as
+we set the number of neighbors $K$ to 1, 7, 20, and 300.
+
 ## Summary
 
 Classification algorithms use one or more quantitative variables to predict the
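The relocated paragraphs describe how the choice of $K$ drives the underfitting/overfitting trade-off visualized by the `06-decision-grid-K` chunk. For readers following along outside the book source, below is a minimal sketch of fitting K-nearest neighbors classifiers at the same four values of $K$ with tidymodels; the `cancer_train` data frame and the `Class ~ Smoothness + Concavity` formula are illustrative assumptions, not taken from this diff.

```r
# A minimal sketch (not from this commit): fit KNN classifiers at the four
# values of K used in the 06-decision-grid-K chunk. `cancer_train` and the
# formula `Class ~ Smoothness + Concavity` are assumed for illustration.
library(tidymodels)

ks <- c(1, 7, 20, 300)

knn_fits <- lapply(ks, function(k) {
  # Standardize the predictors, since KNN is sensitive to variable scale.
  knn_recipe <- recipe(Class ~ Smoothness + Concavity, data = cancer_train) |>
    step_scale(all_predictors()) |>
    step_center(all_predictors())

  # K controls the trade-off: K = 1 tends to overfit (jagged boundary),
  # K = 300 tends to underfit (overly smooth boundary).
  knn_spec <- nearest_neighbor(weight_func = "rectangular", neighbors = k) |>
    set_engine("kknn") |>
    set_mode("classification")

  workflow() |>
    add_recipe(knn_recipe) |>
    add_model(knn_spec) |>
    fit(data = cancer_train)
})
```

Plotting the predicted class over a grid of predictor values for each fit (as the chunk in this diff does) makes the jagged $K = 1$ boundary and the overly smooth $K = 300$ boundary easy to compare.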
