@@ -1332,13 +1332,13 @@ Best subset selection is applicable to any classification method ($K$-NN or othe
However, it becomes very slow when you have even a moderate
number of predictors to choose from (say, around 10). This is because the number of possible predictor subsets
grows very quickly with the number of predictors, and you have to train the model (itself
- a slow process!) for each one. For example, if we have $2$ predictors&mdash;let's call
+ a slow process!) for each one. For example, if we have 2 predictors&mdash;let's call
them A and B&mdash;then we have 3 variable sets to try: A alone, B alone, and finally A
- and B together. If we have $3$ predictors&mdash;A, B, and C&mdash;then we have 7
+ and B together. If we have 3 predictors&mdash;A, B, and C&mdash;then we have 7
to try: A, B, C, AB, BC, AC, and ABC. In general, the number of models
we have to train for $m$ predictors is $2^m-1$; in other words, when we
- get to $10$ predictors we have over *one thousand* models to train, and
- at $20$ predictors we have over *one million* models to train!
+ get to 10 predictors we have over *one thousand* models to train, and
+ at 20 predictors we have over *one million* models to train!
So although it is a simple method, best subset selection is usually too computationally
expensive to use in practice.
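As a sanity check on the $2^m-1$ count in the hunk above, the non-empty predictor subsets can be enumerated directly by brute force. This is an illustrative sketch, not part of the original text; the helper name `n_subsets` is made up here.

```python
from itertools import combinations

def n_subsets(m):
    """Count every non-empty subset of m predictors by explicit enumeration."""
    return sum(1 for k in range(1, m + 1)
               for _ in combinations(range(m), k))

# Matches the closed form 2^m - 1 quoted in the text:
for m in (2, 3, 10, 20):
    assert n_subsets(m) == 2**m - 1

print(n_subsets(10))  # 1023 -- "over one thousand" models at 10 predictors
print(n_subsets(20))  # 1048575 -- "over one million" at 20 predictors
```

The brute-force enumeration is exactly what makes best subset selection impractical: the loop itself already takes over a million iterations at $m=20$, and each iteration stands in for training an entire model.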
@@ -1360,8 +1360,8 @@ This pattern continues for as many iterations as you want. If you run the method
all the way until you run out of predictors to choose, you will end up training
$\frac{1}{2}m(m+1)$ separate models. This is a *big* improvement from the $2^m-1$
models that best subset selection requires you to train! For example, while best subset selection requires
- training over 1000 candidate models with $m=10$ predictors, forward selection requires training only 55 candidate models.
- Therefore we will continue the rest of this section using forward selection.
+ training over 1000 candidate models with 10 predictors, forward selection requires training only 55 candidate models.
+ Therefore we will continue the rest of this section using forward selection.
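To make the 55-model figure above concrete, here is a small hedged sketch (assumed Python, not from the book) that tallies the models forward selection trains: the first round tries all $m$ predictors, the next tries the $m-1$ remaining, and so on.

```python
def forward_selection_model_count(m):
    """Total models trained by forward selection with m predictors:
    round i tries each of the m - i predictors not yet selected."""
    return sum(m - i for i in range(m))

assert forward_selection_model_count(10) == 55            # vs 1023 for best subset
assert forward_selection_model_count(10) == 10 * 11 // 2  # closed form m(m+1)/2
```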
> **Note:** One word of caution before we move on. Every additional model that you train
> increases the likelihood that you will get unlucky and stumble