|
84 | 84 | "text": [ |
85 | 85 | "5-fold cross validation scores:\n", |
86 | 86 | "\n", |
87 | | - "R^2 Score: 0.45 (+/- 0.29) [SVM]\n", |
| 87 | + "R^2 Score: 0.46 (+/- 0.29) [SVM]\n", |
88 | 88 | "R^2 Score: 0.43 (+/- 0.14) [Lasso]\n", |
89 | | - "R^2 Score: 0.52 (+/- 0.28) [Random Forest]\n", |
90 | | - "R^2 Score: 0.58 (+/- 0.24) [StackingCVRegressor]\n" |
| 89 | + "R^2 Score: 0.53 (+/- 0.28) [Random Forest]\n", |
| 90 | + "R^2 Score: 0.58 (+/- 0.23) [StackingCVRegressor]\n" |
91 | 91 | ] |
92 | 92 | } |
93 | 93 | ], |
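The cell that produces these R^2 lines sits outside this hunk. A minimal sketch of such an evaluation loop, assuming SVM/Lasso/Random Forest base regressors and a Lasso meta-regressor as in the rest of this notebook (exact estimator settings are assumptions, not the notebook's own cell):

    # Hedged sketch: 5-fold CV evaluation of the base regressors vs. the stack.
    from mlxtend.regressor import StackingCVRegressor
    from sklearn.datasets import load_boston
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.linear_model import Lasso
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVR

    RANDOM_SEED = 42
    X, y = load_boston(return_X_y=True)

    svr = SVR(kernel='linear')
    lasso = Lasso()
    rf = RandomForestRegressor(n_estimators=10, random_state=RANDOM_SEED)
    stack = StackingCVRegressor(regressors=(svr, lasso, rf),
                                meta_regressor=lasso)

    print('5-fold cross validation scores:\n')
    for clf, label in zip([svr, lasso, rf, stack],
                          ['SVM', 'Lasso', 'Random Forest',
                           'StackingCVRegressor']):
        scores = cross_val_score(clf, X, y, cv=5, scoring='r2')
        print("R^2 Score: %0.2f (+/- %0.2f) [%s]"
              % (scores.mean(), scores.std(), label))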
|
138 | 138 | "text": [ |
139 | 139 | "5-fold cross validation scores:\n", |
140 | 140 | "\n", |
141 | | - "Neg. MSE Score: -33.69 (+/- 22.36) [SVM]\n", |
| 141 | + "Neg. MSE Score: -33.34 (+/- 22.36) [SVM]\n", |
142 | 142 | "Neg. MSE Score: -35.53 (+/- 16.99) [Lasso]\n", |
143 | | - "Neg. MSE Score: -27.32 (+/- 16.62) [Random Forest]\n", |
144 | | - "Neg. MSE Score: -25.64 (+/- 18.11) [StackingCVRegressor]\n" |
| 143 | + "Neg. MSE Score: -27.25 (+/- 16.76) [Random Forest]\n", |
| 144 | + "Neg. MSE Score: -25.56 (+/- 18.22) [StackingCVRegressor]\n" |
145 | 145 | ] |
146 | 146 | } |
147 | 147 | ], |
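The negative-MSE figures above follow the same pattern; only the scorer changes. A short continuation of the sketch after the first hunk, reusing its `stack`, `X`, and `y`:

    # Same loop, with the scorer switched to negative mean squared error.
    scores = cross_val_score(stack, X, y, cv=5,
                             scoring='neg_mean_squared_error')
    print("Neg. MSE Score: %0.2f (+/- %0.2f) [StackingCVRegressor]"
          % (scores.mean(), scores.std()))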
|
177 | 177 | "source": [ |
178 | 178 | "In this second example we demonstrate how `StackingCVRegressor` works in combination with `GridSearchCV`. The stack still allows tuning hyper parameters of the base and meta models!\n", |
179 | 179 | "\n", |
180 | | - "To set up a parameter grid for scikit-learn's `GridSearch`, we simply provide the estimator's names in the parameter grid -- in the special case of the meta-regressor, we append the `'meta-'` prefix.\n" |
| 180 | + "For instance, we can use `estimator.get_params().keys()` to get a full list of tunable parameters.\n" |
181 | 181 | ] |
182 | 182 | }, |
183 | 183 | { |
184 | 184 | "cell_type": "code", |
185 | 185 | "execution_count": 3, |
186 | 186 | "metadata": {}, |
187 | 187 | "outputs": [ |
| 188 | + { |
| 189 | + "name": "stderr", |
| 190 | + "output_type": "stream", |
| 191 | + "text": [ |
| 192 | + "/Users/guq/miniconda3/envs/python3/lib/python3.7/site-packages/sklearn/model_selection/_search.py:841: DeprecationWarning: The default of the `iid` parameter will change from True to False in version 0.22 and will be removed in 0.24. This will change numeric results when test-set sizes are unequal.\n", |
| 193 | + " DeprecationWarning)\n" |
| 194 | + ] |
| 195 | + }, |
188 | 196 | { |
189 | 197 | "name": "stdout", |
190 | 198 | "output_type": "stream", |
191 | 199 | "text": [ |
192 | | - "Best: 0.673590 using {'lasso__alpha': 0.4, 'meta-randomforestregressor__n_estimators': 10, 'ridge__alpha': 0.3}\n" |
| 200 | + "Best: 0.674237 using {'lasso__alpha': 1.6, 'meta_regressor__n_estimators': 100, 'ridge__alpha': 0.2}\n" |
193 | 201 | ] |
194 | 202 | } |
195 | 203 | ], |
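A quick way to confirm the parameter names used in the grid (e.g. `lasso__alpha`, `ridge__alpha`, `meta_regressor__n_estimators`) is the inspection call mentioned in the markdown cell above; a minimal sketch, assuming `stack` is the `StackingCVRegressor` instance defined in the next cell:

    # List every tunable parameter name exposed by the stack; these names can
    # be used directly as keys in a GridSearchCV param_grid.
    print(sorted(stack.get_params().keys()))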
|
203 | 211 | "\n", |
204 | 212 | "X, y = load_boston(return_X_y=True)\n", |
205 | 213 | "\n", |
206 | | - "ridge = Ridge()\n", |
207 | | - "lasso = Lasso()\n", |
| 214 | + "ridge = Ridge(random_state=RANDOM_SEED)\n", |
| 215 | + "lasso = Lasso(random_state=RANDOM_SEED)\n", |
208 | 216 | "rf = RandomForestRegressor(random_state=RANDOM_SEED)\n", |
209 | 217 | "\n", |
210 | 218 | "# The StackingCVRegressor uses scikit-learn's check_cv\n", |
|
224 | 232 | " param_grid={\n", |
225 | 233 | " 'lasso__alpha': [x/5.0 for x in range(1, 10)],\n", |
226 | 234 | " 'ridge__alpha': [x/20.0 for x in range(1, 10)],\n", |
227 | | - " 'meta-randomforestregressor__n_estimators': [10, 100]\n", |
| 235 | + " 'meta_regressor__n_estimators': [10, 100]\n", |
228 | 236 | " }, \n", |
229 | 237 | " cv=5,\n", |
230 | 238 | " refit=True\n", |
|
244 | 252 | "name": "stdout", |
245 | 253 | "output_type": "stream", |
246 | 254 | "text": [ |
247 | | - "0.622 +/- 0.10 {'lasso__alpha': 0.2, 'meta-randomforestregressor__n_estimators': 10, 'ridge__alpha': 0.05}\n", |
248 | | - "0.649 +/- 0.09 {'lasso__alpha': 0.2, 'meta-randomforestregressor__n_estimators': 10, 'ridge__alpha': 0.1}\n", |
249 | | - "0.650 +/- 0.09 {'lasso__alpha': 0.2, 'meta-randomforestregressor__n_estimators': 10, 'ridge__alpha': 0.15}\n", |
250 | | - "0.667 +/- 0.09 {'lasso__alpha': 0.2, 'meta-randomforestregressor__n_estimators': 10, 'ridge__alpha': 0.2}\n", |
251 | | - "0.629 +/- 0.09 {'lasso__alpha': 0.2, 'meta-randomforestregressor__n_estimators': 10, 'ridge__alpha': 0.25}\n", |
252 | | - "0.663 +/- 0.08 {'lasso__alpha': 0.2, 'meta-randomforestregressor__n_estimators': 10, 'ridge__alpha': 0.3}\n", |
253 | | - "0.633 +/- 0.08 {'lasso__alpha': 0.2, 'meta-randomforestregressor__n_estimators': 10, 'ridge__alpha': 0.35}\n", |
254 | | - "0.637 +/- 0.08 {'lasso__alpha': 0.2, 'meta-randomforestregressor__n_estimators': 10, 'ridge__alpha': 0.4}\n", |
255 | | - "0.649 +/- 0.09 {'lasso__alpha': 0.2, 'meta-randomforestregressor__n_estimators': 10, 'ridge__alpha': 0.45}\n", |
256 | | - "0.653 +/- 0.09 {'lasso__alpha': 0.2, 'meta-randomforestregressor__n_estimators': 100, 'ridge__alpha': 0.05}\n", |
257 | | - "0.648 +/- 0.09 {'lasso__alpha': 0.2, 'meta-randomforestregressor__n_estimators': 100, 'ridge__alpha': 0.1}\n", |
258 | | - "0.645 +/- 0.09 {'lasso__alpha': 0.2, 'meta-randomforestregressor__n_estimators': 100, 'ridge__alpha': 0.15}\n", |
| 255 | + "0.616 +/- 0.09 {'lasso__alpha': 0.2, 'meta_regressor__n_estimators': 10, 'ridge__alpha': 0.05}\n", |
| 256 | + "0.656 +/- 0.08 {'lasso__alpha': 0.2, 'meta_regressor__n_estimators': 10, 'ridge__alpha': 0.1}\n", |
| 257 | + "0.653 +/- 0.09 {'lasso__alpha': 0.2, 'meta_regressor__n_estimators': 10, 'ridge__alpha': 0.15}\n", |
| 258 | + "0.669 +/- 0.09 {'lasso__alpha': 0.2, 'meta_regressor__n_estimators': 10, 'ridge__alpha': 0.2}\n", |
| 259 | + "0.632 +/- 0.08 {'lasso__alpha': 0.2, 'meta_regressor__n_estimators': 10, 'ridge__alpha': 0.25}\n", |
| 260 | + "0.664 +/- 0.08 {'lasso__alpha': 0.2, 'meta_regressor__n_estimators': 10, 'ridge__alpha': 0.3}\n", |
| 261 | + "0.632 +/- 0.08 {'lasso__alpha': 0.2, 'meta_regressor__n_estimators': 10, 'ridge__alpha': 0.35}\n", |
| 262 | + "0.642 +/- 0.08 {'lasso__alpha': 0.2, 'meta_regressor__n_estimators': 10, 'ridge__alpha': 0.4}\n", |
| 263 | + "0.653 +/- 0.09 {'lasso__alpha': 0.2, 'meta_regressor__n_estimators': 10, 'ridge__alpha': 0.45}\n", |
| 264 | + "0.657 +/- 0.09 {'lasso__alpha': 0.2, 'meta_regressor__n_estimators': 100, 'ridge__alpha': 0.05}\n", |
| 265 | + "0.650 +/- 0.09 {'lasso__alpha': 0.2, 'meta_regressor__n_estimators': 100, 'ridge__alpha': 0.1}\n", |
| 266 | + "0.648 +/- 0.09 {'lasso__alpha': 0.2, 'meta_regressor__n_estimators': 100, 'ridge__alpha': 0.15}\n", |
259 | 267 | "...\n", |
260 | | - "Best parameters: {'lasso__alpha': 0.4, 'meta-randomforestregressor__n_estimators': 10, 'ridge__alpha': 0.3}\n", |
| 268 | + "Best parameters: {'lasso__alpha': 1.6, 'meta_regressor__n_estimators': 100, 'ridge__alpha': 0.2}\n", |
261 | 269 | "Accuracy: 0.67\n" |
262 | 270 | ] |
263 | 271 | } |
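The per-setting lines above are typically produced by iterating over the fitted search's `cv_results_`; the exact reporting cell is outside this hunk, so the following is a hedged sketch assuming the fitted `GridSearchCV` object is named `grid`:

    # Report the mean/std test score for every parameter combination tried.
    cv_keys = ('mean_test_score', 'std_test_score', 'params')
    for r, _ in enumerate(grid.cv_results_['mean_test_score']):
        print("%0.3f +/- %0.2f %r"
              % (grid.cv_results_[cv_keys[0]][r],
                 grid.cv_results_[cv_keys[1]][r],
                 grid.cv_results_[cv_keys[2]][r]))

    print('Best parameters: %s' % grid.best_params_)
    print('Accuracy: %.2f' % grid.best_score_)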
|
284 | 292 | "source": [ |
285 | 293 | "**Note**\n", |
286 | 294 | "\n", |
287 | | - "The `StackingCVRegressor` also enables grid search over the `regressors` argument. However, due to the current implementation of `GridSearchCV` in scikit-learn, it is not possible to search over both, different regressors and regressor parameters at the same time. For instance, while the following parameter dictionary works\n", |
| 295 | + "The `StackingCVRegressor` also enables grid search over the `regressors` and even a single base regressor. When there are level-mixed hyperparameters, `GridSearchCV` will try to replace hyperparameters in a top-down order, i.e., `regressors` -> single base regressor -> regressor hyperparameter. For instance, given a hyperparameter grid such as\n", |
288 | 296 | "\n", |
289 | 297 | " params = {'randomforestregressor__n_estimators': [1, 100],\n", |
290 | 298 | " 'regressors': [(regr1, regr1, regr1), (regr2, regr3)]}\n", |
291 | 299 | " \n", |
292 | | - "it will use the instance settings of `regr1`, `regr2`, and `regr3` and not overwrite it with the `'n_estimators'` settings from `'randomforestregressor__n_estimators': [1, 100]`." |
| 300 | + "it will first use the instance settings of either `(regr1, regr2, regr3)` or `(regr2, regr3)` . Then it will replace the `'n_estimators'` settings for a matching regressor based on `'randomforestregressor__n_estimators': [1, 100]`." |
293 | 301 | ] |
294 | 302 | }, |
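A hedged sketch of such a level-mixed search (not the notebook's own cell): to keep every candidate valid, each `regressors` tuple below contains a `RandomForestRegressor`, so the `randomforestregressor__` key always has a matching member.

    # Hypothetical level-mixed grid: GridSearchCV first swaps in one of the
    # regressor tuples, then overrides n_estimators on the RandomForestRegressor
    # member for each candidate value.
    from mlxtend.regressor import StackingCVRegressor
    from sklearn.datasets import load_boston
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.linear_model import Lasso, Ridge
    from sklearn.model_selection import GridSearchCV

    X, y = load_boston(return_X_y=True)

    regr1 = RandomForestRegressor(random_state=1)
    regr2 = Ridge()
    regr3 = Lasso()

    stack = StackingCVRegressor(regressors=(regr1, regr2, regr3),
                                meta_regressor=Ridge())

    params = {'randomforestregressor__n_estimators': [10, 100],
              'regressors': [(regr1, regr2, regr3), (regr1, regr3)]}

    grid = GridSearchCV(estimator=stack, param_grid=params, cv=5, refit=True)
    grid.fit(X, y)
    print(grid.best_params_)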
295 | 303 | { |
|
605 | 613 | "name": "python", |
606 | 614 | "nbconvert_exporter": "python", |
607 | 615 | "pygments_lexer": "ipython3", |
608 | | - "version": "3.6.6" |
| 616 | + "version": "3.7.1" |
609 | 617 | }, |
610 | 618 | "toc": { |
611 | 619 | "nav_menu": {}, |
|