diff --git a/notebooks/datasets_adult_census.ipynb b/notebooks/datasets_adult_census.ipynb index 139287829..ae274ebf5 100644 --- a/notebooks/datasets_adult_census.ipynb +++ b/notebooks/datasets_adult_census.ipynb @@ -105,7 +105,7 @@ " dimensions=plot_list,\n", " )\n", ")\n", - "fig.show()" + "fig.show(renderer=\"notebook\")" ] }, { diff --git a/notebooks/linear_models_feature_engineering_classification.ipynb b/notebooks/linear_models_feature_engineering_classification.ipynb index 6781ef734..5043c40ad 100644 --- a/notebooks/linear_models_feature_engineering_classification.ipynb +++ b/notebooks/linear_models_feature_engineering_classification.ipynb @@ -641,7 +641,7 @@ "- Transformers such as `KBinsDiscretizer` and `SplineTransformer` can be used\n", " to engineer non-linear features independently for each original feature.\n", "- As a result, these transformers cannot capture interactions between the\n", - " orignal features (and then would fail on the XOR classification task).\n", + " original features (and then would fail on the XOR classification task).\n", "- Despite this limitation they already augment the expressivity of the\n", " pipeline, which can be sufficient for some datasets.\n", "- They also favor axis-aligned decision boundaries, in particular in the low\n", diff --git a/notebooks/parameter_tuning_grid_search.ipynb b/notebooks/parameter_tuning_grid_search.ipynb index 7f0e0f61f..a7fc56994 100644 --- a/notebooks/parameter_tuning_grid_search.ipynb +++ b/notebooks/parameter_tuning_grid_search.ipynb @@ -198,29 +198,33 @@ "source": [ "## Tuning using a grid-search\n", "\n", - "In the previous exercise we used one `for` loop for each hyperparameter to\n", - "find the best combination over a fixed grid of values. 
`GridSearchCV` is a\n", - "scikit-learn class that implements a very similar logic with less repetitive\n", - "code.\n", + "In the previous exercise (M3.01) we used two nested `for` loops (one for each\n", + "hyperparameter) to test different combinations over a fixed grid of\n", + "hyperparameter values. In each iteration of the loop, we used\n", + "`cross_val_score` to compute the mean score (averaged across\n", + "cross-validation splits), and compared those mean scores to select the best\n", + "combination. `GridSearchCV` is a scikit-learn class that implements a very\n", + "similar logic with less repetitive code. The suffix `CV` refers to the\n", + "cross-validation it runs internally (instead of the `cross_val_score` we\n", + "hard-coded).\n", + "\n", + "The `GridSearchCV` estimator takes a `param_grid` parameter which defines all\n", + "hyperparameters and their associated values. The grid-search is in charge of\n", + "creating all possible combinations and testing them.\n", + "\n", + "The number of combinations is equal to the product of the number of values to\n", + "explore for each parameter. Thus, adding new parameters with their associated\n", + "values to be explored rapidly becomes computationally expensive. Because of\n", + "that, here we only explore combinations of the learning-rate and the maximum\n", + "number of nodes, for a total of 4 x 3 = 12 combinations.\n", "\n", - "Let's see how to use the `GridSearchCV` estimator for doing such search. Since\n", - "the grid-search is costly, we only explore the combination learning-rate and\n", - "the maximum number of nodes."
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ "%%time\n", "from sklearn.model_selection import GridSearchCV\n", "\n", "param_grid = {\n", - " \"classifier__learning_rate\": (0.01, 0.1, 1, 10),\n", - " \"classifier__max_leaf_nodes\": (3, 10, 30),\n", - "}\n", + " \"classifier__learning_rate\": (0.01, 0.1, 1, 10), # 4 possible values\n", + " \"classifier__max_leaf_nodes\": (3, 10, 30), # 3 possible values\n", + "} # 12 unique combinations\n", "model_grid_search = GridSearchCV(model, param_grid=param_grid, n_jobs=2, cv=2)\n", "model_grid_search.fit(data_train, target_train)" ] @@ -229,7 +233,8 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Finally, we check the accuracy of our model using the test set." + "You can access the best combination of hyperparameters found by the grid\n", + "search using the `best_params_` attribute." ] }, { @@ -238,46 +243,19 @@ "metadata": {}, "outputs": [], "source": [ - "accuracy = model_grid_search.score(data_test, target_test)\n", - "print(\n", - " f\"The test accuracy score of the grid-searched pipeline is: {accuracy:.2f}\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "
Warning
\n", - "Be aware that the evaluation should normally be performed through\n", - "cross-validation by providing model_grid_search as a model to the\n", - "cross_validate function.
\n", - "Here, we used a single train-test split to evaluate model_grid_search. In\n", - "a future notebook will go into more detail about nested cross-validation, when\n", - "you use cross-validation both for hyperparameter tuning and model evaluation.
\n", - "Note
\n", + "This figure shows the particular case of K-fold cross-validation strategy\n", + "using n_splits=5 to further split the train set coming from a train-test\n", + "split. For each cross-validation split, the procedure trains a model on all\n", + "the red samples, evaluates the score of a given set of hyperparameters on the\n", + "green samples. The best combination of hyperparameters best_params is selected\n", + "based on those intermediate scores.
\n", + "Then a final model is refitted using best_params on the concatenation of the\n", + "red and green samples and evaluated on the blue samples.
\n", + "The green samples are sometimes referred as the validation set to\n", + "differentiate them from the final test set in blue.
\n", + "