MTN Heat map explanation #833

Merged
ArturoAmorQ merged 7 commits into INRIA:main from SebastienMelo:heat-map-explanation
Jul 9, 2025
@SebastienMelo
Contributor

Linked to issue #530. Gives a quick explanation of how to understand the heatmap.

Collaborator

@ArturoAmorQ ArturoAmorQ left a comment

Hi @SebastienMelo, thanks for the PR! I have the feeling that the whole narrative starting at current line 220 is a bit odd, as if the person writing it had forgotten, at each new paragraph, what was said before. This is sometimes the result of several local contributions that lose the global perspective. In this case I suggest taking the opportunity of your PR to fix the issue. Concretely, my suggestion is:

  • Change lines 220-221 to something along the lines of:

Given that we have only 2 parameters, we can visualize the results of the grid search as a heatmap. To do so, we first need to reshape the cv_results into a dataframe where:
[...]

  • Change line 237 to something similar to:

Now that we have the correct format, we can create a heatmap as follows:
[...]

  • Then for the comment you added, feel free to rephrase my proposed wording for better clarity.
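
For illustration, the reshape-then-plot flow suggested above could look like the sketch below. It is not the notebook's actual code: the parameter names (`learning_rate`, `max_leaf_nodes`), the score values, and the helper `plot_heatmap` are assumptions made up for the example, and the small hand-written dataframe stands in for `pd.DataFrame(grid_search.cv_results_)`.

```python
import pandas as pd

# Hypothetical stand-in for `pd.DataFrame(grid_search.cv_results_)`:
# the parameter names and scores below are made up for illustration.
cv_results = pd.DataFrame(
    {
        "param_learning_rate": [0.01, 0.01, 0.1, 0.1, 1.0, 1.0],
        "param_max_leaf_nodes": [3, 30, 3, 30, 3, 30],
        "mean_test_score": [0.79, 0.82, 0.85, 0.87, 0.86, 0.83],
        "std_test_score": [0.004, 0.003, 0.002, 0.003, 0.004, 0.006],
    }
)

# One row per value of the first parameter, one column per value of the
# second, each cell holding the mean test score of that combination.
pivoted = cv_results.pivot_table(
    values="mean_test_score",
    index="param_learning_rate",
    columns="param_max_leaf_nodes",
)


def plot_heatmap(table):
    """Draw the pivoted scores as an annotated heatmap (requires seaborn)."""
    import seaborn as sns  # imported lazily so the reshaping runs without it

    # With the "YlGnBu" colormap, higher scores are drawn in darker colors.
    return sns.heatmap(
        table,
        annot=True,
        cmap="YlGnBu",
        cbar_kws={"label": "mean test accuracy"},
    )
```

Calling `plot_heatmap(pivoted)` would then draw one cell per hyperparameter combination. Keeping the reshaping separate from the plotting mirrors the two narrative steps proposed in the review.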

Comment on lines +253 to +256
# This heatmap shows the values of mean test accuracy of the previous table. The
# color of each cell indicates the mean accuracy of the model for a given
# combination of hyperparameters. The darker the color, the better the accuracy.
#
Collaborator

How about something more explicit along the lines of:

The heatmap above shows the mean test accuracy (i.e., the average over cross-validation splits) for each combination of hyperparameters, where darker colors indicate better performance. However, notice that using colors only allows us to visually compare the mean test score, but does not carry any information on the standard deviation over splits, making it difficult to say if different scores coming from different combinations lead to a significantly better model or not.
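
To make the point about the standard deviation concrete, here is a minimal sketch that compares the gap between the two best mean scores with their spread over splits. The dataframe is a hypothetical stand-in for `pd.DataFrame(grid_search.cv_results_)` with invented parameter names and numbers, and the comparison is only a rough heuristic, not a statistical test.

```python
import pandas as pd

# Hypothetical stand-in for `pd.DataFrame(grid_search.cv_results_)`;
# parameter names and numbers are invented for illustration only.
cv_results = pd.DataFrame(
    {
        "param_learning_rate": [0.01, 0.01, 0.1, 0.1],
        "param_max_leaf_nodes": [3, 30, 3, 30],
        "mean_test_score": [0.790, 0.820, 0.868, 0.872],
        "std_test_score": [0.004, 0.003, 0.005, 0.006],
    }
)

# Keep the two best combinations according to the mean test score.
top_two = cv_results.sort_values("mean_test_score", ascending=False).head(2)

# Gap between the best and the second-best mean test scores.
gap = top_two["mean_test_score"].iloc[0] - top_two["mean_test_score"].iloc[1]

# Largest standard deviation over splits among those two combinations.
spread = top_two["std_test_score"].max()

# Rough heuristic (an assumption of this sketch, not a formal test):
# if the gap is no larger than the spread, the two cells may look
# different on the heatmap without being clearly distinguishable.
gap_is_within_spread = bool(gap <= spread)
```

With these invented numbers the two best cells would get different colors on the heatmap even though their gap is smaller than the variation across splits, which is the limitation the proposed wording describes.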

@SebastienMelo
Contributor Author

@ArturoAmorQ added the fixes, thank you for your comment!

Collaborator

@ArturoAmorQ ArturoAmorQ left a comment

Thanks @SebastienMelo! Merging!

@ArturoAmorQ ArturoAmorQ merged commit 5be3814 into INRIA:main Jul 9, 2025
3 checks passed
github-actions bot pushed a commit that referenced this pull request Jul 9, 2025
SebastienMelo added a commit to SebastienMelo/scikit-learn-mooc that referenced this pull request Jul 16, 2025
ArturoAmorQ pushed a commit to ArturoAmorQ/scikit-learn-mooc that referenced this pull request Oct 23, 2025