Conversation
ArturoAmorQ
left a comment
Hi @SebastienMelo, thanks for the PR! I have the feeling that the whole narrative starting at current line 220 is a bit odd, as if the person writing it had forgotten what was said at each new paragraph. This is sometimes the result of several local contributions that lose the global perspective. In this case I suggest taking the opportunity of your PR to fix the issue. Concretely, my suggestion is:
- Change lines 220-221 to something along the lines of:
Given that we have only 2 parameters, we can visualize the results of the grid search as a heatmap. To do so, we first need to reshape the
cv_results into a dataframe where:
[...]
- Change line 237 to something along the lines of:
Now that we have the correct format, we can create a heatmap as follows:
[...]
- Then for the comment you added, feel free to rephrase my proposed wording for better clarity.
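The reshape-then-plot workflow suggested above can be sketched as follows. The actual estimator and parameter grid from the notebook are not shown in this thread, so an SVC on the iris dataset with hypothetical `C` and `gamma` grids stands in for them; only the `cv_results_` pivot and the heatmap call mirror the suggestion.

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, for illustration only
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
# Hypothetical 2-parameter grid; the notebook's own grid may differ.
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}
search = GridSearchCV(SVC(), param_grid=param_grid, cv=3).fit(X, y)

# Reshape cv_results_ into a dataframe with one parameter per axis.
cv_results = pd.DataFrame(search.cv_results_)
pivot = cv_results.pivot(
    index="param_gamma", columns="param_C", values="mean_test_score"
)

# Now that we have the correct format, we can create a heatmap.
fig, ax = plt.subplots()
im = ax.imshow(pivot.values)
ax.set_xticks(range(len(pivot.columns)), labels=[str(c) for c in pivot.columns])
ax.set_yticks(range(len(pivot.index)), labels=[str(g) for g in pivot.index])
ax.set_xlabel("C")
ax.set_ylabel("gamma")
fig.colorbar(im, label="mean test accuracy")
```

Each cell of the resulting image encodes the mean cross-validated accuracy for one (C, gamma) combination, which is exactly the structure the proposed narrative describes.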
# This heatmap shows the values of mean test accuracy of the previous table. The
# color of each cell indicates the mean accuracy of the model for a given
# combination of hyperparameters. The darker the color, the better the accuracy.
#
How about something more explicit, along the lines of:
The heatmap above shows the mean test accuracy (i.e., the average over cross-validation splits) for each combination of hyperparameters, where darker colors indicate better performance. Notice, however, that the colors only let us visually compare the mean test score; they carry no information on the standard deviation over splits, making it difficult to tell whether the score differences between combinations correspond to a significantly better model or not.
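The point about the standard deviation can be made concrete: `cv_results_` already contains `std_test_score` alongside `mean_test_score`, so the two can be inspected side by side even though the heatmap only shows the mean. A minimal sketch, again using a stand-in SVC/iris setup rather than the notebook's actual pipeline:

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
# Hypothetical grid; the notebook's own parameters may differ.
search = GridSearchCV(
    SVC(), param_grid={"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}, cv=5
).fit(X, y)

results = pd.DataFrame(search.cv_results_)
# If the gap between two mean scores is smaller than their standard
# deviations over splits, the two combinations are not clearly distinguishable.
summary = results[
    ["param_C", "param_gamma", "mean_test_score", "std_test_score"]
].sort_values("mean_test_score", ascending=False)
print(summary.head())
```

This is the kind of table a reader could consult when the heatmap colors alone leave the ranking ambiguous.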
@ArturoAmorQ added the fixes, thank you for your comment!
ArturoAmorQ
left a comment
Thanks @SebastienMelo! Merging!
Linked to issue #530, this PR gives a quick explanation of how to interpret the heatmap.