feat(skore): Add cross-validation support for permutation importance #2370
glemaitre wants to merge 11 commits into probabl-ai:main
Conversation
skore/src/skore/_sklearn/_cross_validation/inspection_accessor.py
```python
row, col, hue = (
    subplot_cols[0],
    subplot_cols[1],
    (next(iter(remaining)) if remaining else None),
```

nit: `next` accepts a default value, so the conditional is redundant.

```suggestion
    next(iter(remaining), None),
```
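For reference, the two-argument form of `next` already covers the empty case:

```python
remaining = set()
# `next` returns the default (here None) when the iterator is exhausted,
# so the explicit `if remaining else None` conditional is not needed.
assert next(iter(remaining), None) is None

remaining = {"split"}
assert next(iter(remaining), None) == "split"
```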
```diff
+        case 1:
+            col, row, hue = (
+                subplot_cols[0],
+                None,
+                (next(iter(remaining)) if remaining else None),
+            )
-            col, row = subplot_by, None
-            if remaining_column := set(columns_to_groupby) - {subplot_by}:
-                hue = next(iter(remaining_column))
-            else:
-                hue = None
-        else:
-            if not all(item in columns_to_groupby for item in subplot_by):
+        case 2:
+            row, col, hue = (
+                subplot_cols[0],
+                subplot_cols[1],
+                (next(iter(remaining)) if remaining else None),
```

Both `row` and `hue` are the same.
| f"Permutation importance {aggregate_title} \nof {estimator_name} " | ||
| f"on {data_source} set" |
There was a problem hiding this comment.
| f"Permutation importance {aggregate_title} \nof {estimator_name} " | |
| f"on {data_source} set" | |
| f"Permutation importance {aggregate_title}\n" | |
| f"of {estimator_name} on {data_source} set" |
Some first remarks from a usage perspective.
Also, the shaded background is very faint. I propose increasing the alpha from 0.1 to 0.4.
Opened #2428 to do it separately.
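Purely as a stand-in illustration of the proposed alpha change (the actual artist the display uses is not shown in this thread; `axvspan` is only an example):

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
fig, (ax_before, ax_after) = plt.subplots(ncols=2, sharey=True)
for ax, alpha in ((ax_before, 0.1), (ax_after, 0.4)):
    # Same shaded band, first with the current alpha, then with the proposed one.
    ax.plot(x, np.sin(x))
    ax.axvspan(2, 4, color="tab:orange", alpha=alpha)
    ax.set_title(f"alpha={alpha}")
plt.show()
```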
```
metric : str
    Metric to plot.
```

Let's give it a `None` default and raise an error asking the user to choose a metric when multiple metrics were passed to the constructor. It's counterintuitive to have to choose a metric when I did not specify any in the constructor.
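A minimal sketch of the proposed behaviour; `available_metrics` stands for whatever was passed to the display's constructor, and the helper name and message are assumptions, not skore's actual API:

```python
def _resolve_metric(metric, available_metrics):
    """Pick the metric to plot, raising only when the choice is ambiguous."""
    if metric is not None:
        return metric
    if len(available_metrics) > 1:
        raise ValueError(
            f"Several metrics are available ({sorted(available_metrics)}); "
            "please pass `metric` to choose the one to plot."
        )
    # A single metric was given to the constructor: no need to ask the user.
    return next(iter(available_metrics))
```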
| if frame["label"].isna().all(): | ||
| # regression problem or averaged classification metric | ||
| columns_to_drop.append("label") |
There was a problem hiding this comment.
When using an averaged metric in multiclass classification, the label column is dropped here. Then, plotting with subplot_by="label" raises : "The column(s) ['label'] are not available." which was confusing to me.
Should we create a specific error that says something like You are using a metric averaged over labels, subplot_by="label" is not available to make it less confusing ?
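A hedged sketch of what such a check could look like (the helper name and signature are hypothetical, not skore's API):

```python
import pandas as pd


def _check_subplot_by(frame: pd.DataFrame, subplot_by: list[str]) -> None:
    """Raise an explicit error when 'label' was dropped for an averaged metric."""
    if "label" in subplot_by and "label" not in frame.columns:
        raise ValueError(
            "You are using a metric averaged over labels, so the 'label' column "
            "is not available; `subplot_by='label'` cannot be used with this metric."
        )
    missing = [column for column in subplot_by if column not in frame.columns]
    if missing:
        raise ValueError(f"The column(s) {missing} are not available.")
```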
Taking over as agreed with @glemaitre.
Partially addressing #1780.

This PR adds support for `CrossValidationReport` in the `PermutationImportanceDisplay`.
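For context, a standalone sketch of the kind of per-split computation involved, written with plain scikit-learn rather than skore's API (dataset, scoring choice, and column names are illustrative only):

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1_000))

records = []
cv = KFold(n_splits=5, shuffle=True, random_state=0)
for split_idx, (train, test) in enumerate(cv.split(X)):
    fitted = model.fit(X.iloc[train], y.iloc[train])
    result = permutation_importance(
        fitted, X.iloc[test], y.iloc[test],
        scoring="roc_auc", n_repeats=5, random_state=0,
    )
    # `result.importances` has shape (n_features, n_repeats).
    for feature, importances in zip(X.columns, result.importances):
        for repeat_idx, value in enumerate(importances):
            records.append(
                {"split": split_idx, "feature": feature, "repeat": repeat_idx, "value": value}
            )

frame = pd.DataFrame(records)  # long format: one row per (split, feature, repeat)
print(frame.groupby("feature")["value"].mean().sort_values(ascending=False).head())
```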