Refactor Feature Importances, fix shap multiclass, fix T2E FIs#362
Pull request overview
This PR refactors feature-importance implementations by centralizing shared computation logic into octopus.predict.feature_importance “Layer 1” primitives, then reusing those primitives from both training and predict code paths to reduce duplication.
Changes:
- Added shared FI primitives (`compute_per_repeat_stats`, `compute_internal_fi`, `compute_permutation_single`, `compute_shap_single`) and refactored predict FI orchestrators to use them.
- Refactored `Training` permutation/internal/SHAP FI methods to delegate into the shared primitives; removed the legacy `calculate_fi_group_permutation` entry points.
- Updated bag + tests to call the unified permutation FI implementation and added a no-groups test variant.
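
To illustrate the shape of such a shared primitive, here is a minimal sketch of a permutation-importance helper with per-repeat statistics. It is not the code from this PR: the names `permutation_importance_single` and `per_repeat_stats`, their signatures, and the choice of NumPy arrays are assumptions made for illustration. A single feature is treated as a group of one column, so the grouped and no-groups cases can share one code path.

```python
import numpy as np


def permutation_importance_single(predict, X, y, columns, score, n_repeats=5, seed=0):
    """Hypothetical helper: score drop per repeat after jointly shuffling `columns`."""
    rng = np.random.default_rng(seed)
    baseline = score(y, predict(X))
    drops = []
    for _ in range(n_repeats):
        X_perm = X.copy()
        # Shuffle rows once and apply the same order to every column in the
        # group, so correlations inside the group are preserved.
        order = rng.permutation(X.shape[0])
        X_perm[:, columns] = X[order][:, columns]
        drops.append(baseline - score(y, predict(X_perm)))
    return np.asarray(drops, dtype=float)


def per_repeat_stats(drops):
    """Hypothetical helper: summarize per-repeat importances."""
    drops = np.asarray(drops, dtype=float)
    std = float(drops.std(ddof=1)) if drops.size > 1 else 0.0
    return {"mean": float(drops.mean()), "std": std, "n": int(drops.size)}


# Toy usage: the target depends only on column 0.
rng = np.random.default_rng(42)
X = rng.normal(size=(50, 2))
y = X[:, 0]
predict = lambda data: data[:, 0]
neg_mse = lambda y_true, y_pred: -float(np.mean((y_true - y_pred) ** 2))

used = per_repeat_stats(permutation_importance_single(predict, X, y, [0], neg_mse))
unused = per_repeat_stats(permutation_importance_single(predict, X, y, [1], neg_mse))
```

With one helper of this kind, training-side and predict-side callers would differ only in how they pick the model, the data split, and the feature groups.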
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| tests/modules/test_training_feature_importances.py | Updates FI method coverage to use unified permutation method and adds a “no groups” permutation variant. |
| octopus/predict/feature_importance.py | Introduces shared FI primitives and refactors predict FI orchestrators to use them. |
| octopus/modules/octo/training.py | Removes duplicated FI logic and delegates internal/permutation/SHAP FI to shared primitives. |
| octopus/modules/octo/bag.py | Switches from the removed group-permutation method to the unified permutation method. |
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
Pull request overview
Copilot reviewed 6 out of 7 changed files in this pull request and generated 3 comments.
Pull request overview
Copilot reviewed 6 out of 7 changed files in this pull request and generated 2 comments.
    # Build target_assignments from target_col
    target_assignments = {target_col: target_col}

    # KNOWN ISSUE: CatBoost multiclass is disabled because shap.Explainer(model, bg)
    # segfaults in SHAP <=0.51 when using TreeExplainer's interventional mode with
    # CatBoost multiclass models. Re-enable once SHAP fixes this upstream.
    # See datasets_local/specifications_refactorfi/03_shap_catboost_segfault_proposal.md
    # Original: ml_types=[MLType.BINARY, MLType.MULTICLASS],
    ml_types=[MLType.BINARY],
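
One way to make the "re-enable once fixed" step mechanical is a version gate. The sketch below is an assumption, not part of this PR: the helper names are hypothetical, plain strings stand in for the `MLType` members, and the cutoff assumes that a SHAP release newer than 0.51 resolves the crash, which the comment above does not confirm.

```python
def parse_major_minor(version_text):
    """Return (major, minor) as ints from a version string such as '0.51.0'."""
    parts = version_text.split(".")
    return int(parts[0]), int(parts[1])


def catboost_shap_ml_types(shap_version, last_affected=(0, 51)):
    """Hypothetical gate: enable multiclass only for SHAP newer than `last_affected`.

    Assumption: versions up to and including `last_affected` segfault on
    CatBoost multiclass models; later versions are assumed to be fixed.
    """
    ml_types = ["binary"]
    if parse_major_minor(shap_version) > last_affected:
        ml_types.append("multiclass")
    return ml_types


affected = catboost_shap_ml_types("0.51.0")
newer = catboost_shap_ml_types("0.52.1")
```

In a real test module the version string would come from the installed package (for example `shap.__version__`), and the cutoff should only be changed after confirming the upstream fix.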
Pull request overview
Copilot reviewed 6 out of 7 changed files in this pull request and generated 2 comments.
    training = _create_training_instance(
        data_train, data_dev, data_test, ml_type, model_name, feature_cols, feature_groups
    )
    training.fit()

    # KNOWN ISSUE: CatBoost multiclass is disabled because shap.Explainer(model, bg)
    # segfaults in SHAP <=0.51 when using TreeExplainer's interventional mode with
    # CatBoost multiclass models. Re-enable once SHAP fixes this upstream.
    # See datasets_local/specifications_refactorfi/03_shap_catboost_segfault_proposal.md
    # Original: ml_types=[MLType.BINARY, MLType.MULTICLASS],
    ml_types=[MLType.BINARY],