enable mypy in examples and tests dirs by dasmy · Pull Request #361 · emdgroup/octopus

dasmy · 2026-03-12T13:40:29Z

Fixes #359

Copilot

Pull request overview

Enables mypy type-checking for examples/ and tests/ (per #359) by removing those directories from mypy/pre-commit exclusions and applying typing-focused adjustments across tests and a few core typing surfaces.

Changes:

Enable mypy on examples/ and tests/ via .pre-commit-config.yaml and pyproject.toml.
Update tests/examples with additional annotations, type: ignore markers, and small refactors to satisfy mypy.
Add Models.get_registered_models() and adjust some core typing (e.g., workflow/TrainingConfig types) to support stricter checking.

Reviewed changes

Copilot reviewed 36 out of 36 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
tests/workflows/test_roc_octo_roc_workflow.py	Typing-driven refactors and added `None`-guards for `models`.
tests/workflows/test_octo_t2e.py	Adjust assertions/typing and add `models is not None` guard.
tests/workflows/test_octo_regression.py	Make `make_regression` return shape explicit and add `models` guards.
tests/workflows/test_octo_multiclass.py	Add `models is not None` guards.
tests/workflows/test_octo_classification.py	Add `models is not None` guards.
tests/workflows/test_ag_workflows.py	Make `make_regression` return shape explicit.
tests/study/test_core.py	Refactor kwargs construction to satisfy mypy call typing.
tests/predict/test_predict.py	Add (currently inaccurate) type annotation + ignore for sklearn dataset.
tests/modules/test_training_feature_importances.py	Add stronger typing, use model registry accessor, restructure configs.
tests/modules/test_mrmr.py	Tighten correlation-type typing and adjust `make_regression` usage.
tests/modules/test_module_result.py	Add `None`-guards and use `UPath.open()`.
tests/modules/roc/test_roc.py	Add mypy ignore for intentionally invalid arg type in negative test.
tests/modules/octo/test_model_fitted_validation.py	Switch to `ModelName`, use registry accessor, typing cleanups.
tests/modules/octo/test_ensemble_selection_unit.py	Use `ModelName` for `config_training`.
tests/modules/octo/test_ensemble_selection.py	Add typing for model map and `ModelName` mapping.
tests/modules/octo/test_classes_attribute.py	Use `Models.get_registered_models()` instead of private registry.
tests/models/test_model_parameter_compatibility.py	Type model names as `ModelName` and adjust sorting.
tests/models/test_model_config.py	Use `BaseModel`-derived dummy model to satisfy typing expectations.
tests/models/test_hyperparameter.py	Switch to `ModelName` (introduces a runtime issue for `"TestModel"`).
tests/metrics/test_metrics_uniqueness.py	Minor typing improvements and cleanup.
tests/manager/test_core.py	Add mypy ignore for intentional frozen-instance mutation.
tests/infrastructure/test_file_io.py	Adjust categorical dtype construction for typing (introduces nondeterminism risk).
pyproject.toml	Remove `examples`/`tests` from mypy `exclude`.
octopus/study/validation.py	Widen workflow validator type to `Sequence[Task]` and message formatting.
octopus/study/core.py	Change `workflow` annotation to `Sequence[Task]` (validator still enforces `list`).
octopus/modules/octo/training.py	Make `TrainingConfig` `TypedDict(total=False)` to relax typing constraints.
octopus/models/core.py	Add `Models.get_registered_models()` and error-message formatting.
octopus/manager/workflow_runner.py	Change `workflow` annotation to `Sequence[Task]` (validator still enforces `list`).
octopus/manager/core.py	Change `workflow` annotation to `Sequence[Task]` (validator still enforces `list`).
examples/wf_roc_octo.py	Add `Bunch` annotation and mypy ignore for sklearn dataset loader typing.
examples/wf_multiclass_wine.py	Add `Bunch` annotation and mypy ignore for sklearn dataset loader typing.
examples/use_own_hyperparameters.py	Add `Bunch` annotation and mypy ignore for sklearn dataset loader typing.
examples/multi_workflow.py	Add `Bunch` annotation and mypy ignore for sklearn dataset loader typing.
examples/basic_regression.py	Add `Bunch` annotation and mypy ignore for sklearn dataset loader typing.
examples/basic_classification.py	Add `Bunch` annotation and mypy ignore for sklearn dataset loader typing.
.pre-commit-config.yaml	Enable mypy hook for `examples/` and `tests/` by narrowing exclusions.

Comments suppressed due to low confidence (1)

octopus/modules/octo/training.py:43

Changing TrainingConfig to TypedDict(total=False) makes all keys optional, but Training still indexes required keys directly (e.g. self.config_training["outl_reduction"], ml_model_type, ml_model_params), which can now slip past type-checking and fail with KeyError. Prefer a TypedDict with required keys and NotRequired[...] only for truly optional entries (e.g. positive_class).

class TrainingConfig(TypedDict, total=False):
    """Training configuration type."""

    outl_reduction: int
    n_input_features: int
    ml_model_type: ModelName
    ml_model_params: dict
    positive_class: int | None

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

octopus/manager/core.py

tests/infrastructure/test_file_io.py

tests/workflows/test_roc_octo_roc_workflow.py

octopus/manager/workflow_runner.py

tests/models/test_hyperparameter.py

tests/predict/test_predict.py

tests/workflows/test_roc_octo_roc_workflow.py

octopus/study/core.py

…ed synchronously on different branches

examples/basic_classification.py

examples/basic_regression.py

octopus/manager/workflow_runner.py

nihaase · 2026-03-12T19:40:31Z

octopus/models/core.py

        return decorator

+    @classmethod
+    def get_registered_models(cls) -> list[ModelName]:


Do we want this function? In the code you normally only want models that fit to your ml_type. Can be a potential error if used somewhere.

Should we make it private by adding a leading underscore, @nihaase ?

tests/modules/test_mrmr.py

Copilot

Pull request overview

Copilot reviewed 39 out of 39 changed files in this pull request and generated 6 comments.

Comments suppressed due to low confidence (1)

octopus/modules/octo/training.py:43

Changing TrainingConfig to TypedDict(total=False) makes all keys optional, but Training accesses several of them with [...] (e.g. outl_reduction, ml_model_type, ml_model_params), which are required at runtime. This weakens type safety and can hide missing-key bugs. Prefer keeping required keys as required (use Required[...]/NotRequired[...] from PEP 655, or split into required/optional TypedDicts) and update tests to provide the required keys.

class TrainingConfig(TypedDict, total=False):
    """Training configuration type."""

    outl_reduction: int
    n_input_features: int
    ml_model_type: ModelName
    ml_model_params: dict
    positive_class: int | None

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

tests/models/test_hyperparameter.py

    IntHyperparameter,
 )
+from octopus.models.model_name import ModelName
 from octopus.types import MLType


tests/predict/test_predict.py

+    bc: Bunch = load_breast_cancer(as_frame=True)  # type: ignore[assignment]
    df = bc["frame"].reset_index()


octopus/example_data.py

+    breast_cancer: Bunch = load_breast_cancer(as_frame=True)  # type: ignore[assignment]
+
+    df = breast_cancer["frame"].reset_index()


octopus/example_data.py

+    diabetes: Bunch = load_diabetes(as_frame=True)  # type: ignore[assignment]
+
+    df = diabetes["frame"].reset_index()
+    features = [str(feature) for feature in diabetes["feature_names"]]


octopus/example_data.py

+    wine: Bunch = load_wine(as_frame=True)  # type: ignore[assignment]
+
+    df = wine["frame"].reset_index()
+    df.columns = df.columns.str.replace(" ", "_")


tests/modules/octo/test_ensemble_selection_unit.py

        max_features=4,
        feature_groups={},
-        config_training={"ml_model_type": "RidgeRegressor"},
+        config_training={"ml_model_type": ModelName.RidgeRegressor},


anwurl · 2026-03-14T17:47:14Z

octopus/manager/workflow_runner.py

-    workflow: list[Task] = field(validator=[validators.instance_of(list)])
+    workflow: Sequence[Task] = field(validator=[validators.instance_of(list)])


I don't understand why we have this mismatch: Sequence -- list

anwurl · 2026-03-14T17:49:57Z

octopus/example_data.py

+def load_breast_cancer_data() -> tuple[pd.DataFrame, list[str], list[str]]:
+    """Load the breast cancer dataset and return pandas dataframe, feature list, and target list."""
+    breast_cancer: Bunch = load_breast_cancer(as_frame=True)  # type: ignore[assignment]
+
+    df = breast_cancer["frame"].reset_index()
+    df.columns = df.columns.str.replace(" ", "_")
+    features = [feature.replace(" ", "_") for feature in breast_cancer["feature_names"]]
+    targets = [str(target) for target in breast_cancer["target_names"]]
+
+    return df, features, targets


Should we put data code in a separate folder?

dasmy force-pushed the fix/359 branch from 4b14808 to 9938d03 Compare March 12, 2026 13:42

dasmy requested a review from Copilot March 12, 2026 13:42

Copilot started reviewing on behalf of dasmy March 12, 2026 13:42 View session

Copilot AI reviewed Mar 12, 2026

View reviewed changes

dasmy force-pushed the fix/359 branch from 9938d03 to a91c81e Compare March 12, 2026 15:00

dasmy marked this pull request as ready for review March 12, 2026 15:00

enable mypy in examples and tests dirs

f0bb644

dasmy force-pushed the fix/359 branch from a91c81e to f0bb644 Compare March 12, 2026 15:06

fix: prevent docs build actions from cancelling each other when invok…

da8c4c4

…ed synchronously on different branches

nihaase reviewed Mar 12, 2026

View reviewed changes

move example data loading into octopus.example_data submodule

972d17d

anwurl requested a review from Copilot March 14, 2026 17:44

Copilot started reviewing on behalf of anwurl March 14, 2026 17:45 View session

Copilot AI reviewed Mar 14, 2026

View reviewed changes

anwurl approved these changes Mar 14, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

enable mypy in examples and tests dirs#361

enable mypy in examples and tests dirs#361
dasmy wants to merge 3 commits intomainfrom
fix/359

dasmy commented Mar 12, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

nihaase Mar 12, 2026

Uh oh!

dasmy Mar 12, 2026

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

anwurl Mar 14, 2026

Uh oh!

anwurl Mar 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

		bc: Bunch = load_breast_cancer(as_frame=True) # type: ignore[assignment]
		df = bc["frame"].reset_index()

		breast_cancer: Bunch = load_breast_cancer(as_frame=True) # type: ignore[assignment]

		df = breast_cancer["frame"].reset_index()

		workflow: list[Task] = field(validator=[validators.instance_of(list)])
		workflow: Sequence[Task] = field(validator=[validators.instance_of(list)])

Conversation

dasmy commented Mar 12, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

nihaase Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

dasmy Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

anwurl Mar 14, 2026

Choose a reason for hiding this comment

Uh oh!

anwurl Mar 14, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants