[testing, CI] fix coverage statistics issue caused by test_common.py tracer patching
#2237
Changes from 14 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -15,12 +15,15 @@` |
| | | `# ==============================================================================` |
| | | |
| | | `import importlib.util` |
| | | `import io` |
| | | `import os` |
| | | `import pathlib` |
| | | `import pkgutil` |
| | | `import re` |
| | | `import sys` |
| | | `import trace` |
| | | `from contextlib import redirect_stdout` |
| | | `from multiprocessing import Pipe, Process, get_context` |
| | | |
| | | `import pytest` |
| | | `from sklearn.utils import all_estimators` |
| | | `@@ -225,23 +228,120 @@ def _commonpath(inp):` |
| | | `_TRACE_BLOCK_LIST = _whitelist_to_blacklist()` |
| | | |
| | | |
| | | `def sklearnex_trace(estimator, method):` |
| | | `    """Generate a trace of all function calls in calling estimator.method.` |
| | | |
| | | `    Parameters` |
| | | `    ----------` |
| | | `    estimator : str` |
| | | `        name of estimator which is a key from PATCHED_MODELS or SPECIAL_INSTANCES` |
| | | |
| | | `    method : str` |
| | | `        name of estimator method which is to be traced and stored` |
| | | |
| | | `    Returns` |
| | | `    -------` |
| | | `    text: str` |
| | | `        Returns a string output (captured stdout of a python Trace call). It is a` |
| | | `        modified version to be more informative, completed by a monkeypatching` |
| | | `        of trace._modname.` |
| | | `    """` |
| | | `    # get estimator` |
| | | `    try:` |
| | | `        est = PATCHED_MODELS[estimator]()` |
| | | `    except KeyError:` |
| | | `        est = SPECIAL_INSTANCES[estimator]` |
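
The docstring above describes the return value as the captured stdout of a Python `trace.Trace` call. A minimal sketch of that capture pattern, using only the standard-library modules imported in the diff (`io`, `trace`, `contextlib.redirect_stdout`), might look as follows. The helper name `run_traced` and the toy function `add` are hypothetical and not part of the PR; the `trace._modname` monkeypatching mentioned in the docstring is not reproduced here.

```python
import io
import trace
from contextlib import redirect_stdout


def add(a, b):
    # Toy function standing in for an estimator method (hypothetical).
    return a + b


def run_traced(func, *args, **kwargs):
    """Run func under trace.Trace and return (result, captured trace text)."""
    # count=False skips line counting; trace=True prints executed lines to stdout.
    tracer = trace.Trace(count=False, trace=True)
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        result = tracer.runfunc(func, *args, **kwargs)
    return result, buffer.getvalue()


if __name__ == "__main__":
    result, text = run_traced(add, 1, 2)
    print(result)
    print(text)
```

Note that `Trace.runfunc` installs its own tracer with `sys.settrace` and resets it to `None` afterwards, so a tracer that was active before the call (for example, one installed by a tool that relies on `sys.settrace`) is not restored. This may be related to the coverage issue named in the PR title, though the thread shown here does not spell that out.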

Suggested change:

| Change | Diff line change |
|---|---|
| - | `    try:` |
| - | `        est = PATCHED_MODELS[estimator]()` |
| - | `    except KeyError:` |
| - | `        est = SPECIAL_INSTANCES[estimator]` |
| + | `    estimator = (SPECIAL_INSTANCES \| PATCHED_MODELS)[estimator_name]` |

I had to revert this, because `SPECIAL_INSTANCES` is a special dictionary of estimators which uses sklearn's `clone` (in order to guarantee that there is no hysteresis between uses of the instance), whereas the patched models are classes. To be honest, the difference is tech debt that I introduced at the beginning of 2024, as I was trying to unify the centralized testing. In hindsight, I would structure things like `SPECIAL_INSTANCES`.
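
To illustrate the mismatch described above, here is a minimal sketch under stated assumptions: `CloningDict` and `DummyEstimator` are hypothetical names, and `copy.deepcopy` stands in for sklearn's `clone` so the example has no third-party dependencies. It is not the project's actual implementation. It requires Python 3.9 or later for the dict `|` operator.

```python
import copy


class CloningDict(dict):
    """Hypothetical stand-in for a dict that hands out a fresh copy per lookup."""

    def __getitem__(self, key):
        # copy.deepcopy stands in for sklearn.base.clone (assumption for this sketch).
        return copy.deepcopy(super().__getitem__(key))


class DummyEstimator:
    """Hypothetical estimator with mutable fitted state."""

    def __init__(self):
        self.fitted = False

    def fit(self):
        self.fitted = True
        return self


# name -> class: callers must instantiate with ().
patched_models = {"Dummy": DummyEstimator}
# name -> instance: callers get a fresh copy on every lookup.
special_instances = CloningDict({"Special": DummyEstimator()})

first = special_instances["Special"].fit()
second = special_instances["Special"]  # unaffected by the fit() above

# Merging puts classes and instances under one mapping, so a single
# uniform access pattern (always call, or never call) no longer fits both.
merged = special_instances | patched_models
kinds = {name: isinstance(value, type) for name, value in merged.items()}
```

Here `kinds` records which entries are classes and which are instances, which is the distinction the merged lookup would have to handle.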
Then I'd go with `PATCHED_MODELS.get(estimator_name, None) or SPECIAL_INSTANCES[estimator_name]`. I don't want to waste 4 lines on something that doesn't contribute to the function logic.
Took me some time to figure this out: it turns out that sklearn's ensemble algorithms break the suggested construction.
from sklearn.ensemble import RandomForestRegressor
RandomForestRegressor() or 3
will yield:
AttributeError: 'RandomForestRegressor' object has no attribute 'estimators_'. Did you mean: 'estimator'?
which means I cannot use it in this case. The `or` operator tests the truthiness of its left operand, and for an object without `__bool__` Python falls back to `__len__`. It's not something on our side; it comes from sklearn conformance: for whatever reason sklearn defines a `__len__` for ensemble estimators that is only valid after fitting.
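
The failure mode can be reproduced without sklearn. The following is a minimal sketch, assuming a hypothetical `LazyEnsemble` class whose `__len__` depends on an attribute that only exists after `fit()`; it mimics the behavior described above rather than sklearn's actual code.

```python
class LazyEnsemble:
    """Hypothetical stand-in for an ensemble whose length is only defined after fit()."""

    def __len__(self):
        # Raises AttributeError until fit() has created estimators_.
        return len(self.estimators_)

    def fit(self):
        self.estimators_ = ["tree_0", "tree_1"]
        return self


def pick_with_or(candidate, fallback):
    # `or` evaluates bool(candidate); without __bool__, that calls __len__.
    return candidate or fallback


def pick_with_none_check(candidate, fallback):
    # An explicit None check never touches __len__.
    return candidate if candidate is not None else fallback


if __name__ == "__main__":
    try:
        pick_with_or(LazyEnsemble(), 3)
    except AttributeError as exc:
        print("or-based fallback failed:", exc)
    print(type(pick_with_none_check(LazyEnsemble(), 3)).__name__)
```

An explicit `is not None` check or a membership test sidesteps the truthiness evaluation entirely.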
Sorry for sending you down a rabbit hole. I would still prefer a different implementation, because I think try/except is a bit of overkill:
estimator = PATCHED_MODELS[estimator_name] if estimator_name in PATCHED_MODELS else SPECIAL_INSTANCES[estimator_name]
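
A sketch of how the membership-based lookup could be written as a small helper follows. The helper name `resolve_estimator`, its parameters, and the dummy classes are hypothetical. It keeps the `()` instantiation from the original try/except for entries stored as classes, which is an assumption about the intended behavior rather than something the one-liner above states.

```python
def resolve_estimator(name, patched_models, special_instances):
    """Return an estimator instance for name (hypothetical helper)."""
    if name in patched_models:
        # Patched models are stored as classes, so instantiate here.
        return patched_models[name]()
    # Unknown names still raise KeyError from this lookup.
    return special_instances[name]


class DummyModel:
    """Hypothetical estimator class."""


class DummySpecial:
    """Hypothetical pre-configured estimator."""


if __name__ == "__main__":
    models = {"DummyModel": DummyModel}
    specials = {"DummySpecial": DummySpecial()}
    print(type(resolve_estimator("DummyModel", models, specials)).__name__)
    print(type(resolve_estimator("DummySpecial", models, specials)).__name__)
```

Because the branch is chosen by key membership, no estimator object is ever evaluated for truthiness.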
No apologies necessary; it showed I hadn't handled a failure case properly, which would have led to pytest hanging and would have been a nightmare to debug. Now it should error 'gracefully' in CI (testing it now).