Commit b06ecee

Make Test Work Again After Ruff and Linter Changes (#1310)
* mark production tests
* make production test run
* fix test bug -1/N
* add retry raise again after refactor
* fix str dict representation
* test: Fix non-writable home mocks
* testing: not not a change
* testing: trigger CI
* typing: Update typing
* ci: Update testing matrix
* testing: Fixup run flow error check
* ci: Manual dispatch, disable double testing
* ci: Prevent further ci duplication
* ci: Add concurrency checks to all
* ci: Remove the max-parallel on test ci
  There are a lot less now and they cancel previous pushes in the same pr now so it shouldn't be a problem anymore
* testing: Fix windows path generation
* add pytest for server state
* add assert cache state
* some formatting
* fix with cache fixture
* finally remove the finally
* doc: Fix link
* update test matrix
* doc: Update to just point to contributing
* add linkcheck ignore for test server

---------

Co-authored-by: eddiebergman <[email protected]>
1 parent 326bf0b commit b06ecee

34 files changed: +331 -116 lines changed

.github/workflows/dist.yaml

Lines changed: 18 additions & 1 deletion
@@ -1,6 +1,23 @@
 name: dist-check
 
-on: [push, pull_request]
+on:
+  workflow_dispatch:
+
+  push:
+    branches:
+      - main
+      - develop
+    tags:
+      - "v*.*.*"
+
+  pull_request:
+    branches:
+      - main
+      - develop
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
+  cancel-in-progress: true
 
 jobs:
   dist:
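
The concurrency group added above decides which runs cancel each other. As a rough model of how that expression resolves (the function name and arguments below are illustrative assumptions, not part of the workflow), `||` in a GitHub Actions expression yields its first truthy operand, so the pull request number is used when present and the ref otherwise:

```python
from typing import Optional


def concurrency_group(workflow: str, pr_number: Optional[int], ref: str) -> str:
    """Model of `${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}`.

    Hypothetical helper for illustration only; GitHub evaluates the real
    expression on its side.
    """
    # `pr_number or ref` mirrors `||`: fall back to the ref when there is
    # no pull request attached to the event (for example on a plain push).
    return f"{workflow}-{pr_number or ref}"


# Two pushes to the same PR share a group, so the newer run cancels the older one.
print(concurrency_group("dist-check", 1310, "refs/pull/1310/merge"))
# A push to a branch is grouped by its ref instead.
print(concurrency_group("dist-check", None, "refs/heads/develop"))
```

With `cancel-in-progress: true`, a new run in a group cancels any run already in progress in that same group.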

.github/workflows/docs.yaml

Lines changed: 18 additions & 1 deletion
@@ -1,5 +1,22 @@
 name: Docs
-on: [pull_request, push]
+on:
+  workflow_dispatch:
+
+  push:
+    branches:
+      - main
+      - develop
+    tags:
+      - "v*.*.*"
+
+  pull_request:
+    branches:
+      - main
+      - develop
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
+  cancel-in-progress: true
 
 jobs:
   build-and-deploy:

.github/workflows/pre-commit.yaml

Lines changed: 18 additions & 1 deletion
@@ -1,6 +1,23 @@
 name: pre-commit
 
-on: [push]
+on:
+  workflow_dispatch:
+
+  push:
+    branches:
+      - main
+      - develop
+    tags:
+      - "v*.*.*"
+
+  pull_request:
+    branches:
+      - main
+      - develop
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
+  cancel-in-progress: true
 
 jobs:
   run-all-files:

.github/workflows/release_docker.yaml

Lines changed: 5 additions & 0 deletions
@@ -1,6 +1,7 @@
 name: release-docker
 
 on:
+  workflow_dispatch:
   push:
     branches:
       - 'develop'
@@ -11,6 +12,10 @@ on:
     branches:
       - 'develop'
 
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
+  cancel-in-progress: true
+
 jobs:
 
   docker:

.github/workflows/test.yml

Lines changed: 35 additions & 13 deletions
@@ -1,6 +1,19 @@
 name: Tests
 
-on: [push, pull_request]
+on:
+  workflow_dispatch:
+
+  push:
+    branches:
+      - main
+      - develop
+    tags:
+      - "v*.*.*"
+
+  pull_request:
+    branches:
+      - main
+      - develop
 
 concurrency:
   group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
@@ -12,25 +25,34 @@ jobs:
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
-        python-version: ["3.8", "3.9"]
-        scikit-learn: ["0.21.2", "0.22.2", "0.23.1", "0.24"]
+        python-version: ["3.8"]
+        # TODO(eddiebergman): We should consider testing against newer version I guess...
+        # We probably consider just having a `"1"` version to always test against latest
+        scikit-learn: ["0.23.1", "0.24"]
         os: [ubuntu-latest]
-        sklearn-only: ['true']
-        exclude: # no scikit-learn 0.21.2 release for Python 3.8
-          - python-version: 3.8
-            scikit-learn: 0.21.2
+        sklearn-only: ["true"]
+        exclude: # no scikit-learn 0.23 release for Python 3.9
+          - python-version: "3.9"
+            scikit-learn: "0.23.1"
         include:
-          - python-version: 3.8
+          - os: ubuntu-latest
+            python-version: "3.9"
+            scikit-learn: "0.24"
+            scipy: "1.10.0"
+            sklearn-only: "true"
+          # Include a code cov version
+          - code-cov: true
+            os: ubuntu-latest
+            python-version: "3.8"
             scikit-learn: 0.23.1
-            code-cov: true
             sklearn-only: 'false'
-            os: ubuntu-latest
+          # Include a windows test, for some reason on a later version of scikit-learn
           - os: windows-latest
-            sklearn-only: 'false'
+            python-version: "3.8"
             scikit-learn: 0.24.*
-            scipy: 1.10.0
+            scipy: "1.10.0" # not sure why the explicit scipy version?
+            sklearn-only: 'false'
       fail-fast: false
-      max-parallel: 4
 
     steps:
     - uses: actions/checkout@v4

doc/conf.py

Lines changed: 1 addition & 1 deletion
@@ -119,7 +119,7 @@
 #
 # currently disabled because without intersphinx we cannot link to numpy.ndarray
 # nitpicky = True
-
+linkcheck_ignore = [r"https://test.openml.org/t/.*"] # FIXME: to avoid test server bugs avoiding docs building
 # -- Options for HTML output ----------------------------------------------
 
 # The theme to use for HTML and HTML Help pages. See the documentation for

doc/contributing.rst

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ In particular, a few ways to contribute to openml-python are:
    For more information, see the :ref:`extensions` below.
 
  * Bug reports. If something doesn't work for you or is cumbersome, please open a new issue to let
-   us know about the problem. See `this section <https://github.com/openml/openml-python/blob/main/CONTRIBUTING.md#user-content-reporting-bugs>`_.
+   us know about the problem. See `this section <https://github.com/openml/openml-python/blob/main/CONTRIBUTING.md>`_.
 
  * `Cite OpenML <https://www.openml.org/cite>`_ if you use it in a scientific publication.
 

openml/_api_calls.py

Lines changed: 14 additions & 11 deletions
@@ -341,6 +341,9 @@ def _send_request( # noqa: C901
     response: requests.Response | None = None
     delay_method = _human_delay if config.retry_policy == "human" else _robot_delay
 
+    # Error to raise in case of retrying too often. Will be set to the last observed exception.
+    retry_raise_e: Exception | None = None
+
     with requests.Session() as session:
         # Start at one to have a non-zero multiplier for the sleep
         for retry_counter in range(1, n_retries + 1):
@@ -384,10 +387,7 @@ def _send_request( # noqa: C901
                 # which means trying again might resolve the issue.
                 if e.code != DATABASE_CONNECTION_ERRCODE:
                     raise e
-
-                delay = delay_method(retry_counter)
-                time.sleep(delay)
-
+                retry_raise_e = e
             except xml.parsers.expat.ExpatError as e:
                 if request_method != "get" or retry_counter >= n_retries:
                     if response is not None:
@@ -399,18 +399,21 @@ def _send_request( # noqa: C901
                         f"Unexpected server error when calling {url}. Please contact the "
                         f"developers!\n{extra}"
                     ) from e
-
-                delay = delay_method(retry_counter)
-                time.sleep(delay)
-
+                retry_raise_e = e
             except (
                 requests.exceptions.ChunkedEncodingError,
                 requests.exceptions.ConnectionError,
                 requests.exceptions.SSLError,
                 OpenMLHashException,
-            ):
-                delay = delay_method(retry_counter)
-                time.sleep(delay)
+            ) as e:
+                retry_raise_e = e
+
+            # We can only be here if there was an exception
+            assert retry_raise_e is not None
+            if retry_counter >= n_retries:
+                raise retry_raise_e
+            delay = delay_method(retry_counter)
+            time.sleep(delay)
 
     assert response is not None
     return response
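
The refactor above moves the sleep out of each `except` branch and re-raises the last observed exception once the retry budget is used up. A minimal standalone sketch of that pattern follows; `TransientError`, `call_with_retries`, and the injectable `sleep` parameter are hypothetical names introduced for illustration and do not exist in openml-python:

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure (e.g. a connection error)."""


def call_with_retries(
    action: Callable[[], T],
    n_retries: int,
    delay_method: Callable[[int], float],
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    # Holds the last retryable exception observed so far.
    retry_raise_e: Optional[Exception] = None

    # Start at one to have a non-zero multiplier for the delay.
    for retry_counter in range(1, n_retries + 1):
        try:
            return action()
        except TransientError as e:
            retry_raise_e = e

        # Only reachable if the attempt above raised a retryable error.
        assert retry_raise_e is not None
        if retry_counter >= n_retries:
            # Budget exhausted: surface the last error instead of swallowing it.
            raise retry_raise_e
        sleep(delay_method(retry_counter))

    raise ValueError("n_retries must be at least 1")
```

Keeping the back-off in one place after the `try` block means every retryable branch only has to record its exception, and the final attempt can no longer fall through silently.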

openml/config.py

Lines changed: 17 additions & 14 deletions
@@ -243,14 +243,11 @@ def _setup(config: _Config | None = None) -> None:
     config_dir = config_file.parent
 
     # read config file, create directory for config file
-    if not config_dir.exists():
-        try:
+    try:
+        if not config_dir.exists():
             config_dir.mkdir(exist_ok=True, parents=True)
-            cache_exists = True
-        except PermissionError:
-            cache_exists = False
-    else:
-        cache_exists = True
+    except PermissionError:
+        pass
 
     if config is None:
         config = _parse_config(config_file)
@@ -264,15 +261,21 @@ def _setup(config: _Config | None = None) -> None:
     set_retry_policy(config["retry_policy"], n_retries)
 
     _root_cache_directory = short_cache_dir.expanduser().resolve()
+
+    try:
+        cache_exists = _root_cache_directory.exists()
+    except PermissionError:
+        cache_exists = False
+
     # create the cache subdirectory
-    if not _root_cache_directory.exists():
-        try:
+    try:
+        if not _root_cache_directory.exists():
             _root_cache_directory.mkdir(exist_ok=True, parents=True)
-        except PermissionError:
-            openml_logger.warning(
-                "No permission to create openml cache directory at %s! This can result in "
-                "OpenML-Python not working properly." % _root_cache_directory,
-            )
+    except PermissionError:
+        openml_logger.warning(
+            "No permission to create openml cache directory at %s! This can result in "
+            "OpenML-Python not working properly." % _root_cache_directory,
+        )
 
     if cache_exists:
         _create_log_handlers()
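
The change above widens each `try` so that the `exists()` probe is guarded as well as the `mkdir()` call, presumably because on some Python versions `Path.exists()` can itself raise `PermissionError` for a path under a non-accessible parent. A small sketch of the same shape, with `prepare_cache_dir` and the logger name as hypothetical stand-ins rather than openml-python APIs:

```python
import logging
import tempfile
from pathlib import Path

logger = logging.getLogger("cache_setup_sketch")  # hypothetical logger name


def prepare_cache_dir(cache_dir: Path) -> bool:
    """Create `cache_dir` if possible; return whether it existed beforehand."""
    # Guard the probe itself: it may raise instead of returning False.
    try:
        existed = cache_dir.exists()
    except PermissionError:
        existed = False

    # Guard probe and creation together, as in the diff above.
    try:
        if not cache_dir.exists():
            cache_dir.mkdir(exist_ok=True, parents=True)
    except PermissionError:
        logger.warning("No permission to create cache directory at %s", cache_dir)

    return existed


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "openml" / "cache"
    print(prepare_cache_dir(target))  # first call: directory did not exist yet
    print(prepare_cache_dir(target))  # second call: directory is now present
```

As in the diff, the returned flag reflects the state before the creation attempt, not after it.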

openml/datasets/dataset.py

Lines changed: 4 additions & 4 deletions
@@ -589,7 +589,6 @@ def _load_data(self) -> tuple[pd.DataFrame | scipy.sparse.csr_matrix, list[bool]
         fpath = self.data_feather_file if self.cache_format == "feather" else self.data_pickle_file
         logger.info(f"{self.cache_format} load data {self.name}")
         try:
-            assert self.data_pickle_file is not None
             if self.cache_format == "feather":
                 assert self.data_feather_file is not None
                 assert self.feather_attribute_file is not None
@@ -599,6 +598,7 @@ def _load_data(self) -> tuple[pd.DataFrame | scipy.sparse.csr_matrix, list[bool]
                 with open(self.feather_attribute_file, "rb") as fh: # noqa: PTH123
                     categorical, attribute_names = pickle.load(fh) # noqa: S301
             else:
+                assert self.data_pickle_file is not None
                 with open(self.data_pickle_file, "rb") as fh: # noqa: PTH123
                     data, categorical, attribute_names = pickle.load(fh) # noqa: S301
         except FileNotFoundError as e:
@@ -681,14 +681,13 @@ def _convert_array_format(
         if array_format == "array" and not isinstance(data, scipy.sparse.spmatrix):
             # We encode the categories such that they are integer to be able
             # to make a conversion to numeric for backward compatibility
-            def _encode_if_category(column: pd.Series) -> pd.Series:
+            def _encode_if_category(column: pd.Series | np.ndarray) -> pd.Series | np.ndarray:
                 if column.dtype.name == "category":
                     column = column.cat.codes.astype(np.float32)
                     mask_nan = column == -1
                     column[mask_nan] = np.nan
                 return column
 
-            assert isinstance(data, (pd.DataFrame, pd.Series))
             if isinstance(data, pd.DataFrame):
                 columns = {
                     column_name: _encode_if_category(data.loc[:, column_name])
@@ -1090,7 +1089,8 @@ def _get_qualities_pickle_file(qualities_file: str) -> str:
     return qualities_file + ".pkl"
 
 
-def _read_qualities(qualities_file: Path) -> dict[str, float]:
+def _read_qualities(qualities_file: str | Path) -> dict[str, float]:
+    qualities_file = Path(qualities_file)
     qualities_pickle_file = Path(_get_qualities_pickle_file(str(qualities_file)))
     try:
         with qualities_pickle_file.open("rb") as fh_binary:
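
The last hunk widens `_read_qualities` to accept either a `str` or a `Path` and normalizes the argument on entry. A minimal sketch of that idiom is below; `qualities_pickle_path` is a hypothetical helper written for illustration, combining the normalization step with the `".pkl"` suffix logic visible in the diff:

```python
from pathlib import Path
from typing import Union


def qualities_pickle_path(qualities_file: Union[str, Path]) -> Path:
    # Normalize first so the rest of the function can rely on Path methods,
    # regardless of what the caller passed in.
    qualities_file = Path(qualities_file)
    # Append (rather than replace) the suffix, matching `qualities_file + ".pkl"`.
    return Path(str(qualities_file) + ".pkl")


# Both call styles resolve to the same location.
print(qualities_pickle_path("cache/qualities.xml"))
print(qualities_pickle_path(Path("cache") / "qualities.xml"))
```

Normalizing at the boundary keeps older callers that still pass strings working while the function body uses `Path` throughout.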
