
Commit 502a11b

jstriebel and normanrz authored

reproducible generated client, smaller tests (#875)

* make generated client reproducible by removing examples
* specify `poetry install --all-extras` where possible
* smaller tests
* type fix
* ci timeout
* clean up multiprocessing loggers
* compatibility fix

Co-authored-by: Norman Rzepka <[email protected]>
1 parent f5f5a0b commit 502a11b

File tree

93 files changed: +14294 −151807 lines changed


.github/workflows/ci.yml

Lines changed: 1 addition & 0 deletions
@@ -170,6 +170,7 @@ jobs:
         run: ./typecheck.sh
 
       - name: Python tests
+        timeout-minutes: 30
         env:
           WK_TOKEN: ${{ secrets.WK_TOKEN }}
         run: ./test.sh --splits 3 --group ${{ matrix.group }} --splitting-algorithm least_duration

CONTRIBUTING.md

Lines changed: 2 additions & 1 deletion
@@ -104,7 +104,8 @@ The [WEBKNOSSOS-libs repository](https://github.com/scalableminds/webknossos-lib
 See below for specifics of the different packages. Let's have a look at the common tooling first:
 
 * [**poetry**](https://python-poetry.org) is used for dependency management and publishing.
-  By default, it creates a [virtual environment](https://docs.python.org/3/tutorial/venv.html) for each package.
+  Use `poetry install --all-extras` in each package folder to install all dependencies for development.
+  By default, this creates a [virtual environment](https://docs.python.org/3/tutorial/venv.html) for each package.
   To run commands inside this package, prefix them with `poetry run`, e.g. `poetry run python myscript.py`,
   or enter the virtual environment with `poetry shell`.
   The creation of a separate environment can be disabled (e.g. if you want to manage this manually),

Makefile

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ update-internal:
 	$(call in_each_pkg_by_dependency, poetry update $(packages_by_dependency))
 
 install:
-	$(call in_each_pkg_by_dependency, poetry install --extras all || poetry install)
+	$(call in_each_pkg_by_dependency, poetry install --all-extras)
 
 format:
 	$(call in_each_code_pkg, ./format.sh)

cluster_tools/cluster_tools/_utils/multiprocessing_logging_handler.py

Lines changed: 53 additions & 32 deletions
@@ -9,7 +9,7 @@
 from logging.handlers import QueueHandler
 from queue import Empty as QueueEmpty
 from queue import Queue
-from typing import Any, List
+from typing import Any, Callable, List, Tuple
 
 # Inspired by https://stackoverflow.com/a/894284
 
@@ -33,12 +33,14 @@ def __init__(self, name: str, wrapped_handler: logging.Handler) -> None:
         self.setFormatter(self.wrapped_handler.formatter)
         self.filters = self.wrapped_handler.filters
 
-        self.queue = multiprocessing.Manager().Queue(-1)
+        self._manager = multiprocessing.Manager()
+        self.queue = self._manager.Queue(-1)
         self._is_closed = False
         # Use thread to asynchronously receive messages from the queue
         self._queue_thread = threading.Thread(target=self._receive, name=name)
         self._queue_thread.daemon = True
         self._queue_thread.start()
+        self._usage_counter = 1
 
     def _receive(self) -> None:
         while True:
@@ -72,11 +74,27 @@ def _receive(self) -> None:
     def emit(self, record: logging.LogRecord) -> None:
         self.wrapped_handler.emit(record)
 
+    def increment_usage(self) -> None:
+        self._usage_counter += 1
+
+    def decrement_usage(self) -> None:
+        self._usage_counter -= 1
+        if self._usage_counter == 0:
+            # unwrap inner handler:
+            root_logger = getLogger()
+            root_logger.removeHandler(self)
+            root_logger.addHandler(self.wrapped_handler)
+
+            self._is_closed = True
+            self._queue_thread.join()
+            self._manager.shutdown()
+            super().close()
+
     def close(self) -> None:
         if not self._is_closed:
             self._is_closed = True
             self._queue_thread.join()
-
+            self._manager.shutdown()
         self.wrapped_handler.close()
         super().close()
 
@@ -101,33 +119,36 @@ def _setup_logging_multiprocessing(
         root_logger.addHandler(handler)
 
 
-def _get_multiprocessing_logging_setup_fn() -> Any:
-    root_logger = getLogger()
+class _MultiprocessingLoggingHandlerPool:
+    def __init__(self) -> None:
+        root_logger = getLogger()
+
+        self.handlers = []
+        for i, handler in enumerate(list(root_logger.handlers)):
+            # Wrap logging handlers in _MultiprocessingLoggingHandlers to make them work in a multiprocessing setup
+            # when using start_methods other than fork, for example, spawn or forkserver
+            if not isinstance(handler, _MultiprocessingLoggingHandler):
+                mp_handler = _MultiprocessingLoggingHandler(
+                    f"multi-processing-handler-{i}", handler
+                )
+                root_logger.removeHandler(handler)
+                root_logger.addHandler(mp_handler)
+                self.handlers.append(mp_handler)
+            else:
+                handler.increment_usage()
+                self.handlers.append(handler)
+
+    def get_multiprocessing_logging_setup_fn(self) -> Callable[[], None]:
+        # Return a logging setup function that when called will setup QueueHandler loggers
+        # using the queues of the _MultiprocessingLoggingHandlers. This way all log messages
+        # are forwarded to the main process.
+        return functools.partial(
+            _setup_logging_multiprocessing,
+            queues=[handler.queue for handler in self.handlers],
+            levels=[handler.level for handler in self.handlers],
+            filters=warnings.filters,
+        )
 
-    queues = []
-    levels = []
-    for i, handler in enumerate(list(root_logger.handlers)):
-        # Wrap logging handlers in _MultiprocessingLoggingHandlers to make them work in a multiprocessing setup
-        # when using start_methods other than fork, for example, spawn or forkserver
-        if not isinstance(handler, _MultiprocessingLoggingHandler):
-            mp_handler = _MultiprocessingLoggingHandler(
-                f"multi-processing-handler-{i}", handler
-            )
-
-            root_logger.removeHandler(handler)
-            root_logger.addHandler(mp_handler)
-        else:
-            mp_handler = handler
-
-        queues.append(mp_handler.queue)
-        levels.append(mp_handler.level)
-
-    # Return a logging setup function that when called will setup QueueHandler loggers
-    # reusing the queues of each wrapped _MultiprocessingLoggingHandler. This way all log messages
-    # are forwarded to the main process.
-    return functools.partial(
-        _setup_logging_multiprocessing,
-        queues,
-        levels,
-        filters=warnings.filters,
-    )
+    def close(self) -> None:
+        for handler in self.handlers:
+            handler.decrement_usage()
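
Why the usage counter: several executors can be alive at the same time, each building a _MultiprocessingLoggingHandlerPool over the same root-logger handlers, so an already-wrapped handler must only be unwrapped and closed once its last user is done. A minimal, self-contained sketch of this reference-counting pattern (hypothetical RefCountedHandler name; independent of cluster_tools):

import logging


class RefCountedHandler(logging.Handler):
    # Wraps another handler; unwraps it again only when the last user is done.
    def __init__(self, wrapped: logging.Handler) -> None:
        super().__init__()
        self.wrapped = wrapped
        self._usage_counter = 1  # the creating pool is the first user

    def emit(self, record: logging.LogRecord) -> None:
        self.wrapped.emit(record)

    def increment_usage(self) -> None:
        self._usage_counter += 1

    def decrement_usage(self) -> None:
        self._usage_counter -= 1
        if self._usage_counter == 0:
            # Last user gone: restore the original handler on the root logger.
            root = logging.getLogger()
            root.removeHandler(self)
            root.addHandler(self.wrapped)


root = logging.getLogger()
original = logging.StreamHandler()
root.addHandler(original)

wrapper = RefCountedHandler(original)  # first pool wraps the handler ...
root.removeHandler(original)
root.addHandler(wrapper)
wrapper.increment_usage()  # ... a second pool finds it wrapped, only bumps the counter

wrapper.decrement_usage()  # first pool closes: wrapper stays installed
wrapper.decrement_usage()  # second pool closes: original handler is restored
assert original in root.handlers and wrapper not in root.handlers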

cluster_tools/cluster_tools/executors/multiprocessing.py

Lines changed: 26 additions & 10 deletions
@@ -26,7 +26,7 @@
 
 from cluster_tools._utils import pickling
 from cluster_tools._utils.multiprocessing_logging_handler import (
-    _get_multiprocessing_logging_setup_fn,
+    _MultiprocessingLoggingHandlerPool,
 )
 from cluster_tools._utils.warning import enrich_future_with_uncaught_warning
 
@@ -80,6 +80,10 @@ def __init__(
             initializer=initializer,
             initargs=initargs,
         )
+        if self._mp_context.get_start_method() == "fork":
+            self._mp_logging_handler_pool = None
+        else:
+            self._mp_logging_handler_pool = _MultiprocessingLoggingHandlerPool()
 
     def submit(  # type: ignore[override]
         self,
@@ -111,25 +115,27 @@ def submit(  # type: ignore[override]
         # The call_stack holds all of these wrapper functions and their arguments in the correct order.
         # For example, call_stack = [wrapper_fn_1, wrapper_fn_1_arg_1, wrapper_fn_2, actual_fn, actual_fn_arg_1]
         # where wrapper_fn_1 is called, which eventually calls wrapper_fn_2, which eventually calls actual_fn.
-        call_stack = []
+        call_stack: List[Callable] = []
 
-        if self._mp_context.get_start_method() != "fork":
+        if self._mp_logging_handler_pool is not None:
             # If a start_method other than the default "fork" is used, logging needs to be re-setup,
             # because the programming context is not inherited in those cases.
-            multiprocessing_logging_setup_fn = _get_multiprocessing_logging_setup_fn()
-            call_stack.extend(
-                [
+            multiprocessing_logging_setup_fn = (
+                self._mp_logging_handler_pool.get_multiprocessing_logging_setup_fn()
+            )
+            call_stack.append(
+                partial(
                     MultiprocessingExecutor._setup_logging_and_execute,
                     multiprocessing_logging_setup_fn,
-                ]
+                )
             )
 
         if output_pickle_path is not None:
-            call_stack.extend(
-                [
+            call_stack.append(
+                partial(
                     MultiprocessingExecutor._execute_and_persist_function,
                     output_pickle_path,
-                ]
+                )
             )
 
         fut = submit_fn(*call_stack, __fn, *args, **kwargs)
@@ -250,3 +256,13 @@ def forward_log(self, fut: "Future[_T]") -> _T:
         # Since the default behavior of process pool executors is to show the log in the main process
         # we don't need to do anything except for blocking until the future is done.
         return fut.result()
+
+    def shutdown(self, wait: bool = True, *, cancel_futures: bool = False) -> None:
+        if cancel_futures:
+            # cancel_futures was added in Python 3.9, ignoring it as 3.8 is supported:
+            logging.warning(
+                "The provided cancel_futures argument is ignored by MultiprocessingExecutor."
+            )
+        super().shutdown(wait=wait)
+        if self._mp_logging_handler_pool is not None:
+            self._mp_logging_handler_pool.close()
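
The move from flat [wrapper_fn, wrapper_arg, ...] lists to functools.partial keeps each wrapper bound to its own arguments while submit_fn still receives one flat chain. A rough, self-contained sketch of how such a chain unwinds (hypothetical wrapper names standing in for _setup_logging_and_execute and _execute_and_persist_function):

from functools import partial
from typing import Any, Callable


def with_logging_setup(setup_fn: Callable[[], None], *call_stack: Any) -> Any:
    setup_fn()  # stand-in for re-initializing logging in the child process
    fn, *rest = call_stack
    return fn(*rest)


def persist_result(path: str, *call_stack: Any) -> Any:
    fn, *rest = call_stack
    result = fn(*rest)
    print(f"would pickle result to {path}")  # stand-in for pickling the result
    return result


def actual_fn(x: int) -> int:
    return x * 2


call_stack = [
    partial(with_logging_setup, lambda: None),
    partial(persist_result, "/tmp/out.pickle"),
]
# As in submit(): each wrapper pops the next callable off the remaining stack,
# so the chain runs with_logging_setup -> persist_result -> actual_fn.
fn, *rest = [*call_stack, actual_fn, 21]
assert fn(*rest) == 42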

webknossos/__generate_client.py

Lines changed: 14 additions & 0 deletions
@@ -428,6 +428,19 @@ def make_properties_required(  # pylint: disable=dangerous-default-value
     return handled_required_keys
 
 
+def remove_examples(
+    x: Any,
+) -> None:
+    if isinstance(x, dict):
+        if "example" in x:
+            del x["example"]
+        for i in x.values():
+            remove_examples(i)
+    elif isinstance(x, list):
+        for i in x:
+            remove_examples(i)
+
+
 def set_response_schema_by_example(
     openapi_schema: Dict,
     example_response: bytes,
@@ -453,6 +466,7 @@
         if path_method["operationId"] == operation_id
     ][0]
     handled_required_keys = make_properties_required(recorded_response_schema)
+    remove_examples(recorded_response_schema)
    request_schema["responses"]["200"]["content"] = recorded_response_schema
    return handled_required_keys
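
remove_examples strips every "example" key from the recorded response schema while leaving the structural keys intact, which is what makes regeneration deterministic. A quick illustration (restating the function from the diff so the snippet runs standalone):

from typing import Any


def remove_examples(x: Any) -> None:  # same logic as added above
    if isinstance(x, dict):
        if "example" in x:
            del x["example"]
        for i in x.values():
            remove_examples(i)
    elif isinstance(x, list):
        for i in x:
            remove_examples(i)


schema = {
    "application/json": {
        "schema": {
            "type": "object",
            "properties": {"name": {"type": "string", "example": "l4_sample"}},
            "example": {"name": "l4_sample"},
        }
    }
}
remove_examples(schema)
# All "example" keys are gone, the structure survives:
assert schema == {
    "application/json": {
        "schema": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
        }
    }
}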

webknossos/examples/download_segments.py

Lines changed: 2 additions & 2 deletions
@@ -5,7 +5,7 @@
 
 ANNOTATION_ID = "634e8fe1010000b4006f3cf4"
 SEGMENT_IDS = [32, 667325]
-MAG = wk.Mag("4-4-1")
+MAG = wk.Mag("8-8-2")
 
 
 def main() -> None:
@@ -14,7 +14,7 @@ def main() -> None:
     )
     mag_view = dataset.get_segmentation_layers()[0].get_mag(MAG)
 
-    z = mag_view.layer.bounding_box.topleft.z
+    z = mag_view.bounding_box.topleft.z
     with mag_view.get_buffered_slice_reader() as reader:
         for slice_data in reader:
             slice_data = slice_data[0]  # First channel only

webknossos/examples/download_tiff_stack.py

Lines changed: 2 additions & 2 deletions
@@ -4,7 +4,7 @@
 
 DATASET_NAME = "l4_sample"
 LAYER_NAME = "color"
-MAG = wk.Mag("4-4-1")
+MAG = wk.Mag("16-16-4")
 
 
 def main() -> None:
@@ -15,7 +15,7 @@ def main() -> None:
     )
     mag_view = dataset.get_layer(LAYER_NAME).get_mag(MAG)
 
-    z = mag_view.layer.bounding_box.topleft.z
+    z = mag_view.bounding_box.topleft.z
     with mag_view.get_buffered_slice_reader() as reader:
         for slice_data in reader:
             slice_data = slice_data[0]  # First channel only

webknossos/examples/zarr_and_dask.py

Lines changed: 5 additions & 3 deletions
@@ -2,6 +2,8 @@
 
 import webknossos as wk
 
+MAG = wk.Mag("8-8-2")
+
 
 def main() -> None:
     # Remote datasets are read-only, but can be used similar to normal datasets:
@@ -10,11 +12,11 @@ def main() -> None:
     )
 
     layer = l4_sample_dataset.get_layer("color")
-    mag = layer.get_finest_mag()
+    mag_view = layer.get_mag(MAG)
 
-    zarr_array = mag.get_zarr_array()
+    zarr_array = mag_view.get_zarr_array()
     dask_array = da.from_array(zarr_array, chunks=(1, 256, 256, 256))[
-        (0,) + layer.bounding_box.to_slices()
+        (0,) + mag_view.bounding_box.in_mag(MAG).to_slices()
     ]
 
     mean_value = dask_array.mean().compute()
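
The bounding box of a view is kept in mag-1 (finest) coordinates, while the zarr array is indexed in the voxel grid of the chosen mag, hence the in_mag(MAG) conversion before to_slices(). A hedged sketch of that conversion, assuming in_mag divides the box by the per-axis mag factors:

MAG_FACTORS = (8, 8, 2)  # corresponds to wk.Mag("8-8-2")


def in_mag(topleft, size, mag=MAG_FACTORS):
    # Convert a mag-1 bounding box into the coarser mag's voxel grid.
    return (
        tuple(t // m for t, m in zip(topleft, mag)),
        tuple(s // m for s, m in zip(size, mag)),
    )


# A 1024x1024x256 box at the origin spans only 128x128x128 voxels in mag 8-8-2:
assert in_mag((0, 0, 0), (1024, 1024, 256)) == ((0, 0, 0), (128, 128, 128))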

webknossos/local_wk_setup.sh

Lines changed: 2 additions & 2 deletions
@@ -1,7 +1,7 @@
 function export_vars {
     export WK_TOKEN=1b88db86331a38c21a0b235794b9e459856490d70408bcffb767f64ade0f83d2bdb4c4e181b9a9a30cdece7cb7c65208cc43b6c1bb5987f5ece00d348b1a905502a266f8fc64f0371cd6559393d72e031d0c2d0cabad58cccf957bb258bc86f05b5dc3d4fff3d5e3d9c0389a6027d861a21e78e3222fb6c5b7944520ef21761e
     export WK_URL=http://localhost:9000
-    export DOCKER_TAG=master__10303
+    export DOCKER_TAG=master__21723
 }
 
 function ensure_local_test_wk {
@@ -26,7 +26,7 @@ function ensure_local_test_wk {
         while ! curl -sf localhost:9000/api/health; do
             sleep 5
         done
-        OUT=$(docker-compose exec -T webknossos tools/postgres/prepareTestDB.sh 2>&1) || echo "$OUT"
+        OUT=$(docker-compose exec -T webknossos tools/postgres/dbtool.js prepare-test-db 2>&1) || echo "$OUT"
         popd > /dev/null
     else
         echo "Using the already running local webknossos setup at localhost:9000"
