Handle additional race conditions in apply registration task #970

@jluethi

As reported by Kelvin, the apply registration task can run into the following race condition:

JOB ERROR in Fractal job 2078:
TRACEBACK:
JobExecutionError

An error occurred.
Original error:
Traceback (most recent call last):
  File "/data/homes/fractal/20230627_joel_fractal/fractal-demos/examples/server/fractal-server-env-311/lib/python3.11/site-packages/fractal_server/app/runner/executors/call_command_wrapper.py", line 49, in call_command_wrapper
    raise TaskExecutionError(
fractal_server.app.runner.exceptions.TaskExecutionError: Task failed with returncode=1.
STDERR: 2025-07-28 15:06:59,052; INFO; START apply_registration_to_image task
2025-07-28 15:06:59,052; INFO; /data/active/kgroot/inmed_retrieve/20200314_GabriDrugs8hc1.zarr/E/12/1
2025-07-28 15:06:59,052; INFO; Running `apply_registration_to_image` on zarr_url='/data/active/kgroot/inmed_retrieve/20200314_GabriDrugs8hc1.zarr/E/12/1', registered_roi_table='registered_FOV_ROI_table' and reference_acquisition=0. Using overwrite_input=True
2025-07-28 15:07:36,951; INFO; Write the registered Zarr image to disk
2025-07-28 15:21:38,816; INFO; [build_pyramid] High-resolution path: /data/active/kgroot/inmed_retrieve/20200314_GabriDrugs8hc1.zarr/E/12/1_registered/0
2025-07-28 15:21:42,313; INFO; [build_pyramid] High-resolution data: dask.array<from-zarr, shape=(4, 1, 8192, 8192), dtype=uint16, chunksize=(1, 1, 2048, 2048), chunktype=numpy.ndarray>
2025-07-28 15:21:42,316; INFO; [build_pyramid] Level 1 data: dask.array<rechunk-merge, shape=(4, 1, 4096, 4096), dtype=uint16, chunksize=(1, 1, 2048, 2048), chunktype=numpy.ndarray>
2025-07-28 15:25:33,778; INFO; [build_pyramid] Level 2 data: dask.array<rechunk-merge, shape=(4, 1, 2048, 2048), dtype=uint16, chunksize=(1, 1, 2048, 2048), chunktype=numpy.ndarray>
2025-07-28 15:26:25,836; INFO; Processing the tables: {'FOV_ROI_table': '/data/active/kgroot/inmed_retrieve/20200314_GabriDrugs8hc1.zarr/E/12/0/tables/FOV_ROI_table', 'well_ROI_table': '/data/active/kgroot/inmed_retrieve/20200314_GabriDrugs8hc1.zarr/E/12/0/tables/well_ROI_table', 'registered_FOV_ROI_table': '/data/active/kgroot/inmed_retrieve/20200314_GabriDrugs8hc1.zarr/E/12/0/tables/registered_FOV_ROI_table'}
2025-07-28 15:26:26,640; INFO; Copying table: FOV_ROI_table
2025-07-28 15:26:59,073; INFO; Copying table: well_ROI_table
Traceback (most recent call last):
  File "/data/homes/fractal/20230627_joel_fractal/fractal-demos/examples/server/FRACTAL_TASKS_DIR/6/fractal-tasks-core/1.5.3/venv/lib/python3.11/site-packages/fractal_tasks_core/tasks/apply_registration_to_image.py", line 400, in <module>
    run_fractal_task(
  File "/data/homes/fractal/20230627_joel_fractal/fractal-demos/examples/server/FRACTAL_TASKS_DIR/6/fractal-tasks-core/1.5.3/venv/lib/python3.11/site-packages/fractal_task_tools/task_wrapper.py", line 66, in run_fractal_task
    metadata_update = task_function(**pars)
                      ^^^^^^^^^^^^^^^^^^^^^
  File "/data/homes/fractal/20230627_joel_fractal/fractal-demos/examples/server/FRACTAL_TASKS_DIR/6/fractal-tasks-core/1.5.3/venv/lib/python3.11/site-packages/pydantic/validate_call_decorator.py", line 60, in wrapper_function
    return validate_call_wrapper(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/homes/fractal/20230627_joel_fractal/fractal-demos/examples/server/FRACTAL_TASKS_DIR/6/fractal-tasks-core/1.5.3/venv/lib/python3.11/site-packages/pydantic/_internal/_validate_call.py", line 96, in __call__
    res = self.__pydantic_validator__.validate_python(pydantic_core.ArgsKwargs(args, kwargs))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/homes/fractal/20230627_joel_fractal/fractal-demos/examples/server/FRACTAL_TASKS_DIR/6/fractal-tasks-core/1.5.3/venv/lib/python3.11/site-packages/fractal_tasks_core/tasks/apply_registration_to_image.py", line 216, in apply_registration_to_image
    curr_table = ad.read_zarr(table_dict[table])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/homes/fractal/20230627_joel_fractal/fractal-demos/examples/server/FRACTAL_TASKS_DIR/6/fractal-tasks-core/1.5.3/venv/lib/python3.11/site-packages/anndata/_io/zarr.py", line 87, in read_zarr
    adata = read_dispatched(f, callback=callback)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/homes/fractal/20230627_joel_fractal/fractal-demos/examples/server/FRACTAL_TASKS_DIR/6/fractal-tasks-core/1.5.3/venv/lib/python3.11/site-packages/anndata/experimental/_dispatch_io.py", line 42, in read_dispatched
    return reader.read_elem(elem)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/data/homes/fractal/20230627_joel_fractal/fractal-demos/examples/server/FRACTAL_TASKS_DIR/6/fractal-tasks-core/1.5.3/venv/lib/python3.11/site-packages/anndata/_io/utils.py", line 211, in func_wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/data/homes/fractal/20230627_joel_fractal/fractal-demos/examples/server/FRACTAL_TASKS_DIR/6/fractal-tasks-core/1.5.3/venv/lib/python3.11/site-packages/anndata/_io/specs/registry.py", line 271, in read_elem
    iospec = get_spec(elem)
             ^^^^^^^^^^^^^^
  File "/data/homes/fractal/20230627_joel_fractal/fractal-demos/examples/server/FRACTAL_TASKS_DIR/6/fractal-tasks-core/1.5.3/venv/lib/python3.11/site-packages/anndata/_io/specs/registry.py", line 233, in get_spec
    {
  File "/data/homes/fractal/20230627_joel_fractal/fractal-demos/examples/server/FRACTAL_TASKS_DIR/6/fractal-tasks-core/1.5.3/venv/lib/python3.11/site-packages/anndata/_io/specs/registry.py", line 234, in <dictcomp>
    k: _read_attr(elem.attrs, k, "")
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/homes/fractal/20230627_joel_fractal/fractal-demos/examples/server/PYTHON_TASKS/fractal-tasks-311/lib/python3.11/functools.py", line 909, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/homes/fractal/20230627_joel_fractal/fractal-demos/examples/server/FRACTAL_TASKS_DIR/6/fractal-tasks-core/1.5.3/venv/lib/python3.11/site-packages/anndata/compat/__init__.py", line 169, in _read_attr
    return attrs.get(name, default=default)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen _collections_abc>", line 774, in get
  File "/data/homes/fractal/20230627_joel_fractal/fractal-demos/examples/server/FRACTAL_TASKS_DIR/6/fractal-tasks-core/1.5.3/venv/lib/python3.11/site-packages/zarr/attrs.py", line 74, in __getitem__
    return self.asdict()[item]
           ^^^^^^^^^^^^^
  File "/data/homes/fractal/20230627_joel_fractal/fractal-demos/examples/server/FRACTAL_TASKS_DIR/6/fractal-tasks-core/1.5.3/venv/lib/python3.11/site-packages/zarr/attrs.py", line 55, in asdict
    d = self._get_nosync()
        ^^^^^^^^^^^^^^^^^^
  File "/data/homes/fractal/20230627_joel_fractal/fractal-demos/examples/server/FRACTAL_TASKS_DIR/6/fractal-tasks-core/1.5.3/venv/lib/python3.11/site-packages/zarr/attrs.py", line 42, in _get_nosync
    data = self.store[self.key]
           ~~~~~~~~~~^^^^^^^^^^
  File "/data/homes/fractal/20230627_joel_fractal/fractal-demos/examples/server/FRACTAL_TASKS_DIR/6/fractal-tasks-core/1.5.3/venv/lib/python3.11/site-packages/zarr/storage.py", line 1118, in __getitem__
    return self._fromfile(filepath)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/homes/fractal/20230627_joel_fractal/fractal-demos/examples/server/FRACTAL_TASKS_DIR/6/fractal-tasks-core/1.5.3/venv/lib/python3.11/site-packages/zarr/storage.py", line 1092, in _fromfile
    with open(fn, "rb") as f:
         ^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/data/active/kgroot/inmed_retrieve/20200314_GabriDrugs8hc1.zarr/E/12/0/tables/well_ROI_table/.zattrs'
Error raised while reading key '' of <class 'zarr.hierarchy.Group'> from /

We should add FileNotFoundError to the exceptions that we catch and retry on. The current handler only covers:

except (
    zarr.errors.GroupNotFoundError,
    zarr.errors.PathNotFoundError,
):
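
A minimal sketch of what the extended retry could look like. This is not the existing fractal-tasks-core code: the helper name `read_with_retry` and the parameters `retry_on`, `max_attempts` and `delay_s` are hypothetical, chosen only for illustration. The sketch uses only the standard library, so the exception tuple defaults to `FileNotFoundError`; in the task itself the tuple would also list the two zarr exceptions shown above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def read_with_retry(
    read_fn: Callable[[], T],
    retry_on: tuple[type[BaseException], ...] = (FileNotFoundError,),
    max_attempts: int = 5,
    delay_s: float = 1.0,
) -> T:
    """Call read_fn, retrying when it raises one of the retry_on exceptions.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return read_fn()
        except retry_on:
            # On the last attempt, give up and surface the original error.
            if attempt == max_attempts:
                raise
            # Otherwise wait briefly so a concurrent writer can finish.
            time.sleep(delay_s)
    # Only reachable if max_attempts < 1, so the loop body never ran.
    raise ValueError("max_attempts must be at least 1")
```

Assuming the failing read is the `ad.read_zarr(table_dict[table])` call from the traceback, it could be wrapped as `read_with_retry(lambda: ad.read_zarr(table_dict[table]), retry_on=(zarr.errors.GroupNotFoundError, zarr.errors.PathNotFoundError, FileNotFoundError))`.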

Status: Done
