Skip to content

Fractal/NGIO logging #867

@tcompa

Description

@tcompa

Re: logging (extracting the discussion from #866, cc @lorenzocerrone @jluethi )


  1. The place where we want to be careful about logging is ngio, because it's a library that will be re-used in other Python modules. The same applies to fractal-tasks-core core library, but loggers defined in fractal-tasks-core tasks are not re-used by anyone outside that task - thus their only requirement is that their format includes a timestamp (unrelated: possibly in a custom timezone).

  2. Logger names are used as logger identifier by the logging library, and calling logging.getLogger("some.logger") twice returns the same object - even from two different modules (see e.g. here).

  3. Right now, fractal-tasks-core logging is configured in a way that does not display the logger name (details in "Where does the logger name appear?" below). This was less relevant in the past, because most/all logs came from the same logger (fractal-tasks-core). Now that we are including two different loggers, I suggest we pick reasonable/brief names and display them.

  4. Loggers should not be passed around (across functions or modules) as Python objects. That is, I think we should move from

from ngio.utils import ngio_logger

to

from ngio.utils import ngio_logger_name
ngio_loggger = logging.getLogger(ngio_logger_name)

This is my understanding of https://docs.python.org/3/howto/logging-cookbook.html#using-logging-in-multiple-modules and https://docs.python.org/3/howto/logging-cookbook.html#using-loggers-as-attributes-in-a-class-or-passing-them-as-parameters, and that's also how we are typically setting up things in fractal repositories.

  1. In general, I don't think we should ever import/use the ngio logger in fractal-tasks-core tasks.
    The reason becomes more evident in a case where we do show logger names:
# This seems correct, with different loggers for the current task and for ngio operations
2024-10-10 10:10:10 - fractal_tasks_core.tasks.projection - INFO - start task
2024-10-10 10:10:11 - fractal_tasks_core.tasks.projection - INFO - zarr_url=/some/ome.zarr/B/03/0
2024-10-10 10:10:12 - fractal_tasks_core.tasks.projection - INFO - preliminary checks are OK
2024-10-10 10:10:13 - ngio.loader - WARNING - I found the wrong dtype in some zarr array, continue anyway
2024-10-10 10:10:14 - fractal_tasks_core.tasks.projection - INFO - end task

# This seems wrong: the "start task" does not belong to ngio
2024-10-10 10:10:10 - ngio.something - INFO - start task
2024-10-10 10:10:11 - ngio.something - INFO - zarr_url=/some/ome.zarr/B/03/0
2024-10-10 10:10:12 - ngio.something - INFO - preliminary checks are OK
2024-10-10 10:10:13 - ngio.loader - WARNING - I found the wrong dtype in some zarr array, continue anyway
2024-10-10 10:10:14 - ngio.something - INFO - end task

(note that I showed the most complete example where logger names also include the specific module, but they could also simply be projection_task and ngio).

Where does the logger name appear?

When does run_fractal_task use the logger name? (by @jluethi)

The logger name determines which logger will be used in this block

    # Run task
    logger.info(f"START {task_function.__name__} task")
    metadata_update = task_function(**pars)
    logger.info(f"END {task_function.__name__} task")

However, this logger_name is currently not displayed in any log, due to our logging configuration (defined in __init__.py).
This snippet clarifies how we are configuring logging now, and how we could change it if we wanted to display the name:

import logging

# Create two loggers
logger1 = logging.getLogger("logger_name_1")
logger2 = logging.getLogger("logger_name_2")

# Use the configuration which is currently in use for fractal-tasks-core loggers
logging.basicConfig(level=logging.INFO, format="%(asctime)s; %(levelname)s; %(message)s")

logger1.error("This is an error from logger 1")
# 2024-11-19 09:00:55,532; ERROR; This is an error from logger 1

logger2.error("This is an error from logger 2")
# 2024-11-19 09:00:55,532; ERROR; This is an error from logger 2

# Replace configuration with a new one, which includes logger name
logging.basicConfig(level=logging.INFO, format="%(asctime)s; %(name)s; %(levelname)s; %(message)s", force=True)

logger1.error("This is an error from logger 1")
# 2024-11-19 09:00:55,532; logger_name_1; ERROR; This is an error from logger 1

logger2.error("This is an error from logger 2")
# 2024-11-19 09:00:55,532; logger_name_2; ERROR; This is an error from logger 2

This is also clear in the following task logs, where the logger name does not appear anywhere:

2024-11-18 12:53:24,431; INFO; START cellvoyager_to_ome_zarr_init task
2024-11-18 12:53:24,431; INFO; [glob_with_multiple_patterns] include_patterns=['*.png']
2024-11-18 12:53:24,438; INFO; [glob_with_multiple_patterns] Found 4 items
2024-11-18 12:53:24,438; INFO; Creating 20200812-CardiomyocyteDifferentiation14-Cycle1.zarr
2024-11-18 12:53:24,438; INFO; Start open_zarr_group_with_overwrite (overwrite=True).
2024-11-18 12:53:24,452; INFO; Zarr group /data/zarrs/1731934120-1/20200812-CardiomyocyteDifferentiation14-Cycle1.zarr already exists, with keys=['B']
2024-11-18 12:53:24,477; INFO; Read /data/images/10.5281_zenodo.8287221/MeasurementData.mlf, apply following include patterns to image filenames: None apply the following exlcude patterns to image filenames: None
2024-11-18 12:53:24,499; INFO; Expected number of images for well B03: 4
2024-11-18 12:53:24,500; INFO; removing FOV overlaps for well='B03'
2024-11-18 12:53:24,502; INFO; [glob_with_multiple_patterns] include_patterns=['20200812-CardiomyocyteDifferentiation14-Cycle1_*.png']
2024-11-18 12:53:24,502; INFO; [glob_with_multiple_patterns] Found 4 items
2024-11-18 12:53:24,502; INFO; [glob_with_multiple_patterns] include_patterns=['20200812-CardiomyocyteDifferentiation14-Cycle1_B03_*.png']
2024-11-18 12:53:24,502; INFO; [glob_with_multiple_patterns] Found 4 items
/data/remote/tasks/1/fractal-tasks-core/1.3.2/venv/lib/python3.10/site-packages/pydantic/main.py:364: UserWarning: Pydantic serializer warnings:
  Expected `int` but got `int64` - serialized value may not be as expected
  return self.__pydantic_serializer__.to_python(
2024-11-18 12:53:24,531; INFO; END cellvoyager_to_ome_zarr_init task

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    Status

    TODO

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions