
Performance of bioio compared to native readers #144

@frauzufall

Issue

First of all, thank you so much for your work on bioio!

We are seeing significantly slower image reading with bioio than with the native readers it wraps, and we would like to know whether there is room to improve this.

This issue is related to #130.

Background

We want to use bioio for pixel-patrol, a tool for assessing image quality and consistency within and between different image collections. Bioio seems to be the perfect fit for unifying metadata readouts across formats. Therefore, we load each image of a folder using bioio and write statistics and metadata for each file into one big table.

How to reproduce

The code for benchmarking and profiling is available at https://gist.github.com/frauzufall/a4c5b82cafc1c9707c2c8ffd07dd1107. Alternatively, run it directly with uv:

uv run https://gist.githubusercontent.com/frauzufall/a4c5b82cafc1c9707c2c8ffd07dd1107/raw/a45513e210111624174b251477c69c0ae8830ea8/benchmark_bioio_vs_native.py
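
For readers who do not want to open the gist, the kind of comparison it performs can be sketched roughly as follows. This is not the gist's code: the helper names `average_load_time` and `compare` are made up for illustration, and the default of 50 runs is taken from the reports in this issue.

```python
import time
from statistics import mean
from typing import Callable, Iterable, Iterator, Tuple


def average_load_time(load: Callable[[str], object], path: str, runs: int = 50) -> float:
    """Return the mean wall-clock time in seconds of calling load(path) `runs` times."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        load(path)
        timings.append(time.perf_counter() - start)
    return mean(timings)


def compare(
    native: Callable[[str], object],
    wrapped: Callable[[str], object],
    paths: Iterable[str],
    runs: int = 50,
) -> Iterator[Tuple[str, float, float]]:
    """Yield (path, native_seconds, wrapped_seconds) for each file."""
    for path in paths:
        yield (
            path,
            average_load_time(native, path, runs),
            average_load_time(wrapped, path, runs),
        )
```

In our setting, `native` would be something like `tifffile.imread` and `wrapped` something like `lambda p: BioImage(p).data`. Both are passed in as plain callables so the sketch itself depends only on the standard library.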

Here are some statistics:

PNG:

======================================================================
               PNG Loading Speed Comparison Report              
======================================================================
Number of runs per file: 50

File Name                 | Size (MB)  | imageio (s)     | bioio (s)       | % Higher  
--------------------------------------------------------------------------------
test_image_1000x1000.png  | 0.01       | 0.005756        | 0.031467        | 446.67   %
test_image_100x100.png    | 0.00       | 0.000239        | 0.006888        | 2778.68  %
test_image_2000x2000.png  | 0.02       | 0.037071        | 0.128597        | 246.90   %
test_image_4000x4000.png  | 0.07       | 0.152327        | 0.553192        | 263.16   %
test_image_500x500.png    | 0.00       | 0.001510        | 0.012852        | 751.21   %
test_image_8000x8000.png  | 0.25       | 0.644278        | 2.054795        | 218.93   %

======================================================================
Overall Summary:
----------------------------------------------------------------------
Total average loading time across all PNG images (Imageio): 0.841181 s
Total average loading time across all PNG images (BioIO):    2.787791 s

Conclusion: BioIO (PNG) is slower than Imageio (PNG) by approximately 231.41% (Total difference: 1.946610 s).

TIFF:

======================================================================
               TIFF Loading Speed Comparison Report              
======================================================================
Number of runs per file: 50

File Name                 | Size (MB)  | tifffile (s)    | bioio (s)       | % Higher  
--------------------------------------------------------------------------------
test_image_1000x1000.tiff | 0.95       | 0.000207        | 0.003340        | 1515.81  %
test_image_100x100.tiff   | 0.01       | 0.000150        | 0.002234        | 1386.47  %
test_image_2000x2000.tiff | 3.81       | 0.000362        | 0.007007        | 1833.80  %
test_image_4000x4000.tiff | 15.26      | 0.001776        | 0.028777        | 1520.20  %
test_image_500x500.tiff   | 0.24       | 0.000153        | 0.003179        | 1980.20  %
test_image_8000x8000.tiff | 61.04      | 0.014028        | 0.104423        | 644.39   %

======================================================================
Overall Summary:
----------------------------------------------------------------------
Total average loading time across all TIFF images (Tifffile): 0.016676 s
Total average loading time across all TIFF images (BioIO):    0.148960 s

Conclusion: BioIO (TIFF) is slower than Tifffile (TIFF) by approximately 793.24% (Total difference: 0.132283 s).
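
A note on reading the "% Higher" column: the reported values are consistent with a relative difference, (bioio - native) / native * 100, so 446.67 % means bioio took roughly 5.5 times as long, not 4.5 times. The formula below is inferred from the numbers in the tables, and the function name is ours:

```python
def percent_higher(native_s: float, wrapped_s: float) -> float:
    """Relative slowdown of the wrapped reader over the native one, in percent."""
    return (wrapped_s - native_s) / native_s * 100.0


# Times copied from the 1000x1000 PNG row; the report rounds them to six
# decimals, so the result can differ from the table in the last digit.
slowdown = percent_higher(0.005756, 0.031467)
print(f"{slowdown:.1f} % higher, i.e. {1 + slowdown / 100:.1f}x the native time")
```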


Below are some screenshots from profiling (first bioio, then the native reader).

PNG (bioio with imageio):
Image

PNG (imageio):
Image

TIFF (bioio with tifffile):
Image

TIFF (tifffile):
Image

Bonus screenshot: TIFF read via bioio, but with many small files (plugin discovery shows up more prominently here):
Image

Wild guesses

Without any knowledge of the bioio implementation, it looks like the same read method is called twice. For TIFF, tokenizing also seems expensive and is called repeatedly. Is it required?

These seem to be issues related to delayed vs direct array loading. ChatGPT is of the opinion that this line is problematic, but I don't feel competent enough to judge its significance or implications.

Also, the plugin discovery mechanism is quite costly when bioio is used in a loop over many images. Can this be cached somehow?

We wonder if there are ways to improve the performance of bioio and, if needed, are also happy to contribute to efforts in that direction.

Best,
Deborah

Labels: bug (Something isn't working)
