@t20100 t20100 commented Oct 24, 2025

This PR adds helpers to instantiate a filter class from a filter ID and its stored filter options (`cd_values`).

  • Check stored cd_values for each filter (only tested for Bitshuffle and Blosc so far)
  • Missing filters: Sperr, SZ, SZ3, ZFP
  • Tests
  • Documentation
    - [ ] Check endianness issues with filters that pack doubles or bits: Sperr, SZ, SZ3, ZFP
    - [ ] Check which versions of each filter are supported, and raise an exception for an unsupported version when the version is stored in the cd_values

Usage:

In [1]: import hdf5plugin
In [2]: hdf5plugin.from_filter_options(32001, (0, 0, 0, 0, 4, 1, 1))
Out[2]: <hdf5plugin._filters.Blosc at 0x7f718b30c6e0>
In [3]: hdf5plugin.from_filter_options(32008, (0, 2, 4, 0, 2))
Out[3]: <hdf5plugin._filters.Bitshuffle at 0x7f7188b9cda0>
In [4]: hdf5plugin.from_filter_options(32001, (0, 0, 0, 0, 10, 1, 1))
[...]
ValueError: clevel must be in the range [0, 9]

Related to #365

@t20100 t20100 marked this pull request as ready for review January 16, 2026 13:00

t20100 commented Jan 16, 2026

As it stands, reading HDF5 filter options works for the currently embedded versions of the compression filters.
Checking that it also works with older versions is still needed, but I propose doing that in another PR (related to #219).

Same for the endianness checks: let's do them in a separate PR.


@payno payno left a comment


I thought I had caught a byte-operation error at one point... but no 😇
Thanks for the unit test :)

:raises ValueError: Unsupported filter_options
:raises NotImplementedError: Support of filter_options version is not implemented
"""
# ZFP header parsing reference:

I hope those won't change too much over time. Should this be part of the documentation?

Comment on lines +805 to +816
if mode == 2:
    return cls(
        peak_signal_to_noise_ratio=quality,
        swap=swap,
        missing_value_mode=missing_value_mode,
    )
if mode == 3:
    return cls(
        absolute=quality, swap=swap, missing_value_mode=missing_value_mode
    )
if mode == 1:
    return cls(rate=quality, swap=swap, missing_value_mode=missing_value_mode)

It is a bit unexpected to have the mode conditions ordered 2, 3, 1 instead of 1, 2, 3...
Unless there is a reason that I don't see at the moment...

Suggested change
- if mode == 2:
-     return cls(
-         peak_signal_to_noise_ratio=quality,
-         swap=swap,
-         missing_value_mode=missing_value_mode,
-     )
- if mode == 3:
-     return cls(
-         absolute=quality, swap=swap, missing_value_mode=missing_value_mode
-     )
- if mode == 1:
-     return cls(rate=quality, swap=swap, missing_value_mode=missing_value_mode)
+ if mode == 1:
+     return cls(rate=quality, swap=swap, missing_value_mode=missing_value_mode)
+ if mode == 2:
+     return cls(
+         peak_signal_to_noise_ratio=quality,
+         swap=swap,
+         missing_value_mode=missing_value_mode,
+     )
+ if mode == 3:
+     return cls(
+         absolute=quality, swap=swap, missing_value_mode=missing_value_mode
+     )

Get dataset compression
+++++++++++++++++++++++

For compression filters provided by HDF5 and `h5py`_ (i.e., GZIP, LZF, SZIP),

Maybe:

Suggested change
- For compression filters provided by HDF5 and `h5py`_ (i.e., GZIP, LZF, SZIP),
+ For **built-in** compression filters (i.e., GZIP, LZF, SZIP),

`compression <https://docs.h5py.org/en/stable/high/dataset.html#h5py.Dataset.compression>`_ and
`compression_opts <https://docs.h5py.org/en/stable/high/dataset.html#h5py.Dataset.compression_opts>`_ properties.

For third-party compression filters such as the one supported by `hdf5plugin`,

Suggested change
- For third-party compression filters such as the one supported by `hdf5plugin`,
+ For **third-party** compression filters such as the one supported by `hdf5plugin`,

self.filter_options = (0, 0, 0, 0, clevel, shuffle, compression)

@classmethod
def _from_filter_options(cls, filter_options: tuple[int, ...]) -> Blosc:

Maybe this could be a use case for `match`...
But I don't think many people use it, and it could end up more confusing than anything else.
