| 1 | +# Blosc2 filter for HDF5 |
| 2 | + |
| 3 | +This is a filter for HDF5 that uses the Blosc2 compressor; by installing this |
| 4 | +filter, you can read and write HDF5 files with Blosc2-compressed datasets. |
| 5 | + |
| 6 | +Take care when using this filter: do not enable the shuffle filter |
| 7 | +in HDF5 itself; enable shuffling through Blosc2 instead. |
| 8 | +This is because Blosc2 uses an internal SIMD shuffle, which |
| 9 | +is much faster. |
| 10 | + |
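To make the shuffle mentioned above concrete, here is a minimal scalar sketch of a byte shuffle in C. It is only an illustration of the transform: it is not Blosc2's SIMD implementation, and the names `byte_shuffle` and `byte_unshuffle` are hypothetical, not part of Blosc2 or HDF5.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative scalar byte shuffle (NOT Blosc2's SIMD code).
 * Treats src as nelems elements of typesize bytes each and writes
 * byte 0 of every element first, then byte 1 of every element, etc.
 * The transform is out-of-place, so src and dst must not alias. */
void byte_shuffle(const uint8_t *src, uint8_t *dst,
                  size_t typesize, size_t nelems)
{
    assert(src != dst);
    for (size_t b = 0; b < typesize; b++) {
        for (size_t i = 0; i < nelems; i++) {
            dst[b * nelems + i] = src[i * typesize + b];
        }
    }
}

/* Inverse transform: restores the element-by-element layout. */
void byte_unshuffle(const uint8_t *src, uint8_t *dst,
                    size_t typesize, size_t nelems)
{
    assert(src != dst);
    for (size_t b = 0; b < typesize; b++) {
        for (size_t i = 0; i < nelems; i++) {
            dst[i * typesize + b] = src[b * nelems + i];
        }
    }
}
```

Grouping bytes of equal significance tends to produce longer runs of similar values for numeric data, which is why shuffling usually helps the compressor. Applying the HDF5 shuffle on top of this would do the same work twice without the SIMD speed-up.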
| 11 | +## Installing the Blosc2 filter plugin |
| 12 | + |
| 13 | +Instead of just linking this Blosc2 filter into your HDF5 application, it is possible to install |
| 14 | +it as a system-wide HDF5 plugin (with HDF5 1.8.11 or later). This is useful because it allows |
| 15 | +*every* HDF5-using program on your system to transparently read Blosc2-compressed HDF5 files. |
| 16 | + |
| 17 | +As described in the [HDF5 plugin documentation](https://portal.hdfgroup.org/display/HDF5/HDF5+Dynamically+Loaded+Filters), you just need to compile the Blosc2 plugin into a shared library and |
| 18 | +copy it to the plugin directory (which defaults to ``/usr/local/hdf5/lib/plugin`` on non-Windows systems). |
| 19 | + |
| 20 | +Following the ``cmake`` instructions below produces a ``libH5Zblosc2.so`` shared library |
| 21 | +file (or ``.dylib``/``.dll`` on Mac/Windows), that you can copy to the HDF5 plugin directory. |
| 22 | + |
| 23 | +To *write* Blosc2-compressed HDF5 files, on the other hand, an HDF5-using program must be |
| 24 | +specially modified to enable the Blosc2 filter when writing HDF5 datasets, as described below. |
| 25 | + |
| 26 | + |
| 27 | +## Linking the Blosc2 filter directly into your program |
| 28 | + |
| 29 | +Instead of (or in addition to) installing the Blosc2 plugin system-wide as |
| 30 | +described above, you can also link the Blosc2 filter directly into your |
| 31 | +application. Although this only makes the Blosc2 filter available in |
| 32 | +your application (as opposed to other HDF5-using applications), it |
| 33 | +is useful in cases where installing the plugin is inconvenient. Compile |
| 34 | +the Blosc2 filter as described in the "Compiling" section below, but link ``libblosc2_filter.a`` |
| 35 | +(generated by ``make``) directly into your program. |
| 36 | + |
| 37 | +To register Blosc2 in your HDF5 application, you then need to call |
| 38 | +a function declared in ``blosc2_filter.h``, with the following signature: |
| 39 | + |
| 40 | +```C |
| 41 | +int register_blosc2(char **version, char **date) |
| 42 | +``` |
| 43 | +
|
| 44 | +Calling this registers the filter with the HDF5 library and |
| 45 | +returns information about the Blosc2 release through the `version` and `date` |
| 46 | +arguments. |
| 47 | +
|
| 48 | +A non-negative return value indicates success. If the registration |
| 49 | +fails, an error is pushed onto the current error stack and a negative |
| 50 | +value is returned. |
| 51 | +
|
| 52 | +An example C program (``src/example.c``) is included which demonstrates |
| 53 | +the proper use of the filter. |
| 54 | +
|
| 55 | +This filter has been tested against HDF5 versions 1.6.5 through |
| 56 | +1.8.10. It is released under the MIT license (see LICENSE.txt for |
| 57 | +details). |
| 58 | +
|
| 59 | +## Using the Blosc2 filter in your application |
| 60 | +
|
| 61 | +Assuming the filter is installed (either as a system-wide plugin or registered |
| 62 | +directly in your program as described above), your application can transparently |
| 63 | +*read* HDF5 files with Blosc2-compressed datasets. (The HDF5 library will detect |
| 64 | +that the dataset is Blosc2-compressed and invoke the filter automatically.) |
| 65 | +
|
| 66 | +To *write* an HDF5 file with a Blosc2-compressed dataset, you call the |
| 67 | +[H5Pset_filter](https://www.hdfgroup.org/HDF5/doc/RM/RM_H5P.html#Property-SetFilter) function |
| 68 | +on the property list of the dataset you are creating, and pass ``FILTER_BLOSC2`` |
| 69 | +(defined in ``blosc2_filter.h``) for the ``filter_id`` parameter. In addition, HDF5 |
| 70 | +only supports compression for "chunked" datasets; this just means that you need to |
| 71 | +call [H5Pset_chunk](https://www.hdfgroup.org/HDF5/doc/RM/RM_H5P.html#Property-SetChunk) to |
| 72 | +specify a chunk size (e.g. 1 MB chunks), and the subsequent chunking of the dataset I/O |
| 73 | +is performed transparently by HDF5. |
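As a rough illustration of picking a chunk shape close to the 1 MB figure mentioned above, the following sketch computes how many rows of a 2-D dataset fit in a target chunk size. The helper `chunk_rows_for_target` is hypothetical (it is not part of HDF5 or of this filter), it uses `unsigned long long` where an HDF5 program would use `hsize_t`, and it assumes "1 MB" means 1048576 bytes.

```c
#include <stddef.h>

/* Hypothetical helper (not part of HDF5 or the Blosc2 filter):
 * choose how many full rows of a 2-D dataset fit in one chunk of
 * about target_bytes. The result is clamped to the range [1, nrows];
 * a degenerate (empty) shape yields 0. */
unsigned long long chunk_rows_for_target(unsigned long long nrows,
                                         unsigned long long ncols,
                                         size_t elem_size,
                                         unsigned long long target_bytes)
{
    unsigned long long row_bytes = ncols * (unsigned long long)elem_size;
    if (row_bytes == 0 || nrows == 0) {
        return 0;  /* nothing to chunk */
    }
    unsigned long long rows = target_bytes / row_bytes;
    if (rows < 1) {
        rows = 1;      /* a single row already exceeds the target */
    }
    if (rows > nrows) {
        rows = nrows;  /* the whole dataset is smaller than the target */
    }
    return rows;
}
```

For a dataset of 4-byte elements with 512 columns, this gives 512 rows per chunk. In a real program the pair `{rows, ncols}` could be passed as the dimensions array to `H5Pset_chunk` with a rank of 2, before the Blosc2 filter is added to the same property list.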
| 74 | +
|
| 75 | +## Compiling |
| 76 | +
|
| 77 | +The filter consists of a single ``src/blosc2_filter.c`` source file and |
| 78 | +``src/blosc2_filter.h`` header, which will need the Blosc2 library |
| 79 | +installed to work. It is simplest to just use the provided ``cmake`` |
| 80 | +build scripts, which compile both the filter and the Blosc2 library |
| 81 | +into a library for you. |
| 82 | +
|
| 83 | +Assuming you have [cmake](http://www.cmake.org/) and other standard |
| 84 | +Unix build tools installed, do: |
| 85 | +
|
| 86 | + mkdir build |
| 87 | + cd build |
| 88 | + cmake .. |
| 89 | + make |
| 90 | +
|
| 91 | +This generates the library/plugin files required above in the ``build`` |
| 92 | +directory. |
| 93 | +
|
| 94 | +## Acknowledgments |
| 95 | +
|
| 96 | +This filter was initially written by Francesc Alted, based on the |
| 97 | +previous Blosc filter for HDF5. Ivan Vilata i Balaguer contributed |
| 98 | +the support for Blosc2 NDim format. |
| 99 | +
|
| 100 | +Oscar Guiñón and Tom Birch helped with testing, debugging and fixing |
| 101 | +the filter. Thomas Vincent contributed the integration of this |
| 102 | +work into the [hdf5plugin](https://hdf5plugin.readthedocs.io) project. |
| 103 | +
|
| 104 | +## License |
| 105 | +
|
| 106 | +This software is released under the MIT license. See LICENSE.txt for |
| 107 | +details. |
| 108 | +
|
| 109 | + **Enjoy data!** |