Releases: Blosc/python-blosc2

Release 3.7.2

26 Aug 11:24

Changes from 3.7.1 to 3.7.2

  • C-Blosc2 internal library updated to latest 2.21.1.

  • Reverted the signature of TreeStore.__init__ so that benchmarks return
    to normal performance.

Release 3.7.1

17 Aug 11:40

Changes from 3.7.0 to 3.7.1

  • Added C2Array.slice() method and C2Array.nbytes, C2Array.cbytes, C2Array.cratio, C2Array.vlmeta and C2Array.info properties (PR #455).

  • Many usability improvements to the TreeStore class and friends.

  • New section about TreeStore in basics NDArray tutorial.

  • New blog post about TreeStore usage and performance at: https://www.blosc.org/posts/new-treestore-blosc2

  • C-Blosc2 internal library updated to latest 2.21.0.

Blosc2 v3.7.0

12 Aug 14:58

Changes from 3.6.1 to 3.7.0

  • Overhaul of documentation (API reference and Tutorials)

  • Improvements to lazy expression indexing and in particular much more efficient memory usage when applying non-unit steps (PR #446).

  • Extended functionality of expand_dims to match that of NumPy (note that this breaks the previous API) (PR #453).

  • The biggest change is the addition of three new data storage classes (EmbedStore, DictStore and TreeStore), which allow for the efficient storage of heterogeneous array data (PR #451). EmbedStore is essentially an SChunk wrapper which can be stored on-disk or in-memory; DictStore allows for mixed storage across memory, disk or even remote locations; and TreeStore is a hierarchically-formatted version of DictStore which mimics the HDF5 file format. Write, access and storage performance are all very competitive with other packages - see plots here.

Blosc2 v3.6.1

17 Jul 16:38

Changes in Blosc2 3.6.1

  • C-Blosc2 internal library updated to v2.19.1.

Changes from Blosc2 3.6.0

  • Expose the oindex C-level functionality in Blosc2 for NDArray.

  • Implement fancy indexing which closely matches NumPy functionality, using ndindex library. Includes a fast path for 1D arrays, based on Zarr's implementation.

  • A major refactoring of slicing for lazy expressions using ndindex. We have also added support for slices with non-unit steps for reduction expressions, which has introduced improvements that could be incorporated into other lazy expression machinery in the future.

  • More complex slicing is now supported.

  • Minor bug fixes to ensure that Blosc2 indexing does not introduce dummy dimensions when NumPy does not, and a more comprehensive squeeze function which squeezes specified dimensions.

Blosc2 v3.6.0

17 Jul 15:41

Changes in Blosc2 3.6.0

  • Expose the oindex C-level functionality in Blosc2 for NDArray.

  • Implement fancy indexing which closely matches NumPy functionality, using
    ndindex library. Includes a fast path for 1D arrays, based on Zarr's implementation.

  • A major refactoring of slicing for lazy expressions using ndindex. We have also
    added support for slices with non-unit steps for reduction expressions, which has introduced
    improvements that could be incorporated into other lazy expression machinery in the future.
    More complex slicing is now supported.

  • Minor bug fixes to ensure that Blosc2 indexing does not introduce dummy dimensions when NumPy does not,
    and a more comprehensive squeeze function which squeezes specified dimensions.
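
To make the distinction between fancy and orthogonal indexing concrete, here is a NumPy-only sketch of the two semantics. It does not call the Blosc2 oindex API, whose exact signature is not given in these notes; `np.ix_` is used here as a stand-in for orthogonal (outer) selection.

```python
import numpy as np

# A small 4x5 grid so that each value encodes its position.
grid = np.arange(20).reshape(4, 5)
rows = [0, 2]
cols = [1, 3]

# Fancy (pointwise) indexing pairs the index lists element by element:
# it picks (0, 1) and (2, 3), giving a 1D result.
fancy = grid[rows, cols]

# Orthogonal (outer) indexing takes the cross product of the index lists:
# it picks rows 0 and 2, restricted to columns 1 and 3, giving a 2x2 result.
ortho = grid[np.ix_(rows, cols)]

print(fancy)  # [ 1 13]
print(ortho)  # [[ 1  3]
              #  [11 13]]
```

The same pair of index lists therefore yields different shapes depending on which convention is applied, which is why the two are exposed as separate operations.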

Release 3.5.1

03 Jul 08:54

Changes from 3.5.0 to 3.5.1

  • Reduced memory usage when computing slices of lazy expressions.
    This is a significant improvement for large arrays (up to 20x less).
    Also, we have added a fast path for slices that are small and fit in
    memory, which can be up to 20x faster than the previous implementation.
    See PR #430.

  • blosc2.concatenate() has been renamed to blosc2.concat().
    This is in line with the Array API.
    The old name is still available for backward compatibility, but it will
    be removed in a future release.

  • Improved mode handling when concatenating to disk. See PR #428.
    This is useful for concatenating arrays that are stored on disk, and
    allows specifying the mode to use when concatenating.

Release 3.5.0

24 Jun 15:31

Changes from 3.4.0 to 3.5.0

  • New blosc2.stack() function for stacking multiple arrays along a new axis.
    Useful for creating multi-dimensional arrays from multiple 1D arrays.
    See PR #427. Thanks to Luke Shaw for the implementation!
    Blog: https://www.blosc.org/posts/blosc2-new-concatenate/#stacking-arrays

  • New blosc2.expand_dims() function for expanding the dimensions of an array.
    This is useful for adding a new axis to an array, similar to NumPy's np.expand_dims().
    See PR #427. Thanks to Luke Shaw for the implementation!

v3.4.0

13 Jun 13:24

Summary

This release adds significant new functionality in the form of concatenate. We support general concatenation of NDArrays, and offer an optimised path with significant speedups when the arrays being concatenated have compatible chunk and block shapes. In addition, there are bug fixes and more functionality for slicing lazy expressions, and it is now possible to JIT-compile user-defined functions which operate on pandas objects using the blosc2 engine.

What's Changed

Full Changelog: v3.3.4...v3.4.0

Blosc2 v3.3.4

22 May 11:04

This is a bugfix release, with some minor optimizations. We further improved the
chaining of string lazy expressions (to allow operands with more
diverse data types). In addition, both indexing and where expressions are now
supported within string lazy expressions. Finally, casting rules have
been improved to be more consistent with NumPy. In summary:

  • Expand possibilities for chaining string-based lazy expressions to incorporate
    data types which do not have a shape attribute, e.g. int, float etc.
    See #406 and PR #411.

  • Enable slicing within string-based lazy expressions. See PR #414.

  • Improved casting for string-based lazy expressions.

  • Documentation improvements, see PR #410.

  • Compatibility fixes for working with h5py files.

Release 3.3.3

14 May 16:00

Changes from 3.3.2 to 3.3.3

  • Expand possibilities for chaining string-based lazy expressions to include
    the main operand types (LazyExpr and NDArray). Other data types (those
    which do not have a shape attribute, e.g. int, float etc.) still have to
    be incorporated. See #406.

  • Fix indexing for lazy expressions, and allow use of None in getitem.
    See PR #402.

  • Fix incorrect appending of dim to computed reductions. See PR #404.

  • Fix blosc2.linspace() for incompatible num/shape. See PR #408.

  • Add support for NumPy dtypes that are n-dimensional (e.g.
    np.dtype(("<i4,>f4", (10,)))).

  • New MAX_DIM constant for the maximum number of dimensions supported.
    This is useful for checking whether a given array has too many dimensions
    to be handled.

  • More refinements on guessing cache sizes for Linux.

  • Update to C-Blosc2 2.17.2.dev. Now, we are forcing the flush of modified
    pages only in write mode for mmap files. This fixes mmap issues on Windows.
    Thanks to @JanSellner for the implementation.
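
For readers unfamiliar with the n-dimensional dtype mentioned in the list above, this NumPy-only sketch inspects the dtype from the release note. It shows how NumPy itself treats such a dtype and does not exercise any Blosc2 code.

```python
import numpy as np

# Subarray dtype from the release note: a (10,)-shaped block whose elements
# are records with a little-endian int32 and a big-endian float32 field.
dt = np.dtype(("<i4,>f4", (10,)))

print(dt.shape)       # (10,)
print(dt.base.names)  # ('f0', 'f1')
print(dt.itemsize)    # 80 bytes: 10 records of 4 + 4 bytes each

# When used to create an array, NumPy folds the subarray shape into the
# array shape and keeps the record dtype as the element type.
arr = np.zeros(3, dtype=dt)
print(arr.shape)             # (3, 10)
print(arr.dtype == dt.base)  # True
```

The extra trailing dimension in `arr.shape` is the detail that a container library has to account for when it accepts dtypes of this kind.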