Unable to open dataset that was created with Blosc(2) compression #1757

@Leonard-Mueller

Description

Hi, I created a dataset with the following Python code:

    import hdf5plugin
    import dask.array as da

    compression_options = hdf5plugin.Blosc2(
        cname='blosclz',  # Blosc2 supports 'zstd', 'lz4', 'blosclz', etc.
        clevel=9,  # Compression level (0-9)
        filters=hdf5plugin.Blosc2.SHUFFLE,  # Better for floating-point data
    )

    if isinstance(da_arr.data, da.Array):
        chunks = da_arr.data.chunksize
        is_dask = True
    else:
        chunks = None
        is_dask = False

    dset = f.create_dataset(
        dset_name,
        shape=da_arr.shape,  # Keep original shape
        dtype=da_arr.dtype,  # Ensure proper data type
        chunks=chunks,  # Copy the chunk size of the underlying dask array
        compression=compression_options,
    )

The h5web viewer (in VS Code) gives this error:

HDF5-DIAG: Error detected in HDF5 (1.14.2) thread 0:
  #000: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5D.c line 1061 in H5Dread(): can't synchronously read data
    major: Dataset
    minor: Read failed
  #001: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5D.c line 1008 in H5D__read_api_common(): can't read data
    major: Dataset
    minor: Read failed
  #002: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5VLcallback.c line 2092 in H5VL_dataset_read_direct(): dataset read failed
    major: Virtual Object Layer
    minor: Read failed
  #003: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5VLcallback.c line 2048 in H5VL__dataset_read(): dataset read failed
    major: Virtual Object Layer
    minor: Read failed
  #004: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5VLnative_dataset.c line 363 in H5VL__native_dataset_read(): can't read data
    major: Dataset
    minor: Read failed
  #005: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5Dio.c line 279 in H5D__read(): can't initialize I/O info
    major: Dataset
    minor: Unable to initialize object
  #006: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5Dchunk.c line 1088 in H5D__chunk_io_init(): unable to create file and memory chunk selections
    major: Dataset
    minor: Unable to initialize object
  #007: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5Dchunk.c line 1231 in H5D__chunk_io_init_selections(): unable to create file chunk selections
    major: Dataset
    minor: Unable to initialize object
  #008: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5Dchunk.c line 1692 in H5D__create_piece_file_map_all(): can't insert chunk into skip list
    major: Dataspace
    minor: Unable to insert object
  #009: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5SL.c line 1036 in H5SL_insert(): can't create new skip list node
    major: Skip Lists
    minor: Unable to insert object
  #010: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5SL.c line 709 in H5SL__insert_common(): can't insert duplicate key
    major: Skip Lists
    minor: Unable to insert object

This is the dataset's metadata as displayed by h5web:

{
  "name": "slope_map",
  "path": "/slope_map",
  "attributes": [
    {
      "name": "DIMENSION_LIST",
      "shape": [
        2
      ],
      "type": {
        "class": "Array (variable length)",
        "base": {
          "class": "Reference"
        }
      }
    }
  ],
  "kind": "dataset",
  "shape": [
    12000,
    12001
  ],
  "type": {
    "class": "Float",
    "endianness": "little-endian",
    "size": 32
  },
  "chunks": [
    1000,
    1000
  ],
  "filters": [
    {
      "id": 32026,
      "name": "blosc2",
      "cd_values": [
        1,
        0,
        4,
        4000000,
        9,
        1,
        0,
        2,
        1000,
        1000
      ]
    }
  ],
  "rawType": {
    "signed": false,
    "type": 1,
    "vlen": false,
    "littleEndian": true,
    "size": 4,
    "total_size": 144012000
  }
}

With gzip compression, for instance, it works just fine. I can also read the dataset back with the h5py library. Should h5web be able to read Blosc and Blosc2 compressed datasets out of the box?

Metadata

Assignees: no one assigned
Labels: bug (Something isn't working)