Closed
Description
Minimal, reproducible code sample, a copy-pastable example if possible
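# Bob Jenkins's lookup3 can be used to verify HDF5 data structures, such as the superblock below
import jenkins_cffi

with open("original_hdf5_zarr_shard_demo.h5", "rb") as f:
    # Read the first 48 bytes (the superblock); the last 4 bytes hold the stored checksum
    b = f.read(48)
# Hash everything except the trailing checksum and encode the result as little-endian
hash_bytes = jenkins_cffi.hashlittle(bytes(b[:-4])).to_bytes(4, "little")
print(b[-4:] == hash_bytes)  # True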
Problem description
Bob Jenkins's lookup3 hash is an integral part of the HDF5 file format specification, which uses it as the checksum for its internal data structures.
This becomes relevant if we would like to reuse HDF5 data structures. For example, the HDF5 Fixed Array Data Block can be made byte-compatible with the proposed Zarr shard specification, except for the four-byte checksum. Currently, the only permitted checksum is crc32.
zarr-developers/zarr-specs#152 (comment)
Implementations of Bob Jenkins's lookup3 are widely available in many languages:
- https://en.wikipedia.org/wiki/Jenkins_hash_function
- https://www.burtleburtle.net/bob/hash/doobs.html
- C: https://www.burtleburtle.net/bob/c/lookup3.c
- C: https://github.com/HDFGroup/hdf5/blob/3af8bb267d6ad2d4e8e5d77e16d5a8f7625ad34d/src/H5checksum.c#L364-L458
- Python: https://pypi.org/project/jenkins/ (old PyPI package)
- Python: https://pypi.org/project/jenkins-cffi/
- JavaScript: https://www.npmjs.com/package/jenkins-hash-lookup3
- Julia: https://github.com/JuliaIO/JLD2.jl/blob/master/src/Lookup3.jl
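To illustrate how small the algorithm is, here is a minimal pure-Python sketch of `hashlittle`, written from the byte-wise path of the reference `lookup3.c` linked above. The names `lookup3_hashlittle` and `verify_trailing_checksum` are hypothetical helpers introduced for this example and are not part of `jenkins_cffi` or any Zarr API. The sketch assumes the checksum is stored as the last four bytes of a block in little-endian order with an initial value of 0, as in the superblock sample above; it is not optimized and is only meant as a cross-check.

```python
import struct

MASK = 0xFFFFFFFF  # keep all arithmetic in 32 bits


def _rot(x, k):
    """Rotate a 32-bit value left by k bits."""
    return ((x << k) | (x >> (32 - k))) & MASK


def _mix(a, b, c):
    """Reversible mixing step applied after each full 12-byte block."""
    a = (a - c) & MASK; a ^= _rot(c, 4);  c = (c + b) & MASK
    b = (b - a) & MASK; b ^= _rot(a, 6);  a = (a + c) & MASK
    c = (c - b) & MASK; c ^= _rot(b, 8);  b = (b + a) & MASK
    a = (a - c) & MASK; a ^= _rot(c, 16); c = (c + b) & MASK
    b = (b - a) & MASK; b ^= _rot(a, 19); a = (a + c) & MASK
    c = (c - b) & MASK; c ^= _rot(b, 4);  b = (b + a) & MASK
    return a, b, c


def _final(a, b, c):
    """Final avalanche over the last (partial) block; returns c."""
    c ^= b; c = (c - _rot(b, 14)) & MASK
    a ^= c; a = (a - _rot(c, 11)) & MASK
    b ^= a; b = (b - _rot(a, 25)) & MASK
    c ^= b; c = (c - _rot(b, 16)) & MASK
    a ^= c; a = (a - _rot(c, 4)) & MASK
    b ^= a; b = (b - _rot(a, 14)) & MASK
    c ^= b; c = (c - _rot(b, 24)) & MASK
    return c


def lookup3_hashlittle(data: bytes, initval: int = 0) -> int:
    """Hypothetical pure-Python equivalent of lookup3's hashlittle()."""
    length = len(data)
    a = b = c = (0xDEADBEEF + length + initval) & MASK
    offset = 0
    # Consume 12-byte blocks while MORE than 12 bytes remain.
    while length - offset > 12:
        k0, k1, k2 = struct.unpack_from("<3I", data, offset)
        a = (a + k0) & MASK
        b = (b + k1) & MASK
        c = (c + k2) & MASK
        a, b, c = _mix(a, b, c)
        offset += 12
    tail = data[offset:]
    if not tail:  # only reachable for empty input
        return c
    # Zero-padding the tail matches adding only the bytes that are present.
    k0, k1, k2 = struct.unpack("<3I", tail.ljust(12, b"\0"))
    a = (a + k0) & MASK
    b = (b + k1) & MASK
    c = (c + k2) & MASK
    return _final(a, b, c)


def verify_trailing_checksum(block: bytes) -> bool:
    """Assumes the last 4 bytes of block are a little-endian lookup3 checksum."""
    expected = lookup3_hashlittle(block[:-4]).to_bytes(4, "little")
    return block[-4:] == expected


if __name__ == "__main__":
    payload = b"example index bytes"  # made-up payload, not real HDF5 data
    block = payload + lookup3_hashlittle(payload).to_bytes(4, "little")
    print(verify_trailing_checksum(block))
```

With such a helper, the superblock check from the code sample would reduce to `verify_trailing_checksum(b)`. Results should be compared against `jenkins_cffi.hashlittle` or the C reference before relying on this sketch.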
Version and installation information
Please provide the following:
jenkins-cffi 1.0.2.1 pypi_0 pypi
python 3.11.0 he550d4f_1_cpython conda-forge