LibBlosc2: New codec #54

Open: wants to merge 16 commits into base: main
4 changes: 4 additions & 0 deletions .github/workflows/CI.yml
@@ -35,6 +35,10 @@ jobs:
  - ChunkCodecCore/**
  - ChunkCodecTests/**
  - LibBlosc/**
LibBlosc2:
  - ChunkCodecCore/**
  - ChunkCodecTests/**
  - LibBlosc2/**
LibBrotli:
  - ChunkCodecCore/**
  - ChunkCodecTests/**
11 changes: 11 additions & 0 deletions LibBlosc2/CHANGELOG.md
@@ -0,0 +1,11 @@
# Release Notes

All notable changes to this package will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

## Unreleased

### Added

- Initial release
21 changes: 21 additions & 0 deletions LibBlosc2/LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 Erik Schnetter

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
18 changes: 18 additions & 0 deletions LibBlosc2/Project.toml
@@ -0,0 +1,18 @@
name = "ChunkCodecLibBlosc2"
uuid = "59b5581c-e2bc-42b3-a6f1-80e88eec7b70"
authors = ["Erik Schnetter <[email protected]>"]
version = "0.1.0"

[deps]
Accessors = "7d9f7c33-5ae7-4f3b-8dc6-eff91059b697"
Blosc2_jll = "d43303dc-dd0e-56c6-b0a8-331f4c8c9bfb"
ChunkCodecCore = "0b6fb165-00bc-4d37-ab8b-79f91016dbe1"

[compat]
Accessors = "0.1.42"
Blosc2_jll = "201.1700.100"
ChunkCodecCore = "0.5.0"
julia = "1.10"

[workspace]
projects = ["test"]
32 changes: 32 additions & 0 deletions LibBlosc2/README.md
@@ -0,0 +1,32 @@
# ChunkCodecLibBlosc2

## Warning: ChunkCodecLibBlosc2 is currently a WIP and its API may drastically change at any time.

This package implements the ChunkCodec interface for the following encoders and decoders
using the c-blosc2 library <https://github.com/Blosc/c-blosc2>.

1. `Blosc2CFrame`, `Blosc2EncodeOptions`, `Blosc2DecodeOptions`

Note: It appears that the [Blosc2 Contiguous Frame
Format](https://www.blosc.org/c-blosc2/format/cframe_format.html) is
not fully protected by checksums. The [`c-blosc2`
library](https://www.blosc.org/c-blosc2) may crash (segfault) for
invalid inputs.

## Example

```julia-repl
julia> using ChunkCodecLibBlosc2

julia> data = collect(0x00:0x07);

julia> compressed_data = encode(Blosc2EncodeOptions(), data);

julia> decompressed_data = decode(Blosc2CFrame(), compressed_data; max_size=length(data), size_hint=length(data));

julia> data == decompressed_data
true
```

The low-level interface is defined in the `ChunkCodecCore` package.

57 changes: 57 additions & 0 deletions LibBlosc2/src/ChunkCodecLibBlosc2.jl
@@ -0,0 +1,57 @@
module ChunkCodecLibBlosc2

using Base.Threads

using Accessors: @reset

using Blosc2_jll: libblosc2

using ChunkCodecCore:
    Codec,
    EncodeOptions,
    DecodeOptions,
    check_in_range,
    check_contiguous,
    DecodingError
import ChunkCodecCore:
    decode_options,
    try_decode!,
    try_encode!,
    encode_bound,
    try_find_decoded_size,
    decoded_size_range

export Blosc2CFrame,
    Blosc2EncodeOptions,
    Blosc2DecodeOptions,
    Blosc2DecodingError

if VERSION >= v"1.11.0-DEV.469"
    eval(Meta.parse("public is_compressor_valid, compcode, compname"))
end

# reexport ChunkCodecCore
using ChunkCodecCore: ChunkCodecCore, encode, decode
export ChunkCodecCore, encode, decode

include("libblosc2.jl")

"""
    struct Blosc2CFrame <: Codec
    Blosc2CFrame()

Blosc2 compression using c-blosc2 library: https://github.com/Blosc/c-blosc2

Decoding does not accept any extra data appended to the compressed block.
Decoding also does not accept truncated data, or multiple compressed blocks concatenated together.

[`Blosc2EncodeOptions`](@ref) and [`Blosc2DecodeOptions`](@ref)
can be used to set encoding and decoding options.
"""
struct Blosc2CFrame <: Codec end
decode_options(::Blosc2CFrame) = Blosc2DecodeOptions()

include("encode.jl")
include("decode.jl")

end # module ChunkCodecLibBlosc2
123 changes: 123 additions & 0 deletions LibBlosc2/src/decode.jl
@@ -0,0 +1,123 @@
"""
    Blosc2DecodingError(code)

Error for data that cannot be decoded.
"""
struct Blosc2DecodingError <: DecodingError
    code::Cint
end

function Base.showerror(io::IO, err::Blosc2DecodingError)
    print(io, "Blosc2DecodingError: blosc2 compressed buffer cannot be decoded, error code: $(err.code)")
    return nothing
end

"""
    struct Blosc2DecodeOptions <: DecodeOptions
    Blosc2DecodeOptions(; kwargs...)

Blosc2 decompression using c-blosc2 library: https://github.com/Blosc/c-blosc2

# Keyword Arguments

- `codec::Blosc2CFrame = Blosc2CFrame()`
- `nthreads::Integer = 1`: The number of threads to use
"""
struct Blosc2DecodeOptions <: DecodeOptions
    codec::Blosc2CFrame

    nthreads::Int
end
function Blosc2DecodeOptions(; codec::Blosc2CFrame=Blosc2CFrame(),
                             nthreads::Integer=1,
                             kwargs...)
    _nthreads = nthreads
    check_in_range(1:typemax(Int32); nthreads=_nthreads)

    return Blosc2DecodeOptions(codec, _nthreads)
end

function try_find_decoded_size(::Blosc2DecodeOptions, src::AbstractVector{UInt8})::Int64
    check_contiguous(src)

    blosc2_init()

    copy_cframe = false
    schunk = @ccall libblosc2.blosc2_schunk_from_buffer(src::Ptr{UInt8}, length(src)::Int64, copy_cframe::UInt8)::Ptr{Blosc2SChunk}
    if schunk == C_NULL
        # This is not valid blosc2-encoded data
        throw(Blosc2DecodingError(0))
    end
    @ccall libblosc2.blosc2_schunk_avoid_cframe_free(schunk::Ptr{Blosc2SChunk}, true::UInt8)::Cvoid

    total_nbytes = unsafe_load(schunk).nbytes

    success = @ccall libblosc2.blosc2_schunk_free(schunk::Ptr{Cvoid})::Cint
    if success != 0
        # Something went wrong
        throw(Blosc2DecodingError(0))
        # Codecov: added line #L58 (the `throw` above) was not covered by tests.
    end

    return total_nbytes::Int64
end

# Note: We should implement `try_resize_decode!`
Review comment (Member):

Since `try_find_decoded_size` never returns `nothing`, there is no good reason to implement `try_resize_decode!`; the fallback is fine.
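
The reviewer's reasoning can be illustrated with a minimal, self-contained sketch. It assumes a generic fallback of the shape "find the decoded size, grow the destination, then decode"; the actual fallback in `ChunkCodecCore` may differ in detail. Every name here (`ToyOptions`, `toy_find_decoded_size`, `toy_try_decode!`, `toy_resize_decode!`) and the one-byte-length toy format are hypothetical and are not part of this PR or of `ChunkCodecCore`.

```julia
# Hypothetical stand-ins; none of these names come from ChunkCodecCore.
struct ToyOptions end

# Toy format (an assumption for illustration): the first byte stores the
# payload length, and the remaining bytes are the payload itself.
toy_find_decoded_size(::ToyOptions, src::Vector{UInt8}) = Int64(src[1])

# Decode into `dst`; return `nothing` if `dst` is too small.
function toy_try_decode!(::ToyOptions, dst::Vector{UInt8}, src::Vector{UInt8})
    n = Int64(src[1])
    n > length(dst) && return nothing
    copyto!(dst, 1, src, 2, n)
    return n
end

# Generic fallback: because the size is always known up front, one resize
# followed by one decode call is enough; no retry loop is needed.
function toy_resize_decode!(opts, dst::Vector{UInt8}, src::Vector{UInt8};
                            max_size::Integer=typemax(Int64))
    n = toy_find_decoded_size(opts, src)
    n > max_size && return nothing
    length(dst) < n && resize!(dst, n)
    return toy_try_decode!(opts, dst, src)
end

src = UInt8[0x03, 0x0a, 0x0b, 0x0c]
dst = UInt8[]
n = toy_resize_decode!(ToyOptions(), dst, src)
```

A codec-specific `try_resize_decode!` mainly pays off when the decoded size cannot be determined in advance, which is not the case for the code in this PR.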


function try_decode!(d::Blosc2DecodeOptions, dst::AbstractVector{UInt8}, src::AbstractVector{UInt8};
                     kwargs...)::Union{Nothing,Int64}
    check_contiguous(dst)
    check_contiguous(src)

    blosc2_init()

    # I don't think there is a way to specify a decompression context.
    # That means that our `Blosc2DecodeOptions` will be unused.
    # We could try writing to the `dctx` field in the `schunk`.

    copy_cframe = false
    schunk = @ccall libblosc2.blosc2_schunk_from_buffer(src::Ptr{UInt8}, length(src)::Int64, copy_cframe::UInt8)::Ptr{Blosc2SChunk}
    if schunk == C_NULL
        # This is not valid blosc2-encoded data
        throw(Blosc2DecodingError(0))
        # Codecov: added line #L81 (the `throw` above) was not covered by tests.
    end
    @ccall libblosc2.blosc2_schunk_avoid_cframe_free(schunk::Ptr{Blosc2SChunk}, true::UInt8)::Cvoid

    total_nbytes = unsafe_load(schunk).nbytes
    if total_nbytes > length(dst)
        # There is not enough space to decode the data
        success = @ccall libblosc2.blosc2_schunk_free(schunk::Ptr{Cvoid})::Cint
        if success != 0
            # Something went wrong
            throw(Blosc2DecodingError(0))
            # Codecov: added line #L91 (the `throw` above) was not covered by tests.
        end

        return nothing
    end

    dst_position = Int64(0)

    nchunks = unsafe_load(schunk).nchunks
    for nchunk in 0:(nchunks - 1)
        nbytes_left = clamp(total_nbytes - dst_position, Int32)
        nbytes = @ccall libblosc2.blosc2_schunk_decompress_chunk(schunk::Ptr{Blosc2SChunk}, nchunk::Int64,
                                                                 pointer(dst, dst_position+1)::Ptr{Cvoid}, nbytes_left::Int32)::Cint
        if nbytes <= 0
            # There was an error decompressing the data
            throw(Blosc2DecodingError(nbytes))
            # Codecov: added line #L106 (the `throw` above) was not covered by tests.
        end

        dst_position += nbytes
    end
    if dst_position != total_nbytes
        # The decompressed size is inconsistent
        throw(Blosc2DecodingError(0))
        # Codecov: added line #L113 (the `throw` above) was not covered by tests.
    end

    success = @ccall libblosc2.blosc2_schunk_free(schunk::Ptr{Cvoid})::Cint
    if success != 0
        # Something went wrong
        throw(Blosc2DecodingError(0))
        # Codecov: added line #L119 (the `throw` above) was not covered by tests.
    end

    return total_nbytes::Int64
end
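
The chunk loop in `try_decode!` above follows a common pattern: check the destination size first, decode chunk by chunk while tracking a write offset, cap the remaining space at `Int32`, and verify the total at the end. The sketch below reproduces only that bookkeeping in plain Julia, with no call into c-blosc2. The helpers `toy_decompress_chunk!` and `toy_decode_chunks!` are hypothetical, and "decompression" is simulated by a plain copy, so this is an illustration of the control flow rather than the PR's implementation.

```julia
# Hypothetical helper: "decompresses" one chunk by copying it into `dst`
# starting after `offset`. Returns the number of bytes written, or a
# negative code if the chunk does not fit in the remaining space.
function toy_decompress_chunk!(dst::Vector{UInt8}, offset::Int64,
                               chunk::Vector{UInt8}, nbytes_left::Int32)
    length(chunk) > nbytes_left && return Int32(-1)
    copyto!(dst, offset + 1, chunk, 1, length(chunk))
    return Int32(length(chunk))
end

# Mirrors the bookkeeping of the loop in `try_decode!`.
function toy_decode_chunks!(dst::Vector{UInt8}, chunks::Vector{Vector{UInt8}},
                            total_nbytes::Int64)
    # Not enough space: signal this with `nothing` instead of throwing.
    total_nbytes > length(dst) && return nothing

    dst_position = Int64(0)
    for chunk in chunks
        # The remaining space is passed as an Int32, so clamp it.
        nbytes_left = clamp(total_nbytes - dst_position, Int32)
        nbytes = toy_decompress_chunk!(dst, dst_position, chunk, nbytes_left)
        nbytes <= 0 && error("chunk failed to decode, code $nbytes")
        dst_position += nbytes
    end

    # The per-chunk sizes must add up to the advertised total.
    dst_position == total_nbytes || error("decoded size is inconsistent")
    return total_nbytes
end

chunks = [UInt8[1, 2], UInt8[3, 4, 5]]
dst = zeros(UInt8, 5)
decoded = toy_decode_chunks!(dst, chunks, Int64(5))
```

Returning `nothing` for a too-small destination while throwing for corrupt input matches the distinction the PR code makes between "retry with a larger buffer" and "the data cannot be decoded".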