LibBlosc2: New codec #54

Open: wants to merge 16 commits into base: main
Changes from 6 commits
5 changes: 5 additions & 0 deletions .github/workflows/CI.yml
@@ -37,6 +37,11 @@ jobs:
- ChunkCodecCore/**
- ChunkCodecTests/**
- LibBlosc/**
LibBlosc2:
- .github/**
- ChunkCodecCore/**
- ChunkCodecTests/**
- LibBlosc2/**
LibBrotli:
- .github/**
- ChunkCodecCore/**
11 changes: 11 additions & 0 deletions LibBlosc2/CHANGELOG.md
@@ -0,0 +1,11 @@
# Release Notes

All notable changes to this package will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

## Unreleased

### Added

- Initial release
21 changes: 21 additions & 0 deletions LibBlosc2/LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 Erik Schnetter

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
18 changes: 18 additions & 0 deletions LibBlosc2/Project.toml
@@ -0,0 +1,18 @@
name = "ChunkCodecLibBlosc2"
uuid = "59b5581c-e2bc-42b3-a6f1-80e88eec7b70"
authors = ["Erik Schnetter <[email protected]>"]
version = "0.1.0"

[deps]
Accessors = "7d9f7c33-5ae7-4f3b-8dc6-eff91059b697"
Blosc2_jll = "d43303dc-dd0e-56c6-b0a8-331f4c8c9bfb"
ChunkCodecCore = "0b6fb165-00bc-4d37-ab8b-79f91016dbe1"

[compat]
Accessors = "0.1.42"
Blosc2_jll = "201.1700.100"
ChunkCodecCore = "0.5.0"
julia = "1.10"

[workspace]
projects = ["test"]
26 changes: 26 additions & 0 deletions LibBlosc2/README.md
@@ -0,0 +1,26 @@
# ChunkCodecLibBlosc2

## Warning: ChunkCodecLibBlosc2 is currently a WIP and its API may drastically change at any time.

This package implements the ChunkCodec interface for the following encoders and decoders
using the c-blosc2 library <https://github.com/Blosc/c-blosc2>:

1. `Blosc2Codec`, `Blosc2EncodeOptions`, `Blosc2DecodeOptions`

## Example

```julia-repl
julia> using ChunkCodecLibBlosc2

julia> data = [0x00, 0x01, 0x02, 0x03];

julia> compressed_data = encode(Blosc2EncodeOptions(), data);

julia> decompressed_data = decode(Blosc2Codec(), compressed_data; max_size=length(data), size_hint=length(data));

julia> data == decompressed_data
true
```

The low level interface is defined in the `ChunkCodecCore` package.

58 changes: 58 additions & 0 deletions LibBlosc2/src/ChunkCodecLibBlosc2.jl
@@ -0,0 +1,58 @@
module ChunkCodecLibBlosc2

using Base.Libc: free
using Base.Threads

using Accessors

using Blosc2_jll: libblosc2

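using ChunkCodecCore:
    Codec,
    EncodeOptions,
    DecodeOptions,
    check_in_range,
    check_contiguous,
    DecodingError
# `is_thread_safe` must be imported so that the method defined in decode.jl
# extends the ChunkCodecCore function instead of creating a new local one.
import ChunkCodecCore:
    decode_options,
    try_decode!,
    try_encode!,
    encode_bound,
    try_find_decoded_size,
    decoded_size_range,
    is_thread_safe

export Blosc2Codec,
    Blosc2EncodeOptions,
    Blosc2DecodeOptions,
    Blosc2DecodingError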

if VERSION >= v"1.11.0-DEV.469"
    eval(Meta.parse("public is_compressor_valid, compcode, compname"))
end

# reexport ChunkCodecCore
using ChunkCodecCore: ChunkCodecCore, encode, decode
export ChunkCodecCore, encode, decode

include("libblosc2.jl")

"""
    struct Blosc2Codec <: Codec
    Blosc2Codec()

Blosc2 compression using the c-blosc2 library: https://github.com/Blosc/c-blosc2

Decoding does not accept any extra data appended to the compressed block.
Decoding also does not accept truncated data, or multiple compressed blocks concatenated together.

[`Blosc2EncodeOptions`](@ref) and [`Blosc2DecodeOptions`](@ref)
can be used to set encoding and decoding options.
"""
struct Blosc2Codec <: Codec end
decode_options(::Blosc2Codec) = Blosc2DecodeOptions()

include("encode.jl")
include("decode.jl")

end # module ChunkCodecLibBlosc2
114 changes: 114 additions & 0 deletions LibBlosc2/src/decode.jl
@@ -0,0 +1,114 @@
"""
    Blosc2DecodingError()

Error for data that cannot be decoded.
"""
struct Blosc2DecodingError <: DecodingError
end

function Base.showerror(io::IO, err::Blosc2DecodingError)
    print(io, "Blosc2DecodingError: blosc2 compressed buffer cannot be decoded")
    return nothing
end

"""
    struct Blosc2DecodeOptions <: DecodeOptions
    Blosc2DecodeOptions(; kwargs...)

Blosc2 decompression using the c-blosc2 library: https://github.com/Blosc/c-blosc2

# Keyword Arguments

- `codec::Blosc2Codec=Blosc2Codec()`
- `nthreads::Integer=1`: The number of threads to use
"""
struct Blosc2DecodeOptions <: DecodeOptions
    codec::Blosc2Codec

    nthreads::Int
end
function Blosc2DecodeOptions(; codec::Blosc2Codec=Blosc2Codec(),
                             nthreads::Integer=1,
                             kwargs...)
    _nthreads = nthreads
    # The thread count must be positive and fit into an `Int32`
    check_in_range(1:typemax(Int32); nthreads=_nthreads)

    return Blosc2DecodeOptions(codec, _nthreads)
end

# This decoder is thread safe: We don't use any of Blosc2's global variables.
is_thread_safe(::Blosc2DecodeOptions) = true

# Codecov (codecov/patch): added line 44 of LibBlosc2/src/decode.jl
# (the `is_thread_safe` definition above) was not covered by tests.

function try_find_decoded_size(::Blosc2DecodeOptions, src::AbstractVector{UInt8})::Int64
    check_contiguous(src)

    blosc2_init()

    # Open the compressed frame in place (no copy of `src` is made)
    copy_cframe = false
    schunk = @ccall libblosc2.blosc2_schunk_from_buffer(src::Ptr{UInt8}, length(src)::Int64, copy_cframe::UInt8)::Ptr{Blosc2SChunk}
    if schunk == C_NULL
        # This is not valid Blosc2-encoded data
        throw(Blosc2DecodingError())
    end
    # `src` is owned by Julia, so the super-chunk must not free it
    @ccall libblosc2.blosc2_schunk_avoid_cframe_free(schunk::Ptr{Blosc2SChunk}, true::UInt8)::Cvoid

    total_nbytes = unsafe_load(schunk).nbytes

    success = @ccall libblosc2.blosc2_schunk_free(schunk::Ptr{Cvoid})::Cint
    @assert success == 0

    return total_nbytes::Int64
end

#TODO: implement `try_resize_decode!`

function try_decode!(d::Blosc2DecodeOptions, dst::AbstractVector{UInt8}, src::AbstractVector{UInt8};
                     kwargs...)::Union{Nothing,Int64}
    check_contiguous(dst)
    check_contiguous(src)

    blosc2_init()

    # There does not appear to be a way to specify a decompression context here.
    # That means that our `Blosc2DecodeOptions` (including `nthreads`) is unused.
    # We could try writing to the `dctx` field in the `schunk`.

    copy_cframe = false
    schunk = @ccall libblosc2.blosc2_schunk_from_buffer(src::Ptr{UInt8}, length(src)::Int64, copy_cframe::UInt8)::Ptr{Blosc2SChunk}
    if schunk == C_NULL
        # This is not valid Blosc2-encoded data
        # Codecov (codecov/patch): the `throw` below (added line 84) was not covered by tests.
        throw(Blosc2DecodingError())
    end
    # `src` is owned by Julia, so the super-chunk must not free it
    @ccall libblosc2.blosc2_schunk_avoid_cframe_free(schunk::Ptr{Blosc2SChunk}, true::UInt8)::Cvoid

    total_nbytes = unsafe_load(schunk).nbytes
    if total_nbytes > length(dst)
        # There is not enough space to decode the data
        success = @ccall libblosc2.blosc2_schunk_free(schunk::Ptr{Cvoid})::Cint
        @assert success == 0

        return nothing
    end

    # Number of bytes already written to `dst`
    dst_position = Int64(0)

    nchunks = unsafe_load(schunk).nchunks
    for nchunk in 0:(nchunks - 1)
        # The C API takes the remaining space as an `Int32`, so clamp it
        nbytes_left = clamp(total_nbytes - dst_position, Int32)
        nbytes = @ccall libblosc2.blosc2_schunk_decompress_chunk(schunk::Ptr{Blosc2SChunk}, nchunk::Int64,
                                                                 pointer(dst, dst_position+1)::Ptr{Cvoid}, nbytes_left::Int32)::Cint
        @assert nbytes > 0

        dst_position += nbytes
    end
    @assert dst_position == total_nbytes

    success = @ccall libblosc2.blosc2_schunk_free(schunk::Ptr{Cvoid})::Cint
    @assert success == 0

    return total_nbytes::Int64
end
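
The chunk loop in `try_decode!` can be looked at in isolation with a minimal pure-Julia sketch. It assumes the chunks are already decoded and held in plain vectors. `decompress_chunk_stub` and `decode_chunks!` are hypothetical names used only for illustration; they are not part of c-blosc2, ChunkCodecCore, or this package. The sketch only mirrors the bookkeeping: check the total size first, then write each chunk at `dst_position + 1` while passing the remaining space clamped to `Int32`.

```julia
# Hypothetical stand-in for `blosc2_schunk_decompress_chunk`: copies one
# already-decoded chunk into `dst` after `offset` bytes. Returns the number
# of bytes written, or -1 if the chunk does not fit into `nbytes_left`.
function decompress_chunk_stub(chunk::Vector{UInt8}, dst::Vector{UInt8},
                               offset::Int64, nbytes_left::Int32)
    n = length(chunk)
    n > nbytes_left && return Int32(-1)
    copyto!(dst, offset + 1, chunk, 1, n)
    return Int32(n)
end

# Sketch of the loop structure only; assumes each chunk is smaller than 2 GiB.
function decode_chunks!(dst::Vector{UInt8}, chunks::Vector{Vector{UInt8}})
    total_nbytes = sum(length, chunks; init=0)
    # Not enough space in `dst`: report this with `nothing`
    total_nbytes > length(dst) && return nothing

    dst_position = Int64(0)
    for chunk in chunks
        # The remaining space is clamped because the stub takes an `Int32`
        nbytes_left = clamp(total_nbytes - dst_position, Int32)
        nbytes = decompress_chunk_stub(chunk, dst, dst_position, nbytes_left)
        nbytes < 0 && error("chunk could not be decoded")
        dst_position += nbytes
    end
    @assert dst_position == total_nbytes
    return Int64(total_nbytes)
end

chunks = [UInt8[1, 2, 3], UInt8[4, 5], UInt8[6]]
dst = zeros(UInt8, 8)
nwritten = decode_chunks!(dst, chunks)
```

Unlike the code in this PR, the sketch raises an error for a failed chunk instead of relying on `@assert`; whether the real decoder should throw `Blosc2DecodingError` there is a design question for the PR, not something the sketch settles.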