Some libraries such as zlib and zstd support using external dictionaries to improve the compression performance for small files: https://facebook.github.io/zstd/zstd_manual.html#Chapter10
The same dictionary is then required to decode the data.
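For illustration, here is a minimal sketch of the round trip using Python's standard `zlib` module, which exposes preset dictionaries through the `zdict` argument. The dictionary contents, the sample payload, and the `encode`/`decode` helper names are made up for this example and are not part of any existing API.

```python
import zlib

# Hypothetical shared dictionary: byte strings expected to recur in the payloads.
DICTIONARY = b'{"name": "", "dtype": "float64", "shape": [], "chunks": []}'


def encode(data: bytes, zdict: bytes) -> bytes:
    """Compress `data` as a zlib stream primed with the preset dictionary."""
    comp = zlib.compressobj(zdict=zdict)
    return comp.compress(data) + comp.flush()


def decode(blob: bytes, zdict: bytes) -> bytes:
    """Decompress a zlib stream that was written with the same dictionary."""
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


sample = b'{"name": "temperature", "dtype": "float64", "shape": [10], "chunks": [5]}'
blob = encode(sample, DICTIONARY)
assert decode(blob, DICTIONARY) == sample

# Without the dictionary the stream cannot be decoded.
try:
    zlib.decompressobj().decompress(blob)
except zlib.error:
    print("decoding without the dictionary fails")
```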
This can be implemented with the current API by adding a dictionary field to the `Codec` struct, but there are some complications.
- When encoding, the raw dictionary isn't directly usable: it first needs to be digested into the library's internal form. It would be nice to cache the digested dictionary somewhere, because the same dictionary is often used repeatedly.
- Sometimes dictionaries have an associated ID, and the encoded data has a dictionary ID stored in a header. I think the idea is that a decoder could have multiple dictionaries, and then pick one to use based on the ID in the header, though I'm not sure how this would work as part of a larger format like Zarr.
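Both points can be sketched with Python's standard `zlib` module, as a rough illustration rather than a proposed design. In the zlib format (RFC 1950), a stream written with a preset dictionary sets the FDICT flag and stores the Adler-32 checksum of the dictionary in the header, so a decoder can look the dictionary up by that value. The `DictionaryRegistry` class and `read_dict_id` function below are hypothetical names invented for this example. Caching is approximated by keeping a compressor that has already loaded the dictionary and copying it for each message; whether that is actually cheaper than reloading the dictionary depends on the library. zstd has dedicated digested-dictionary objects (`ZSTD_CDict`, `ZSTD_DDict`) for this purpose.

```python
import zlib


def read_dict_id(blob: bytes):
    """Return the dictionary ID from a zlib header, or None if no dictionary is used.

    Per RFC 1950, bit 5 of the second header byte is FDICT; when set, the next
    four bytes hold the big-endian Adler-32 checksum of the dictionary.
    """
    if len(blob) < 6 or not (blob[1] & 0x20):
        return None
    return int.from_bytes(blob[2:6], "big")


class DictionaryRegistry:
    """Hypothetical helper: maps dictionary IDs to dictionaries and caches primed compressors."""

    def __init__(self):
        self._dicts = {}   # dictionary ID -> raw dictionary bytes
        self._primed = {}  # dictionary ID -> compressor that already loaded the dictionary

    def add(self, zdict: bytes) -> int:
        dict_id = zlib.adler32(zdict)
        self._dicts[dict_id] = zdict
        # Load the dictionary once; encode() copies this object instead of reloading.
        self._primed[dict_id] = zlib.compressobj(zdict=zdict)
        return dict_id

    def encode(self, data: bytes, dict_id: int) -> bytes:
        comp = self._primed[dict_id].copy()
        return comp.compress(data) + comp.flush()

    def decode(self, blob: bytes) -> bytes:
        dict_id = read_dict_id(blob)
        if dict_id is None:
            return zlib.decompress(blob)
        decomp = zlib.decompressobj(zdict=self._dicts[dict_id])
        return decomp.decompress(blob) + decomp.flush()


registry = DictionaryRegistry()
id_a = registry.add(b"alpha alpha alpha beta")
id_b = registry.add(b"gamma gamma delta delta")

blob_a = registry.encode(b"alpha beta alpha", id_a)
blob_b = registry.encode(b"delta gamma", id_b)

# The decoder picks the dictionary from the header alone.
assert registry.decode(blob_a) == b"alpha beta alpha"
assert registry.decode(blob_b) == b"delta gamma"
```

This only covers the single-stream case; it does not address where the dictionaries themselves would live in a larger container format.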