Closed
Description
Describe the enhancement requested
Right now, the unpack family of functions may extract fewer elements than requested.
This is because it relies on batch extraction, which must process many inputs at once.
Instead, BitReader::GetBatch is responsible for handling the inputs before (prolog) and after (epilog) the call to unpack.
This has two downsides:
- It makes the general parquet C++ logic harder to understand, as related functions are spread apart;
- It makes unpack harder to (re)use, as it does not fully extract all that is needed. In particular, it makes it hard to iterate on these functions because the tests/benchmarks would need to adapt to the number of elements that the function can work with.
The prolog and epilog should be moved to the unpack functions so that one function is fully responsible for unpacking integers without extra complexity.
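To illustrate the proposed shape, here is a minimal sketch of a single entry point that owns the prolog, the batched body, and the epilog. It is not the existing Arrow/Parquet code: `ReadOne`, `UnpackBatch`, `UnpackAll`, and `kBatch` are hypothetical names, the batch kernel is a scalar stand-in for the real optimized one, and LSB-first bit packing is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar helper (not an Arrow API): reads one `bit_width`-bit
// value starting at absolute bit position `bit_pos`, assuming LSB-first packing.
inline uint32_t ReadOne(const uint8_t* in, size_t bit_pos, int bit_width) {
  uint64_t value = 0;
  for (int i = 0; i < bit_width; ++i) {
    const size_t p = bit_pos + static_cast<size_t>(i);
    const uint64_t bit = (in[p / 8] >> (p % 8)) & 1u;
    value |= bit << i;
  }
  return static_cast<uint32_t>(value);
}

// Hypothetical stand-in for an optimized batch kernel: it can only unpack
// exactly kBatch values, and only from a byte-aligned input pointer.
constexpr size_t kBatch = 32;

inline void UnpackBatch(const uint8_t* in, int bit_width, uint32_t* out) {
  for (size_t i = 0; i < kBatch; ++i) {
    out[i] = ReadOne(in, i * static_cast<size_t>(bit_width), bit_width);
  }
}

// Sketch of the proposed contract: unpack exactly `num_values` values,
// whatever the starting bit offset and whatever the count.
inline void UnpackAll(const uint8_t* in, size_t bit_offset, int bit_width,
                      size_t num_values, uint32_t* out) {
  assert(bit_width >= 0 && bit_width <= 32);
  size_t pos = bit_offset;
  size_t done = 0;

  // Prolog: go value by value until the read position is byte-aligned.
  while (done < num_values && pos % 8 != 0) {
    out[done++] = ReadOne(in, pos, bit_width);
    pos += static_cast<size_t>(bit_width);
  }

  // Body: full batches. kBatch * bit_width is a multiple of 8 bits,
  // so the position stays byte-aligned between batches.
  while (num_values - done >= kBatch) {
    UnpackBatch(in + pos / 8, bit_width, out + done);
    done += kBatch;
    pos += kBatch * static_cast<size_t>(bit_width);
  }

  // Epilog: the remaining values that do not fill a whole batch.
  while (done < num_values) {
    out[done++] = ReadOne(in, pos, bit_width);
    pos += static_cast<size_t>(bit_width);
  }
}
```

With this shape, a test or benchmark can request any number of values at any bit offset and compare the output directly, without knowing the batch size of the underlying kernel.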
Component(s)
C++, Parquet