Description
A number of binary formats contain dictionary encoded data with the references as 0-indexed integers. PooledArrays currently can't be used to wrap these because, if I understand correctly, the references are always 1-based. This means that to deserialize a format using PooledArrays you necessarily have to copy all the references.
I suggest we add an `offset::Int` field so this can be handled more generally. The offset would be added to the reference in `getindex`. The most obvious difficulty with implementing this is that zero references currently give undefined values. Perhaps this can be circumvented at compile time with a new type parameter.
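To make the idea concrete, here is a minimal sketch of what I have in mind. It is a standalone illustration, not a patch against PooledArrays: the type name `OffsetPooledVector` and its field layout are hypothetical, and it deliberately ignores the question of how zero references should be treated.

```julia
# Hypothetical wrapper type for illustration only; not part of PooledArrays.
# The references are kept exactly as read from the binary format (e.g. 0-based)
# and shifted by `offset` only at lookup time, so no copy is needed.
struct OffsetPooledVector{T,R<:Integer} <: AbstractVector{T}
    refs::Vector{R}   # raw references as stored in the source format
    pool::Vector{T}   # unique values, indexed 1-based as usual in Julia
    offset::Int       # added to each reference before indexing into the pool
end

Base.size(v::OffsetPooledVector) = size(v.refs)
Base.IndexStyle(::Type{<:OffsetPooledVector}) = IndexLinear()

Base.@propagate_inbounds function Base.getindex(v::OffsetPooledVector, i::Int)
    return v.pool[v.refs[i] + v.offset]
end

# 0-based references, as they might arrive from a dictionary-encoded format
refs = UInt32[0, 2, 1, 0]
pool = ["a", "b", "c"]
v = OffsetPooledVector(refs, pool, 1)
```

Here `v` wraps the original `refs` vector directly, and an offset of 1 maps the 0-based references onto Julia's 1-based pool. Whether the offset belongs in a field or in a type parameter is exactly the open question above.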
Anyway, before I try to implement this, has anyone given this any consideration? How does Arrow (which I seem to remember is 0-based) deal with this?