Description
I am currently working on improving the performance and type-stability of the `ConcatDiskArray` constructor, and one major unavoidable source of type instability is that we cannot know a priori whether the resulting array will have regular or irregular chunks. There would be two ways to work around this:
- always assume the resulting array will have irregular chunks; this would imply (moderate) performance penalties for chunk lookups whenever the resulting concat array actually has regular chunks
- merge `RegularChunks` and `IrregularChunks` into a single struct which works like a hand-coded union and might look like this:
```julia
struct Chunks
    is_regular::Bool
    arraysize::Int
    chunksize::Int
    offset::Int
    irregular_offsets::Vector{Int}
end
```
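As a rough sketch of how lookups on such a struct could branch on the flag: the helper names `regular_chunks`, `irregular_chunks`, `nchunks` and `chunkrange` below are hypothetical and not part of DiskArrays.jl. The sketch also assumes that `offset` counts the elements missing from the first regular chunk and that `irregular_offsets` holds cumulative chunk boundaries starting at 0.

```julia
# Hypothetical unified chunk description; fields as proposed in this issue.
struct Chunks
    is_regular::Bool
    arraysize::Int
    chunksize::Int
    offset::Int
    irregular_offsets::Vector{Int}
end

# Hypothetical constructors: the fields that do not apply get placeholder values.
regular_chunks(chunksize::Int, offset::Int, arraysize::Int) =
    Chunks(true, arraysize, chunksize, offset, Int[])

function irregular_chunks(chunksizes::Vector{Int})
    # Assumed layout: cumulative boundaries, e.g. sizes [4, 6, 2] -> [0, 4, 10, 12]
    offsets = pushfirst!(cumsum(chunksizes), 0)
    return Chunks(false, last(offsets), 0, 0, offsets)
end

# Number of chunks along the dimension.
function nchunks(c::Chunks)
    if c.is_regular
        return cld(c.arraysize + c.offset, c.chunksize)
    else
        return length(c.irregular_offsets) - 1
    end
end

# Index range covered by chunk `i`, selected by a runtime check of the flag.
function chunkrange(c::Chunks, i::Int)
    1 <= i <= nchunks(c) || throw(ArgumentError("chunk index out of range"))
    if c.is_regular
        lo = max((i - 1) * c.chunksize + 1 - c.offset, 1)
        hi = min(i * c.chunksize - c.offset, c.arraysize)
        return lo:hi
    else
        return (c.irregular_offsets[i] + 1):c.irregular_offsets[i + 1]
    end
end

reg = regular_chunks(10, 3, 25)
irr = irregular_chunks([4, 6, 2])
chunkrange(reg, 1)  # 1:7
chunkrange(irr, 2)  # 5:10
```

Both branches return a `UnitRange{Int}`, so the concrete type of `reg` and `irr` is the same `Chunks` and the result type of a lookup does not depend on the flag.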
Depending on whether we construct a regular or irregular chunk object along a dimension, the `is_regular` flag is set and either the `chunksize` and `offset` fields or the `irregular_offsets` field are filled with meaningful values. All `getindex` methods would need to dynamically check the `is_regular` flag, which should be cheap, and then call the respective routines. As a result, we would be back to a single `ChunkVector` representation, we could remove the second type parameter from `ChunkVector`, and the whole chunk machinery would be inherently type-stable. This would also be interesting in light of the movement towards more type-stable IO APIs compatible with static compilation, and of experiments like this one: https://github.com/gbaraldi/StaticHDF5.jl

@rafaqz what do you think?