Possibly merge RegularChunks and IrregularChunks into a single struct #253

@meggart

I am currently working on improving the performance and type stability of the ConcatDiskArray constructor. One major, unavoidable source of type instability is that we cannot know a priori whether the resulting array will have regular or irregular chunks. There are two ways to work around this:

  1. Always assume the resulting array will have irregular chunks. This would imply (moderate) performance penalties for chunk lookups whenever the resulting concatenated array actually has regular chunks.
  2. Merge RegularChunks and IrregularChunks into a single struct that works like a hand-coded union and might look like this:
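    struct Chunks
        is_regular::Bool                # true for regular chunks, false for irregular ones
        arraysize::Int
        chunksize::Int                  # meaningful only when is_regular is true
        offset::Int                     # meaningful only when is_regular is true
        irregular_offsets::Vector{Int}  # meaningful only when is_regular is false
    end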

Depending on whether we construct a regular or an irregular chunk object along a dimension, the is_regular flag is set, and either the chunksize and offset fields or the irregular_offsets field is filled with meaningful values. All getindex methods would need to check the is_regular flag at runtime, which should be cheap, and then call the respective routine.

As a result, we would be back to a single ChunkVector representation: we could remove the second type parameter from ChunkVector and make the whole chunk machinery inherently type-stable. This would also be interesting in light of the move towards more type-stable IO APIs that are compatible with static compilation, and of experiments like this one: https://github.com/gbaraldi/StaticHDF5.jl

@rafaqz what do you think?
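
To make the proposal concrete, here is a minimal, self-contained sketch of how the runtime check in getindex could look. It is not the DiskArrays.jl implementation: the constructor names regular_chunks and irregular_chunks are hypothetical, and I am assuming that irregular_offsets stores cumulative chunk boundaries starting at 0 and that offset counts the elements missing from the first regular chunk.

```julia
# Hypothetical sketch of the merged struct; field names follow the proposal above.
struct Chunks <: AbstractVector{UnitRange{Int}}
    is_regular::Bool
    arraysize::Int
    chunksize::Int
    offset::Int
    irregular_offsets::Vector{Int}
end

# Hypothetical constructor: regular chunks leave irregular_offsets empty.
regular_chunks(chunksize::Int, offset::Int, arraysize::Int) =
    Chunks(true, arraysize, chunksize, offset, Int[])

# Hypothetical constructor: irregular chunks store cumulative boundaries
# (assumed layout: 0, then the running sum of the chunk sizes).
function irregular_chunks(chunksizes::AbstractVector{<:Integer})
    offsets = Int[0; cumsum(chunksizes)]
    return Chunks(false, last(offsets), 0, 0, offsets)
end

# Number of chunks along the dimension.
function Base.size(c::Chunks)
    n = c.is_regular ? cld(c.arraysize + c.offset, c.chunksize) :
                       length(c.irregular_offsets) - 1
    return (n,)
end

# A single method with a cheap branch on the flag instead of dispatch on two types.
function Base.getindex(c::Chunks, i::Int)
    @boundscheck checkbounds(c, i)
    if c.is_regular
        start = max((i - 1) * c.chunksize + 1 - c.offset, 1)
        stop = min(i * c.chunksize - c.offset, c.arraysize)
        return start:stop
    else
        return (c.irregular_offsets[i] + 1):c.irregular_offsets[i + 1]
    end
end

r = regular_chunks(10, 0, 25)     # chunks of 10 over 25 elements
ir = irregular_chunks([3, 4, 2])  # chunks of sizes 3, 4 and 2
```

Both r and ir have the same concrete type Chunks, so a container holding one such object per dimension would no longer need a type parameter to distinguish the two cases. The cost is one branch per lookup and an empty Vector{Int} carried by every regular chunk object.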
