refactor: ExtensionArray from DataArray to logical array backed by Series #6408

@universalmind303

Is your feature request related to a problem?

After the arrow-rs migration, ExtensionArray is defined as DataArray<ExtensionType>, which treats it as a physical type. However, arrow-rs has no DataType::Extension: extension types are a logical concept (metadata over a storage type). The current representation doesn't reflect this.

Describe the solution you'd like

Refactor ExtensionArray from DataArray<ExtensionType> to a custom struct backed by Series:

pub struct ExtensionArray {
    pub field: Arc<Field>,
    pub physical: Series, // the storage-typed data
}

Operations delegate to the inner Series, which handles runtime dispatch to the correct physical type. ExtensionType becomes non-physical (removed from DaftPhysicalType/DaftArrowBackedType), and DataType::Extension is added to is_logical().
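To illustrate the delegation, here is a minimal, self-contained sketch. The `DataType`, `Field`, and `Series` types below are toy stand-ins invented for this example, not Daft's real definitions, and the `new`, `len`, and `slice` methods are hypothetical. Only the two-field shape of `ExtensionArray` comes from the proposal above.

```rust
use std::sync::Arc;

// Toy stand-ins (assumptions for illustration, not Daft's actual types).
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int64,
    Utf8,
    /// Extension name plus the storage type it wraps.
    Extension(String, Box<DataType>),
}

#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub dtype: DataType,
}

/// A type-erased column: the physical variant is only known at runtime.
#[derive(Debug, Clone, PartialEq)]
pub enum Series {
    Int64(Vec<i64>),
    Utf8(Vec<String>),
}

impl Series {
    pub fn data_type(&self) -> DataType {
        match self {
            Series::Int64(_) => DataType::Int64,
            Series::Utf8(_) => DataType::Utf8,
        }
    }

    pub fn len(&self) -> usize {
        match self {
            Series::Int64(v) => v.len(),
            Series::Utf8(v) => v.len(),
        }
    }

    /// Panics if `start..end` is out of bounds, like slice indexing.
    pub fn slice(&self, start: usize, end: usize) -> Series {
        match self {
            Series::Int64(v) => Series::Int64(v[start..end].to_vec()),
            Series::Utf8(v) => Series::Utf8(v[start..end].to_vec()),
        }
    }
}

/// Same shape as the proposed struct: logical field + storage-typed data.
#[derive(Debug, Clone)]
pub struct ExtensionArray {
    pub field: Arc<Field>,
    pub physical: Series,
}

impl ExtensionArray {
    /// Hypothetical constructor: checks that the field is an extension type
    /// whose storage type matches the Series actually supplied.
    pub fn new(field: Arc<Field>, physical: Series) -> Result<Self, String> {
        let storage_matches = matches!(
            &field.dtype,
            DataType::Extension(_, storage) if **storage == physical.data_type()
        );
        if !storage_matches {
            return Err(format!(
                "field {:?} does not describe an extension over {:?}",
                field.dtype,
                physical.data_type()
            ));
        }
        Ok(Self { field, physical })
    }

    // Operations forward to the inner Series, which dispatches at runtime.
    pub fn len(&self) -> usize {
        self.physical.len()
    }

    pub fn slice(&self, start: usize, end: usize) -> Self {
        Self {
            field: self.field.clone(),
            physical: self.physical.slice(start, end),
        }
    }
}
```

In this sketch the extension metadata travels unchanged in `field`, while every data operation is a one-line forward to `physical`, so `ExtensionArray` itself never needs to know the storage type.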

Describe alternatives you've considered

I considered using LogicalArrayImpl<ExtensionType> with a fixed physical type. However, an extension's storage type varies per instance (it could be Int64, Utf8, FixedSizeList, etc.), which breaks the requirement that PhysicalType be fixed at compile time.
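The constraint can be sketched roughly as follows. This is a simplified model, not Daft's actual trait hierarchy: the `LogicalType` trait, the toy array types, and this `LogicalArrayImpl` definition are all assumptions made for illustration.

```rust
use std::marker::PhantomData;

// Toy physical array types (hypothetical stand-ins).
pub struct Int32Array(pub Vec<i32>);
pub struct Int64Array(pub Vec<i64>);

/// Hypothetical trait: each logical type names exactly ONE physical type,
/// and that choice is resolved at compile time.
pub trait LogicalType {
    type Physical;
}

pub struct DateType;
impl LogicalType for DateType {
    type Physical = Int32Array;
}

pub struct TimestampType;
impl LogicalType for TimestampType {
    type Physical = Int64Array;
}

/// Generic wrapper whose storage type is fixed by the logical type parameter.
pub struct LogicalArrayImpl<L: LogicalType> {
    pub physical: L::Physical,
    _logical: PhantomData<L>,
}

impl<L: LogicalType> LogicalArrayImpl<L> {
    pub fn new(physical: L::Physical) -> Self {
        Self {
            physical,
            _logical: PhantomData,
        }
    }
}

// An extension type has no single valid answer for the associated type:
//
// pub struct ExtensionType;
// impl LogicalType for ExtensionType {
//     type Physical = ???; // Int64 for one instance, Utf8 for another
// }
```

Under this model, Date and Timestamp fit because each always maps to one storage type, whereas an extension's storage is only known per instance at runtime, which is what motivates backing it with a type-erased Series.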

Additional Context

Prerequisite work for the public ExtensionType API (#6396). Casting to/from extension types is already implemented on this branch.
