Open
Description
Feature Request / Improvement
GetWriteProperties() hardcodes parquet.WithDataPageVersion(parquet.DataPageV2) with no way to override it. This causes issues with consumers that don't fully support DataPage V2 (e.g. Snowflake).
iceberg-java supports configuring this via WriteBuilder.writerVersion(), but iceberg-go has no equivalent.
I checked a few Iceberg library implementations, and iceberg-go is the only one I found that defaults to DataPage V2:
- iceberg-java hardcodes WriterVersion.PARQUET_1_0, which produces V1 pages
- pyiceberg doesn't set data_page_version, so PyArrow defaults to V1
- iceberg-rust doesn't set it, so arrow-rs defaults to V1
I propose that we add a way to configure the DataPage version, similar to iceberg-java's WriteBuilder.writerVersion(). I think we should still keep the current default (V2) for backward compatibility.
I'm happy to discuss the best place to expose this in the API and to submit a PR for this if the approach is approved!
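As a starting point for that discussion, here is a minimal sketch of one possible shape: resolve the version from table properties and fall back to V2 when nothing is set. Everything in it is an assumption for illustration. The property key `write.parquet.data-page-version`, the helper `dataPageVersionFromProps`, and the local `DataPageVersion` type (a stand-in for `parquet.DataPageVersion`, so the snippet compiles without external modules) are hypothetical and do not exist in iceberg-go today.

```go
package main

import (
	"fmt"
	"strings"
)

// DataPageVersion is a local stand-in for parquet.DataPageVersion so this
// sketch has no external dependencies.
type DataPageVersion int

const (
	DataPageV1 DataPageVersion = iota
	DataPageV2
)

// String renders the version for logs and error messages.
func (v DataPageVersion) String() string {
	if v == DataPageV1 {
		return "V1"
	}
	return "V2"
}

// Hypothetical property key, used only for illustration.
const dataPageVersionKey = "write.parquet.data-page-version"

// dataPageVersionFromProps returns the configured data page version.
// It keeps the current behavior (V2) when the property is absent and
// rejects values it does not recognize instead of guessing.
func dataPageVersionFromProps(props map[string]string) (DataPageVersion, error) {
	raw, ok := props[dataPageVersionKey]
	if !ok {
		return DataPageV2, nil // backward-compatible default
	}
	switch strings.ToLower(strings.TrimSpace(raw)) {
	case "1", "v1":
		return DataPageV1, nil
	case "2", "v2":
		return DataPageV2, nil
	default:
		return DataPageV2, fmt.Errorf("invalid %s value %q: want v1 or v2", dataPageVersionKey, raw)
	}
}

func main() {
	// A table that opts in to V1 pages for consumers without full V2 support.
	v, err := dataPageVersionFromProps(map[string]string{dataPageVersionKey: "v1"})
	fmt.Println(v, err)
}
```

In GetWriteProperties(), the resolved value would then replace the hardcoded argument to parquet.WithDataPageVersion(...). A table property is only one option; a writer-level option would work just as well if that fits the existing API better.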