Open
Description
Describe the bug
In Parquet, a page's size cannot exceed i32::MAX bytes, because the Thrift page header stores uncompressed_page_size and compressed_page_size as i32.
See: https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L802
This is unlikely to happen in practice, since arrow-rs limits the page size to 1 MiB by default. However, it becomes likely once the batch size and the page size limit are enlarged.
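To make the constraint concrete, here is a minimal sketch of the kind of checked conversion a writer could apply before filling in the page header. The helper name `checked_page_size` is hypothetical and is not part of the arrow-rs or parquet API; it only illustrates that a `usize` byte length must fit into an `i32`.

```rust
/// Hypothetical helper (not an arrow-rs API): convert a page's byte length
/// into the i32 that the Thrift page header can hold, or report an error
/// instead of silently truncating.
fn checked_page_size(len: usize) -> Result<i32, String> {
    i32::try_from(len).map_err(|_| {
        format!(
            "page size {len} does not fit in i32 (max {})",
            i32::MAX
        )
    })
}
```

A 1 MiB page converts cleanly, while a length of `i32::MAX as usize + 1` yields an `Err` rather than a wrapped negative size.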
To Reproduce
Try to write a huge blob to a Parquet file.
Expected behavior
In order of preference: switch to a smaller page boundary > throw an error > leave a bad Parquet page.
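The preference order above could be sketched roughly as follows. This is an illustration under simplifying assumptions, not how the arrow-rs writer is implemented: `plan_pages` is a hypothetical function, it looks only at raw value sizes, and it ignores encoding overhead, headers, and compression. It closes a page early when the next value would push it over the limit, and returns an error only when a single value cannot fit in any page.

```rust
/// Hypothetical sketch: split a run of value sizes (in bytes) into pages so
/// that no page exceeds `max_page_bytes`. Returns the number of values
/// placed in each page.
fn plan_pages(value_sizes: &[usize], max_page_bytes: usize) -> Result<Vec<usize>, String> {
    let mut pages = Vec::new();
    let mut count = 0usize; // values in the page being built
    let mut bytes = 0usize; // bytes in the page being built (always <= max_page_bytes)

    for (i, &size) in value_sizes.iter().enumerate() {
        // A single value larger than the limit cannot be placed anywhere:
        // fail loudly rather than emit a bad page.
        if size > max_page_bytes {
            return Err(format!(
                "value {i} is {size} bytes, above the page limit of {max_page_bytes}"
            ));
        }
        // Preferred path: switch to a smaller boundary by closing the
        // current page before it would overflow. Written as a subtraction
        // so the check itself cannot overflow.
        if size > max_page_bytes - bytes {
            pages.push(count);
            count = 0;
            bytes = 0;
        }
        count += 1;
        bytes += size;
    }
    if count > 0 {
        pages.push(count);
    }
    Ok(pages)
}
```

In a real writer the limit would be at most `i32::MAX`, and the error branch corresponds to the "huge blob" case from the reproduction step.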
Additional context
No