Support bucket transform partitioning #670

tlegrave · 2026-01-06T08:20:23Z

Hello there 🦆

DuckLake looks really amazing, and I'm looking forward to migrating our Iceberg stack to it.

However, we make intensive use of bucket transform partitioning in Iceberg, and I was wondering whether this is something you’re planning to support in DuckLake as well.

Our use case is essentially to partition ~6M elements identified by a business ID (with thousands of rows per ID). We need to batch-access these elements, but there’s no particular business logic involved. We therefore use 200 buckets in Iceberg, which lets us target a specific bucket when retrieving multiple elements.

A possible workaround would be to add a bucket column to the data (hash(business_id) % number_of_buckets) and use an IdentityTransform on it. However, this feels a bit hacky and hard to maintain, especially if the number of buckets needs to change.
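To make the workaround concrete, here is a minimal Python sketch of how such a bucket column could be computed and used to group IDs for batched reads. The names `bucket_for`, `group_by_bucket`, and `NUM_BUCKETS` are hypothetical, and the SHA-256 based hash is an assumption chosen only because it is stable across runs; it is not the hash Iceberg's bucket transform uses, so the bucket numbers would not match existing Iceberg buckets.

```python
import hashlib
from collections import defaultdict

NUM_BUCKETS = 200  # the bucket count from the use case above


def bucket_for(business_id: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a business ID to a stable bucket number in [0, num_buckets)."""
    # Python's built-in hash() is salted per process for strings, so a
    # digest is used to get the same bucket on every run and every machine.
    digest = hashlib.sha256(business_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def group_by_bucket(business_ids, num_buckets: int = NUM_BUCKETS) -> dict:
    """Group IDs by bucket so each bucket can be fetched in one batch."""
    groups = defaultdict(list)
    for business_id in business_ids:
        groups[bucket_for(business_id, num_buckets)].append(business_id)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    for bucket, ids in sorted(group_by_bucket(["id-1", "id-2", "id-3"]).items()):
        print(bucket, ids)
```

Each row would store `bucket_for(business_id)` in the bucket column at write time, and a batched read would filter on both the bucket value and the IDs. The maintenance concern shows up directly here: changing `NUM_BUCKETS` changes the bucket of most IDs, so the stored column would have to be recomputed for existing rows.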

Thank you for the help!

Replies: 0 comments
