
Conversation


@nagraham nagraham commented Jan 5, 2026

See the related issue: #111

Problem

We have observed compaction run repeatedly, for days at a time, on tables that have not added or removed any data. We would expect the compaction job to produce files that are either larger than our configured max size or fewer than the min_group_file_count, so that subsequent jobs would not run at all. This impacts partitioned tables.

Solution

The ideal approach is to update the strategies in file_selection/strategy.rs to group files by partition, using the partition information in the FileScanTask. Unfortunately, that partition information is not in the forked iceberg-rust (yet); it is a recently added attribute: link to task.rs.

Approaches

  • (SUGGESTED) Option A: Cherry-pick the commit adding partition info into the forked iceberg-rust and then group by partition info from the FileScanTask
    • (+) This is an ideal long term solution because it uses partition information in the FileScanTask struct.
    • (-) One risk is that it may have a lot of complex conflicts to resolve, which could delay getting out a fix
      • UPDATE: I cherry-picked that commit into the rising-wave fork of iceberg-rust, and resolved the conflicts. So this may not be a risk.
  • Option B: Get partition information from data file metadata in Manifest files, and pass a mapping into Strategy.execute(). Compare a FileScanTask's data file path with the path in the map (see the sketch after this list).
    • (+) We can do this now without upgrading iceberg-rust
    • (-) It adds annoying mapping code / complexity
    • (-) Temporary solution. Would move to Option A later once iceberg-rust is up to date.
    • (-) It adds a bit more I/O to get manifest files
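
For illustration, here is a minimal sketch of the Option B lookup, assuming a hypothetical map from data file path to partition value built from manifest metadata (none of these names come from the codebase):

```rust
use std::collections::HashMap;

// Stand-in for however the partition value would be keyed; the real mapping
// would be built by reading the table's manifest files before the strategy runs.
type PartitionKey = String;

fn lookup_partition<'a>(
    path_to_partition: &'a HashMap<String, PartitionKey>,
    data_file_path: &str,
) -> Option<&'a PartitionKey> {
    path_to_partition.get(data_file_path)
}

fn main() {
    let mut path_to_partition: HashMap<String, PartitionKey> = HashMap::new();
    path_to_partition.insert("s3://bucket/p=1/a.parquet".to_string(), "p=1".to_string());

    // The strategy would look up each FileScanTask's data file path in the map.
    assert_eq!(
        lookup_partition(&path_to_partition, "s3://bucket/p=1/a.parquet").map(String::as_str),
        Some("p=1")
    );
}
```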

Implemented Option A

This PR implements Option A, using partition values to group FileScanTasks when the table is partitioned.
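
As a rough illustration of the grouping, here is a minimal sketch; the FileScanTask struct below is a simplified stand-in (the real implementation keys on the partition value carried by iceberg-rust's FileScanTask, not a plain string):

```rust
use std::collections::HashMap;

// Simplified stand-in for iceberg-rust's FileScanTask; only the fields needed
// to illustrate grouping are included, and the partition is modeled as a string.
#[derive(Debug, Clone)]
struct FileScanTask {
    data_file_path: String,
    partition_key: Option<String>,
}

// Group tasks by partition value so thresholds such as min_group_file_count
// and the max target size are evaluated per partition, not across the table.
fn group_by_partition(tasks: Vec<FileScanTask>) -> HashMap<Option<String>, Vec<FileScanTask>> {
    let mut groups: HashMap<Option<String>, Vec<FileScanTask>> = HashMap::new();
    for task in tasks {
        groups.entry(task.partition_key.clone()).or_default().push(task);
    }
    groups
}

fn main() {
    let tasks = vec![
        FileScanTask { data_file_path: "p=1/a.parquet".into(), partition_key: Some("p=1".into()) },
        FileScanTask { data_file_path: "p=2/b.parquet".into(), partition_key: Some("p=2".into()) },
        FileScanTask { data_file_path: "p=1/c.parquet".into(), partition_key: Some("p=1".into()) },
    ];
    for (partition, group) in group_by_partition(tasks) {
        let paths: Vec<&str> = group.iter().map(|t| t.data_file_path.as_str()).collect();
        println!("{:?}: {:?}", partition, paths);
    }
}
```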

Testing

I wrote an initial commit with an integration test asserting that the min_group_file_count filters out a table in which every partition contains fewer files than the min_group_file_count. Without the fix the test fails, reproducing the issue; with the fix it passes.
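
Roughly, the test asserts over the rewrite stats returned by the compaction run; the helper below is hypothetical and only shows the shape of the check, not the actual test code:

```rust
// Hypothetical helper illustrating the assertion: after an initial compaction,
// every partition holds fewer files than min_group_file_count, so a second run
// should select and rewrite nothing.
fn assert_no_recompaction(input_files_count: u64, output_files_count: u64) {
    assert_eq!(
        input_files_count, 0,
        "Compaction should NOT have re-run compaction because the files within \
         each partition are less than the min_group_file_count"
    );
    assert_eq!(output_files_count, 0);
}

fn main() {
    // Without the fix, the stats report 5 input and 5 output files, so the
    // assertion panics (see the failing output below).
    assert_no_recompaction(0, 0);
}
```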

Here is an example of the failing test:

---- integration_tests::test_min_files_in_group_applies_to_partitioned_table stdout ----

thread 'integration_tests::test_min_files_in_group_applies_to_partitioned_table' panicked at integration-tests/src/integration_tests.rs:460:5:
Compaction should NOT have re-run compaction because the files within each partition are less than the min_group_file_count; stats: RewriteFilesStat { input_files_count: 5, output_files_count: 5, input_total_bytes: 48312, output_total_bytes: 48312, input_data_file_count: 5, input_position_delete_file_count: 0, input_equality_delete_file_count: 0, input_data_file_total_bytes: 48312, input_position_delete_file_total_bytes: 0, input_equality_delete_file_total_bytes: 0 }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    integration_tests::test_min_files_in_group_applies_to_partitioned_table

test result: FAILED. 2 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 4.66s

Bonus: This change also updates the integration test library to run on Mac and Windows machines.

@nagraham nagraham marked this pull request as ready for review January 7, 2026 17:51

Li0k commented Jan 7, 2026

Thanks for the PR, I will review it ASAP

max_concurrent_closes: self
.max_concurrent_closes
.unwrap_or(DEFAULT_MAX_CONCURRENT_CLOSES),
partition_key: self.partition_key,

@nagraham nagraham Jan 8, 2026


We discovered this triggers a panic deeper in iceberg-rust. The root cause is that self.partition_key is set to None. However, the RecordBatchPartitionSplitter provides a partition_key when build() is invoked, and we should use that partition_key instead. By passing None, we write invalid partition data. I suspect the first instance works because the initial writer has the correct partition_key.

This triggered a panic in construct_partition_summaries due to the iterators not having the same length. Long term, iceberg-rust should probably return a better error rather than panic. But perhaps panicking is better than writing corrupted data.

Truncated stack trace:

itertools: .zip_eq() reached end of one iterator before the other
iceberg::spec::manifest::writer::ManifestWriter::construct_partition_summaries
iceberg::spec::manifest::writer::ManifestWriter::write_manifest_file
iceberg::transaction::snapshot::SnapshotProducer::write_added_manifest
iceberg::transaction::rewrite_files::RewriteFilesAction::commit
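
For context, here is a minimal, hypothetical sketch of the intent of the fix; the names and types below are illustrative stand-ins, not the real builder or RecordBatchPartitionSplitter API:

```rust
// Stand-in writer configuration; the real partition key type lives in iceberg-rust.
struct WriterConfig {
    partition_key: Option<String>,
}

// Prefer the partition_key supplied at build() time (by the splitter) so that
// roll-over writers carry the same partition value as the first writer, instead
// of defaulting to the builder's configured value of None.
fn build_writer_config(
    builder_partition_key: Option<String>,
    splitter_partition_key: Option<String>,
) -> WriterConfig {
    WriterConfig {
        partition_key: splitter_partition_key.or(builder_partition_key),
    }
}

fn main() {
    // With the old behavior (always None), later writers produced data files with
    // no partition value, desynchronizing the partition summaries and triggering
    // the zip_eq panic shown in the stack trace above.
    let cfg = build_writer_config(None, Some("p=1".to_string()));
    assert_eq!(cfg.partition_key.as_deref(), Some("p=1"));
}
```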

I wrote a second integration test and verified that it triggers a panic without this modification. It also demonstrates the conditions that trigger the error:

  1. The table must be partitioned
  2. At least one partition must have a group of input files whose combined size is larger than the target_size, which triggers a "roll over" to a new output file
