Reuse IcebergPageSourceProvider across splits for a scan in Lakehouse #28507

Open
raunaqmorarka wants to merge 1 commit into trinodb:master from raunaqmorarka:raunaq/lhps
@raunaqmorarka raunaqmorarka commented Mar 3, 2026

Description

This is necessary to reuse equality deletes across the splits of an Iceberg scan on a worker.

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## Lakehouse
* Improved performance and memory usage when [Equality Delete](https://iceberg.apache.org/spec/#equality-delete-files) files are used ({issue}`28507`)

Copilot AI left a comment

Pull request overview

This PR updates the Lakehouse connector’s ConnectorPageSourceProviderFactory to reuse a single Iceberg page source provider instance across splits within the same scan, enabling reuse of equality delete state (and associated performance/memory improvements) during Iceberg reads.

Changes:

  • Wrap createPageSourceProvider() with a stateful ConnectorPageSourceProvider that caches the chosen delegate provider.
  • Reuse the same Iceberg page source provider across multiple createPageSource calls (splits) to allow equality delete reuse.
  • Delegate getMemoryUsage() to the cached provider, returning 0 before any page source is created.
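
The changes listed above can be illustrated with a minimal, self-contained sketch. It does not use the Trino SPI: `PageSourceProvider`, `CachingPageSourceProvider`, and `ToyIcebergProvider` are hypothetical stand-ins invented for illustration, and the actual signatures in the PR may differ. The sketch only shows the general pattern of creating a delegate lazily, reusing it for every split, and reporting zero memory until the delegate exists.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical, simplified provider interface (NOT the Trino SPI).
interface PageSourceProvider {
    String createPageSource(String split);

    long getMemoryUsage();
}

// Stateful wrapper: creates the delegate on first use and reuses it for later splits,
// so any state held by the delegate (e.g. loaded delete data) survives across splits.
final class CachingPageSourceProvider implements PageSourceProvider {
    private final Supplier<PageSourceProvider> delegateFactory;
    private PageSourceProvider delegate; // guarded by "this"

    CachingPageSourceProvider(Supplier<PageSourceProvider> delegateFactory) {
        this.delegateFactory = delegateFactory;
    }

    @Override
    public synchronized String createPageSource(String split) {
        if (delegate == null) {
            delegate = delegateFactory.get();
        }
        return delegate.createPageSource(split);
    }

    @Override
    public synchronized long getMemoryUsage() {
        // Nothing is retained until the first page source has been created.
        return delegate == null ? 0 : delegate.getMemoryUsage();
    }
}

// Toy delegate that pretends to load shared delete state once and retain it.
final class ToyIcebergProvider implements PageSourceProvider {
    static final AtomicInteger INSTANCES = new AtomicInteger();
    private long retainedBytes;

    ToyIcebergProvider() {
        INSTANCES.incrementAndGet();
    }

    @Override
    public String createPageSource(String split) {
        if (retainedBytes == 0) {
            retainedBytes = 1024; // assumed size of the shared state, for illustration only
        }
        return "pageSource(" + split + ")";
    }

    @Override
    public long getMemoryUsage() {
        return retainedBytes;
    }
}

final class ProviderReuseDemo {
    public static void main(String[] args) {
        PageSourceProvider provider = new CachingPageSourceProvider(ToyIcebergProvider::new);
        System.out.println(provider.getMemoryUsage());
        System.out.println(provider.createPageSource("split-1"));
        System.out.println(provider.createPageSource("split-2"));
        System.out.println(ToyIcebergProvider.INSTANCES.get());
    }
}
```

In this sketch both splits go through the same `ToyIcebergProvider` instance, which is the property that would let shared state be loaded once per scan rather than once per split.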


Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

4 participants