stevie9868
Contributor

Rationale for this change

In Java, SnapshotProducer can create a stage-only snapshot: the snapshot is created, but no ref is set to point to it.

This is a prerequisite for supporting wap.id in PyIceberg.

Are these changes tested?

Yes, tests are added.

Are there any user-facing changes?

By default, the existing behavior is unchanged.

Comment on lines 121 to 122
branch: str = MAIN_BRANCH,
stage_only: bool = False,
Contributor

I think this makes more sense API-wise: if branch is None, we don't set the ref; if it is not None, we set the ref.

Suggested change
- branch: str = MAIN_BRANCH,
- stage_only: bool = False,
+ branch: Optional[str] = MAIN_BRANCH

@kevinjqliu WDYT?

Contributor

That makes sense to me. Instead of setting stage_only=True, it would be branch=None.
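
For illustration, the semantics discussed in this thread can be sketched as follows. This is a minimal sketch and not the PyIceberg implementation: `plan_commit` and the tuple-based update records are hypothetical stand-ins, and `MAIN_BRANCH` is assumed to be `"main"`, mirroring the constant referenced in the diff.

```python
from typing import List, Optional, Tuple

MAIN_BRANCH = "main"  # assumed value of the constant referenced in the diff


def plan_commit(snapshot_id: int, branch: Optional[str] = MAIN_BRANCH) -> List[Tuple[str, ...]]:
    """Return a hypothetical list of metadata updates for a new snapshot."""
    # The snapshot itself is always added to the table metadata.
    updates: List[Tuple[str, ...]] = [("add-snapshot", str(snapshot_id))]
    # A ref is only pointed at the snapshot when a branch is given;
    # branch=None stages the snapshot without touching any ref.
    if branch is not None:
        updates.append(("set-snapshot-ref", branch, str(snapshot_id)))
    return updates
```

With this shape, `plan_commit(1)` adds the snapshot and sets the main ref, while `plan_commit(1, branch=None)` only adds the snapshot, which is what `stage_only=True` would otherwise have expressed.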

Contributor Author

Makes sense, thanks for taking a look!

Contributor

Looking at it a bit more, I think that's the way forward.

    def update_snapshot(self, snapshot_properties: Dict[str, str] = EMPTY_DICT, branch: Optional[str] = None) -> UpdateSnapshot:
        """Create a new UpdateSnapshot to produce a new snapshot for the table.

        Returns:
            A new UpdateSnapshot
        """
        if branch is None:
            branch = MAIN_BRANCH

We would change that into:

    def update_snapshot(self, snapshot_properties: Dict[str, str] = EMPTY_DICT, branch: Optional[str] = MAIN_BRANCH) -> UpdateSnapshot:
        """Create a new UpdateSnapshot to produce a new snapshot for the table.

        Returns:
            A new UpdateSnapshot
        """

There are a couple more places where we need to change the default to MAIN_BRANCH. Let me know what you think!
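
As a rough sketch of what this default change means for callers (the helper names `target_ref_old` and `target_ref_new` are hypothetical and only isolate the branch handling, with `MAIN_BRANCH` assumed to be `"main"`):

```python
from typing import Optional

MAIN_BRANCH = "main"  # assumed value of the constant used above


def target_ref_old(branch: Optional[str] = None) -> Optional[str]:
    # Current behavior: None is silently mapped to the main branch.
    if branch is None:
        branch = MAIN_BRANCH
    return branch


def target_ref_new(branch: Optional[str] = MAIN_BRANCH) -> Optional[str]:
    # Proposed behavior: None is passed through and means "do not set a ref".
    return branch
```

Callers that omit `branch` get `"main"` in both versions. Only callers that explicitly pass `branch=None` would see a difference: they previously committed to main and would now stage the snapshot.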

Contributor Author

I think this makes sense.
I will take another deeper look to make sure it is backward compatible.

@stevie9868 stevie9868 force-pushed the yingjianw/stageOnlyCommit branch from 9bec165 to 884eca9 Compare August 24, 2025 18:50
@Fokko
Contributor

Fokko commented Aug 25, 2025

@stevie9868 Thanks for resuming the work on this. I did a quick check over the changes, and it looks good to me. Could you resolve the conflicts? For some reason, the CI did not trigger.

@stevie9868 stevie9868 force-pushed the yingjianw/stageOnlyCommit branch 2 times, most recently from dad5a1e to 0250e24 Compare August 26, 2025 05:36
@stevie9868 stevie9868 force-pushed the yingjianw/stageOnlyCommit branch from 8e1a846 to 08dee72 Compare August 26, 2025 05:46
@stevie9868
Contributor Author

@Fokko

Thanks for the quick review, and I have rebased my changes to resolve the conflicts.

@stevie9868 stevie9868 requested review from Fokko and kevinjqliu August 31, 2025 22:37
@@ -807,7 +806,7 @@ def upsert(
case_sensitive=case_sensitive,
)

if branch is not None:
if branch in self.table_metadata.refs:
Contributor

Here we silently ignore the branch if it doesn't exist. I think it would be better to raise a ValueError.

Contributor Author

Ah, I was thinking that for upsert we might also want to support staged commits, so I removed the exception.
Let me know what you think!
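
For illustration only, here is one way staged commits and the suggested ValueError could coexist. This is a sketch under assumptions, not code from the PR: `validate_branch` is a hypothetical helper, and `refs` stands in for the names in `self.table_metadata.refs`.

```python
from typing import Iterable, Optional


def validate_branch(branch: Optional[str], refs: Iterable[str]) -> Optional[str]:
    """Hypothetical helper: allow staging (None) but reject unknown branch names."""
    if branch is None:
        # No ref to resolve: the commit would be staged only.
        return None
    if branch not in set(refs):
        # A named branch that does not exist is reported instead of ignored.
        raise ValueError(f"Branch does not exist: {branch}")
    return branch
```

Under this sketch, `branch=None` keeps the staging path open, while a misspelled or missing branch name fails loudly rather than being skipped.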

Contributor

@Fokko Fokko left a comment

Left one minor comment, but apart from that, this looks good to me! 🙌 Thanks for working on this @stevie9868
