ericm-db (Contributor) commented Jan 6, 2026

What changes were proposed in this pull request?

This PR adds the sourceIdentifyingName parameter to StreamingRelationV2 and introduces the HasStreamingSourceIdentifyingName trait to provide a uniform interface for streaming source naming.

Key changes:

  • Added HasStreamingSourceIdentifyingName trait in StreamingSourceIdentifyingName.scala
  • Added sourceIdentifyingName parameter to the StreamingRelationV2 case class, with a default value of Unassigned
  • Implemented withSourceIdentifyingName method to support name propagation
  • Updated all pattern matches on StreamingRelationV2 in sql/core to account for the new parameter
  • Updated explain test golden file to include the new parameter in output
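
For readers unfamiliar with the pattern, here is a minimal sketch of the shape these changes take. It uses simplified stand-ins rather than the real Spark classes: `SourceName`, `Named`, `HasSourceName`, and `RelationSketch` are hypothetical names, and only the `Unassigned` default, the `sourceIdentifyingName` parameter, and the `withSourceIdentifyingName` method are taken from the description above.

```scala
// Simplified stand-ins for illustration only; these are NOT the Spark classes.
object SourceNamingSketch {
  // Hypothetical name type: the PR only confirms that an `Unassigned` default exists.
  sealed trait SourceName
  case object Unassigned extends SourceName
  final case class Named(value: String) extends SourceName // hypothetical variant

  // Uniform interface, modelled on the trait the PR describes.
  trait HasSourceName {
    def sourceIdentifyingName: SourceName
    def withSourceIdentifyingName(name: SourceName): HasSourceName
  }

  // Stand-in for a relation node; the new parameter comes last and has a default,
  // so existing constructor calls keep compiling.
  final case class RelationSketch(
      sourceName: String,
      sourceIdentifyingName: SourceName = Unassigned)
    extends HasSourceName {

    // Returns a copy carrying the new name; the original node is unchanged.
    override def withSourceIdentifyingName(name: SourceName): RelationSketch =
      copy(sourceIdentifyingName = name)
  }

  // Extractor patterns are positional, so each one needs a slot for the new field.
  def describe(r: RelationSketch): String = r match {
    case RelationSketch(src, Unassigned) => s"$src (unnamed)"
    case RelationSketch(src, Named(n))   => s"$src as $n"
  }
}
```

The default value keeps construction sites source-compatible, but positional patterns such as `case RelationSketch(src)` stop compiling once a field is added, which is why the pattern matches had to be updated alongside the case class.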

Why are the changes needed?

This change lays the groundwork for streaming source evolution by enabling sources to have stable identifying names. The parameter will be used by analyzer rules to assign and propagate names from DataFrame API calls, allowing sources to maintain their checkpoint state even when sources are added, removed, or reordered in a query.
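
To illustrate why a stable name helps, the toy model below contrasts restoring per-source offsets by position with restoring them by name. This is an assumption-laden sketch, not Spark's checkpoint format: `CheckpointKeySketch`, `restoreByPosition`, and `restoreByName` are hypothetical, and starting new sources at offset 0 is an arbitrary choice for the example.

```scala
// Illustrative only: a toy model of restoring per-source offsets,
// not Spark's actual checkpoint layout.
object CheckpointKeySketch {
  // Restore by position: the i-th saved offset goes to the i-th source in the new query.
  def restoreByPosition(saved: Seq[Long], sources: Seq[String]): Map[String, Long] =
    sources.zip(saved).toMap

  // Restore by name: each source looks up its own saved offset;
  // sources with no saved state start at 0 (an assumption for this sketch).
  def restoreByName(saved: Map[String, Long], sources: Seq[String]): Map[String, Long] =
    sources.map(s => s -> saved.getOrElse(s, 0L)).toMap
}
```

If a query that originally read `orders` then `clicks` is rewritten to read `clicks` then `orders`, the positional restore hands each source the other's offset, while the name-keyed restore is unaffected by the reordering and by newly added sources.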

Does this PR introduce any user-facing change?

No. The parameter is added with a default value (Unassigned) that maintains backward compatibility. No user-facing APIs are modified.

How was this patch tested?

  • Updated ProtoToPlan explain test to verify the parameter appears correctly in explain output
  • Verified pattern matches compile correctly with the new parameter
  • All existing tests pass with the new parameter

Was this patch authored or co-authored using generative AI tooling?

No.


github-actions bot commented Jan 6, 2026

JIRA Issue Information

=== Task SPARK-54910 ===
Summary: Add streamingSourceIdentifyingName field to StreamingRelationV2
Assignee: None
Status: Open
Affected: ["4.2.0"]


This comment was automatically generated by GitHub Actions

ericm-db (Contributor, Author) commented Jan 6, 2026

cc @HeartSaVioR Can you PTAL?
