[SPARK-53799][SPARK-53800][SS][DSV2] Enable column pruning and predicate pushdown in DSV2 streaming #52516

gengliangwang · 2025-10-03T23:22:30Z

What changes were proposed in this pull request?

Enable column pruning and predicate pushdown for DSv2 streaming sources.
The pushdown happens during analysis rather than in the optimizer, because streaming execution needs an actual V2 Scan early in order to materialize a SparkDataStream via Scan.toMicroBatchStream or Scan.toContinuousStream.
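
The ordering described above (push filters and prune columns into a builder, build the scan, then derive the stream from the scan) can be sketched with a small toy model. This is only an illustration under stated assumptions: `ToyScanBuilder`, `ToyScan`, `ToyMicroBatchStream`, `pushFilter`, `pruneColumns` and `readBatch` are hypothetical stand-ins invented for this sketch, not Spark's connector interfaces and not the code in this PR. Only the method name `toMicroBatchStream` is borrowed from the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Toy model only: every class here is a hypothetical stand-in, NOT Spark's connector API.
final class PushdownSketch {

    /** Collects pushed-down filters and required columns before any scan exists. */
    static final class ToyScanBuilder {
        private final List<Map<String, Object>> source;
        private Predicate<Map<String, Object>> filter = r -> true;
        private List<String> columns = null; // null means "read every column"

        ToyScanBuilder(List<Map<String, Object>> source) { this.source = source; }

        ToyScanBuilder pushFilter(Predicate<Map<String, Object>> p) {
            filter = filter.and(p); // successive filters are ANDed together
            return this;
        }

        ToyScanBuilder pruneColumns(List<String> required) {
            columns = new ArrayList<>(required);
            return this;
        }

        ToyScan build() { return new ToyScan(source, filter, columns); }
    }

    /** The pushdown decisions are fixed once the scan is built. */
    static final class ToyScan {
        private final List<Map<String, Object>> source;
        private final Predicate<Map<String, Object>> filter;
        private final List<String> columns;

        ToyScan(List<Map<String, Object>> source,
                Predicate<Map<String, Object>> filter,
                List<String> columns) {
            this.source = source;
            this.filter = filter;
            this.columns = columns;
        }

        // The stream is derived from the scan, so the scan must exist first.
        ToyMicroBatchStream toMicroBatchStream() { return new ToyMicroBatchStream(this); }
    }

    /** Reads the offset range [start, end), applying the scan's filter and projection. */
    static final class ToyMicroBatchStream {
        private final ToyScan scan;

        ToyMicroBatchStream(ToyScan scan) { this.scan = scan; }

        List<Map<String, Object>> readBatch(int start, int end) {
            List<Map<String, Object>> out = new ArrayList<>();
            int stop = Math.min(end, scan.source.size());
            for (int i = Math.max(start, 0); i < stop; i++) {
                Map<String, Object> row = scan.source.get(i);
                if (!scan.filter.test(row)) continue;      // filter applied at the source
                if (scan.columns == null) { out.add(row); continue; }
                Map<String, Object> projected = new LinkedHashMap<>();
                for (String c : scan.columns) projected.put(c, row.get(c)); // keep needed columns only
                out.add(projected);
            }
            return out;
        }
    }

    static Map<String, Object> row(int id, String level, String msg) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("id", id);
        r.put("level", level);
        r.put("msg", msg);
        return r;
    }

    static List<Map<String, Object>> run() {
        List<Map<String, Object>> source = new ArrayList<>();
        source.add(row(1, "INFO", "a"));
        source.add(row(2, "ERROR", "b"));
        source.add(row(3, "ERROR", "c"));

        ToyScan scan = new ToyScanBuilder(source)
            .pushFilter(r -> "ERROR".equals(r.get("level")))
            .pruneColumns(Arrays.asList("id"))
            .build();
        return scan.toMicroBatchStream().readBatch(0, source.size());
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [{id=2}, {id=3}]
    }
}
```

In this sketch the filter and the column list have to be known before `build()` is called, which mirrors why the description moves pushdown ahead of the point where the stream is created. How the actual patch wires this into the analyzer is not shown here.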

Why are the changes needed?

Pushing filters and required-column projections into DSv2 readers reduces the data read and the computation done in streaming queries, and brings streaming in line with batch DSv2 capabilities.

Does this PR introduce any user-facing change?

No

How was this patch tested?

New unit tests

Was this patch authored or co-authored using generative AI tooling?

No

gengliangwang · 2025-10-03T23:25:58Z

cc @jerrypeng

support predicate pushdown and column pruning in dsv2 streaming

6f2dc09

github-actions bot added SQL STRUCTURED STREAMING labels Oct 3, 2025

gengliangwang requested review from cloud-fan and aokolnychyi October 3, 2025 23:25
