Spark: Support variant_get predicate pushdown for file skipping #15385

Open
qlong wants to merge 1 commit into apache:main from qlong:variant-file-skipping-sparkv2filters

qlong commented Feb 20, 2026

This PR adds support for manifest-based file skipping on variant columns by pushing variant_get predicates down to Iceberg.
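To illustrate what manifest-based file skipping means here, the following is a minimal, self-contained sketch and not code from this PR. It assumes each data file records lower and upper bounds per variant path; the `Bounds` and `DataFile` classes, the `$.event.id` path, and the file names are all hypothetical stand-ins for the real manifest metadata.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical model of manifest-based skipping; none of these types are Iceberg classes.
class VariantFileSkippingSketch {

  // Assumed per-path value bounds, as a manifest entry might record them.
  static final class Bounds {
    final long lower;
    final long upper;

    Bounds(long lower, long upper) {
      this.lower = lower;
      this.upper = upper;
    }
  }

  // Assumed per-file metadata: file path plus bounds keyed by variant path.
  static final class DataFile {
    final String path;
    final Map<String, Bounds> boundsByVariantPath;

    DataFile(String path, Map<String, Bounds> boundsByVariantPath) {
      this.path = path;
      this.boundsByVariantPath = boundsByVariantPath;
    }
  }

  // Returns true if the file might hold a row with value > literal at variantPath.
  static boolean mightMatchGreaterThan(DataFile file, String variantPath, long literal) {
    Bounds bounds = file.boundsByVariantPath.get(variantPath);
    if (bounds == null) {
      return true; // no bounds recorded, so the file cannot be skipped safely
    }
    return bounds.upper > literal;
  }

  // Keeps only the files that cannot be ruled out by their bounds.
  static List<String> filesToScan(List<DataFile> files, String variantPath, long literal) {
    List<String> kept = new ArrayList<>();
    for (DataFile file : files) {
      if (mightMatchGreaterThan(file, variantPath, literal)) {
        kept.add(file.path);
      }
    }
    return kept;
  }

  // Illustrative data only.
  static List<DataFile> demoFiles() {
    return Arrays.asList(
        new DataFile("file-a.parquet",
            Collections.singletonMap("$.event.id", new Bounds(1, 50))),
        new DataFile("file-b.parquet",
            Collections.singletonMap("$.event.id", new Bounds(90, 200))),
        new DataFile("file-c.parquet", Collections.<String, Bounds>emptyMap()));
  }

  public static void main(String[] args) {
    // Models a predicate like: variant_get(col, '$.event.id', 'long') > 100
    System.out.println(filesToScan(demoFiles(), "$.event.id", 100L));
  }
}
```

In this sketch `file-a.parquet` is skipped because its upper bound (50) cannot satisfy `> 100`, while a file without recorded bounds is always kept, since skipping must be conservative.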

Changes:

  • SparkV2Filters: Convert variant_get/try_variant_get to Expressions.extract()
  • Spark3Util.describe: Output extract terms as variant_get() for EXPLAIN
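The two changes above can be pictured with a small, self-contained sketch. It is not the PR's implementation: `ExtractTerm` and `convert` are hypothetical stand-ins for the extract term and the filter conversion, and the exact string that `describe` emits is an assumption based only on the statement that extract terms are rendered as variant_get().

```java
import java.util.Locale;

// Hypothetical model of the conversion and its EXPLAIN rendering; not Iceberg or Spark code.
class VariantGetPushdownSketch {

  // Stand-in for an extract term: a variant column, a path into it, and a target type.
  static final class ExtractTerm {
    final String column;
    final String path;
    final String type;

    ExtractTerm(String column, String path, String type) {
      this.column = column;
      this.path = path;
      this.type = type;
    }

    // Assumed rendering for EXPLAIN output; the real format may differ.
    String describe() {
      return "variant_get(" + column + ", '" + path + "', '" + type + "')";
    }
  }

  // Converts a pushed-down function call into an extract term, or returns null
  // when the call is not one this sketch treats as convertible.
  static ExtractTerm convert(String functionName, String column, String path, String type) {
    String name = functionName.toLowerCase(Locale.ROOT);
    boolean supported = name.equals("variant_get") || name.equals("try_variant_get");
    if (!supported || column == null || path == null || type == null) {
      return null; // leave the predicate for Spark to evaluate after the scan
    }
    return new ExtractTerm(column, path, type);
  }

  public static void main(String[] args) {
    ExtractTerm term = convert("try_variant_get", "payload", "$.event.id", "int");
    System.out.println(term.describe());
    System.out.println(convert("upper", "payload", "$.event.id", "int"));
  }
}
```

Returning null for anything unrecognized mirrors the usual pushdown contract: a filter that cannot be converted is simply not pushed down, so correctness never depends on the conversion succeeding.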

Tests:

  • Added unit tests
  • Manual end-to-end testing with spark-sql built with the dependency PRs: verified that variant_get is pushed down to Iceberg for file skipping, and confirmed from the Spark history that files are skipped.

The PR depends on:

  1. Api: Support variant extract and fix manifest bounds byte order #15384: variant bound fix
  2. Spark: Support writing shredded variant in Iceberg-Spark #14297: shredded variant support for Spark @aihuaxu
  3. [SPARK-55617] Add VariantGet to V2ExpressionBuilder for DSv2 filter pushdown spark#54394: Spark side change to add VariantGet to DSv2 filter

This PR can be safely merged once the 1st dependency PR is merged.

- SparkV2Filters: Convert variant_get/try_variant_get to
  Expressions.extract()
- Spark3Util.describe: Output extract terms as variant_get() for EXPLAIN
- Add tests for both

Depends on Spark PR: apache/spark#54394