[spark] Add max_pt function #5088

Merged
JingsongLi merged 2 commits into apache:master from ulysses-you:max_pt
Feb 16, 2025

Contributor

@ulysses-you ulysses-you commented Feb 14, 2025

Purpose

Adds a new Spark function max_pt. It accepts a string literal (the table name) and returns the maximum valid top-level partition value of that table.

  • valid means the partition contains data files
  • top-level means only the value of the first partition column is returned when the table has multiple partition columns

max_pt throws an exception when:

  • the table is not a partitioned table
  • the partitioned table has no partitions
  • none of the partitions contain data files
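
The rules above can be sketched as a small standalone Python model. This is not the code from this PR: the function signature, the in-memory `partitions` mapping, and the error messages are hypothetical, and comparing partition values as plain strings is an assumption, since the description does not say how the maximum is ordered.

```python
from typing import Dict, Sequence, Tuple


def max_pt(partition_keys: Sequence[str],
           partitions: Dict[Tuple[str, ...], int]) -> str:
    """Toy model of the max_pt semantics described above.

    partition_keys: partition column names of the table (empty if unpartitioned).
    partitions: maps each partition's value tuple to its data file count.
    """
    if not partition_keys:
        raise ValueError("table is not a partitioned table")
    if not partitions:
        raise ValueError("partitioned table has no partitions")

    # "valid": keep only partitions that contain data files.
    # "top-level": keep only the value of the first partition column.
    valid = [values[0] for values, file_count in partitions.items()
             if file_count > 0]
    if not valid:
        raise ValueError("none of the partitions contain data files")

    # Assumption: values are compared as strings.
    return max(valid)


# Hypothetical table partitioned by (dt, hh).
example = {
    ("20241231", "23"): 5,
    ("20250101", "00"): 3,
    ("20250101", "01"): 1,
    ("20250102", "00"): 0,  # no data files, so this partition is ignored
}
print(max_pt(["dt", "hh"], example))  # prints 20250101
```

Here the partition with dt=20250102 is skipped because it holds no data files, so the result matches the example value shown in this description.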

This PR registers max_pt in the Spark V2 function catalog as a fake (placeholder) scalar function, then adds a rule that replaces each max_pt call with a literal value during analysis.
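
The replace-with-literal idea can be illustrated with a toy expression tree in Python. This is only a sketch of the general technique, not Spark's analyzer API or the rule added in this PR: `Literal`, `FunctionCall`, `replace_max_pt`, and the `resolve` callback are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Tuple, Union


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class FunctionCall:
    name: str
    args: Tuple["Expr", ...]


Expr = Union[Literal, FunctionCall]


def replace_max_pt(expr: Expr, resolve: Callable[[str], str]) -> Expr:
    """Rewrite every max_pt('<table>') call into a literal.

    resolve is a hypothetical callback that looks up the maximum
    partition value for a table name.
    """
    if isinstance(expr, Literal):
        return expr
    # Rewrite children first so nested calls are handled as well.
    args = tuple(replace_max_pt(arg, resolve) for arg in expr.args)
    if expr.name.lower() == "max_pt":
        # The function only accepts a single string literal argument.
        if len(args) != 1 or not isinstance(args[0], Literal):
            raise ValueError("max_pt expects a single string literal argument")
        return Literal(resolve(args[0].value))
    return FunctionCall(expr.name, args)


# concat('dt=', max_pt('t')) becomes concat('dt=', '20250101')
# when the hypothetical lookup returns '20250101' for table 't'.
tree = FunctionCall("concat", (Literal("dt="), FunctionCall("max_pt", (Literal("t"),))))
rewritten = replace_max_pt(tree, lambda table: "20250101")
```

Because the call is replaced before execution, the placeholder function never has to be evaluated row by row; later planning stages only see a constant.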

Example:

SELECT max_pt('t')
=>
20250101

API and Format

no

Documentation

add sql-functions docs

Contributor

@JingsongLi JingsongLi left a comment

Maybe add a new documentation page: sql-function?

@ulysses-you
Contributor Author

@JingsongLi added docs

Contributor

@JingsongLi JingsongLi left a comment

+1

@JingsongLi JingsongLi merged commit 6f4ad34 into apache:master Feb 16, 2025
12 of 13 checks passed
@ulysses-you ulysses-you deleted the max_pt branch February 17, 2025 01:13
2 participants