Skip to content

Conversation

timsaucer
Copy link
Member

Which issue does this PR close?

None

Rationale for this change

Right now we have some minor confusion when importing from the udf module because at the root of datafusion we also have a udf method. This is a minor inconvenience, but it has shown up a few times where a user wants to import one or the other. This removes the uncertainty by changing the udf module to user_defined. Also since this module includes scalar, aggregate, and window functions it is a cleaner organization.

What changes are included in this PR?

  • Move the datafusion.udf module to datafusion.user_defined. This is a breaking change. Fortunately most historical users have been using the helper functions which exist in the root module.
  • Correct documentation build errors. Specifically there were examples in markdown instead of rst format.
  • Update the text of the UDF documentation to be more consistent with other portions of the code base.

Are there any user-facing changes?

Users will need to update their module imports from datafusion.udf to datafusion.user_defined. The datafusion.udf function is NOT impacted.

… rest of the site and to correct some errors when building the rst files using autoapi
@timsaucer
Copy link
Member Author

@crystalxyz FYI this does touch on the documentation you wrote. Please let me know if you think I should change back the formatting. I did need to switch from markdown to rst format for our documentation build system.

@crystalxyz
Copy link
Contributor

@timsaucer The documentation changes look good to me! Do you think that adding a comment in python/datafusion/user_defined to explain the renaming would be helpful? People reading this file might get confused.

@timsaucer timsaucer mentioned this pull request Apr 27, 2025
4 tasks
@timsaucer
Copy link
Member Author

timsaucer commented May 16, 2025

@mesejo @kosiew @chenkovsky @kevinjqliu Would any of you mind doing a review? We haven't released datafusion-python in a while and this is one of the PRs waiting to go in. 1/3

@kosiew
Copy link
Contributor

kosiew commented May 16, 2025

Looks good to me.

Considering there is a breaking change, I think it would be a good idea to also add a follow up task to add a migration guide in the ../dev/changelog/.. for the new release.

Copy link
Contributor

@kevinjqliu kevinjqliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

I want to double check that this is fine since this is the udf function
https://grep.app/search?q=from+datafusion+import+udf

@timsaucer timsaucer merged commit e8aa671 into apache:main May 17, 2025
17 checks passed
@timsaucer timsaucer deleted the feat/udf-organization branch May 17, 2025 20:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants