[Feat](udf) Support Python UDF/UDAF/UDTF for Doris #59543
base: master
fba1d68 to 58258db
run buildall
5ca8e8d to 511c8ad
Co-authored-by: yangshijie <[email protected]>
TPC-H: Total hot run time: 32292 ms
TPC-DS: Total hot run time: 172159 ms
This PR references ByteDance's implementation. Co-Author: @WencongLiu
Description
This PR introduces native support for Python User-Defined Functions (UDF), User-Defined Aggregate Functions (UDAF), and User-Defined Table Functions (UDTF) in Doris, enabling users to extend SQL capabilities with custom Python logic for complex data processing scenarios.
Key Features
🚀 Three Function Types
🔧 Production-Grade Architecture
🎯 Deep Integration
Architecture Highlights
Configuration
Add to be.conf:
Technical Highlights
1. Environment Management
2. Process Pool Management
3. Communication Protocol
4. Execution Modes
5. UDAF State Management (Snowflake Style)
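The handler interface named in this section (`__init__`, `accumulate`, `merge`, `finish`, `aggregate_state`) can be illustrated with a minimal sketch. The class name `AvgHandler`, the helper `run_partitioned`, and the exact method signatures are assumptions made for illustration; they follow the general Snowflake-style pattern and are not confirmed to match the signatures Doris expects.

```python
# Hypothetical Snowflake-style UDAF handler: computes the mean of its inputs.
# Names and signatures are illustrative assumptions, not the confirmed Doris API.
class AvgHandler:
    def __init__(self):
        # Start from an empty aggregation state.
        self._sum = 0.0
        self._count = 0

    @property
    def aggregate_state(self):
        # Serializable snapshot of the partial result, passed to merge().
        return (self._sum, self._count)

    def accumulate(self, value):
        # Fold one input row into the state; NULLs (None) are skipped.
        if value is None:
            return
        self._sum += value
        self._count += 1

    def merge(self, other_state):
        # Combine a partial state produced by another handler instance.
        other_sum, other_count = other_state
        self._sum += other_sum
        self._count += other_count

    def finish(self):
        # Produce the final value; None when no rows were accumulated.
        if self._count == 0:
            return None
        return self._sum / self._count


def run_partitioned(handler_cls, partitions):
    # Simulates distributed aggregation: one partial handler per partition,
    # whose states are merged into a final handler.
    final = handler_cls()
    for rows in partitions:
        partial = handler_cls()
        for value in rows:
            partial.accumulate(value)
        final.merge(partial.aggregate_state)
    return final.finish()


result = run_partitioned(AvgHandler, [[1.0, 2.0, None], [3.0]])
print(result)
```

Keeping the state as a plain tuple means a partial result can be shipped between processes without depending on the handler object itself, which is the usual motivation for separating `aggregate_state` from `finish`.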
__init__, accumulate, merge, finish, aggregate_state
Limitations
max_python_process_num setting.
Related Documentation
This PR enables users to leverage the rich Python ecosystem (NumPy, Pandas, scikit-learn, etc.) directly within Doris SQL queries, significantly expanding the platform's data processing capabilities.