Commit dd8b945

udf: add UDTF and improve the org of UDF (#2704)
* add udtf
* embedded&scalar&table
1 parent f0ec620 commit dd8b945

8 files changed

+583
-360
lines changed

docs/en/guides/54-query/03-udf.md

Lines changed: 76 additions & 329 deletions
Large diffs are not rendered by default.
Lines changed: 79 additions & 8 deletions
@@ -1,27 +1,98 @@
11
---
22
title: ALTER FUNCTION
3-
sidebar_position: 2
3+
sidebar_position: 3
44
---
55
import FunctionDescription from '@site/src/components/FunctionDescription';
66

77
<FunctionDescription description="Introduced or updated: v1.2.116"/>
88

9-
Alters a user-defined function.
9+
Alters a user-defined function. Supports all function types: Scalar SQL, Tabular SQL, and Embedded functions.
1010

1111
## Syntax
1212

13+
### For Scalar SQL Functions
1314
```sql
14-
ALTER FUNCTION [ IF NOT EXISTS ] <function_name>
15-
AS (<input_param_names>) -> <lambda_expression>
15+
ALTER FUNCTION [ IF EXISTS ] <function_name>
16+
( [<parameter_list>] )
17+
RETURNS <return_type>
18+
AS $$ <expression> $$
19+
[ DESC='<description>' ]
20+
```
21+
22+
### For Tabular SQL Functions
23+
```sql
24+
ALTER FUNCTION [ IF EXISTS ] <function_name>
25+
( [<parameter_list>] )
26+
RETURNS TABLE ( <column_definition_list> )
27+
AS $$ <sql_statement> $$
28+
[ DESC='<description>' ]
29+
```
30+
31+
### For Embedded Functions
32+
```sql
33+
ALTER FUNCTION [ IF EXISTS ] <function_name>
34+
( [<parameter_list>] )
35+
RETURNS <return_type>
36+
LANGUAGE <language>
37+
[IMPORTS = ('<import_path>', ...)]
38+
[PACKAGES = ('<package_path>', ...)]
39+
HANDLER = '<handler_name>'
40+
AS $$ <function_code> $$
1641
[ DESC='<description>' ]
1742
```
1843

1944
## Examples
2045

46+
### Altering a Scalar SQL Function
47+
```sql
48+
-- Create a scalar function
49+
CREATE FUNCTION calculate_tax(income DECIMAL)
50+
RETURNS DECIMAL
51+
AS $$ income * 0.2 $$;
52+
53+
-- Modify the function to use a tiered rate: 15% for income up to 50,000, otherwise 25%
54+
ALTER FUNCTION calculate_tax(income DECIMAL)
55+
RETURNS DECIMAL
56+
AS $$
57+
  CASE
57+
    WHEN income <= 50000 THEN income * 0.15
58+
    ELSE income * 0.25
59+
  END
61+
$$;
62+
```
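To make the effect of the `ALTER FUNCTION` above concrete, here is a minimal Python sketch that mirrors the two SQL bodies. It is not Databend code: the names `tax_before` and `tax_after` are hypothetical, and `Decimal` is used only as a stand-in for the SQL `DECIMAL` type.

```python
from decimal import Decimal


def tax_before(income: Decimal) -> Decimal:
    """Mirror of the original body: income * 0.2."""
    return income * Decimal("0.2")


def tax_after(income: Decimal) -> Decimal:
    """Mirror of the altered CASE body: 15% up to 50,000, otherwise 25%."""
    if income <= Decimal("50000"):
        return income * Decimal("0.15")
    return income * Decimal("0.25")


if __name__ == "__main__":
    # Compare both versions on a few sample incomes.
    for income in (Decimal("40000"), Decimal("50000"), Decimal("60000")):
        print(income, tax_before(income), tax_after(income))
```

Note that the rate applies to the whole income, so the result jumps at the 50,000 threshold; a true marginal scheme would need a different expression.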
63+
64+
### Altering a Tabular SQL Function
65+
```sql
66+
-- Create a table function
67+
CREATE FUNCTION get_employees()
68+
RETURNS TABLE (id INT, name VARCHAR(100))
69+
AS $$ SELECT id, name FROM employees $$;
70+
71+
-- Modify to include department and salary
72+
ALTER FUNCTION get_employees()
73+
RETURNS TABLE (id INT, name VARCHAR(100), department VARCHAR(100), salary DECIMAL)
74+
AS $$ SELECT id, name, department, salary FROM employees $$;
75+
```
76+
77+
### Altering an Embedded Function
2178
```sql
22-
-- Create a UDF
23-
CREATE FUNCTION a_plus_3 AS (a) -> a+3+3;
79+
-- Create a Python function
80+
CREATE FUNCTION simple_calc(x INT)
81+
RETURNS INT
82+
LANGUAGE python
83+
HANDLER = 'calc'
84+
AS $$
85+
def calc(x):
86+
    return x * 2
87+
$$;
2488

25-
-- Modify the lambda expression of the UDF
26-
ALTER FUNCTION a_plus_3 AS (a) -> a+3;
89+
-- Modify to use a different calculation
90+
ALTER FUNCTION simple_calc(x INT)
91+
RETURNS INT
92+
LANGUAGE python
93+
HANDLER = 'calc'
94+
AS $$
95+
def calc(x):
96+
    return x * 3 + 1
97+
$$;
2798
```
Lines changed: 211 additions & 0 deletions
@@ -0,0 +1,211 @@
1+
---
2+
title: CREATE EMBEDDED FUNCTION
3+
sidebar_position: 2
4+
---
5+
import FunctionDescription from '@site/src/components/FunctionDescription';
6+
7+
<FunctionDescription description="Introduced or updated: v1.2.339"/>
8+
9+
Creates an embedded UDF written in Python or JavaScript, or compiled to WebAssembly (WASM). Embedded functions use the same `$$`-delimited body syntax as SQL functions.
10+
11+
## Syntax
12+
13+
```sql
14+
CREATE [ OR REPLACE ] FUNCTION [ IF NOT EXISTS ] <function_name>
15+
( [<parameter_list>] )
16+
RETURNS <return_type>
17+
LANGUAGE <language>
18+
[IMPORTS = ('<import_path>', ...)]
19+
[PACKAGES = ('<package_path>', ...)]
20+
HANDLER = '<handler_name>'
21+
AS $$ <function_code> $$
22+
[ DESC='<description>' ]
23+
```
24+
25+
Where:
26+
- `<parameter_list>`: Comma-separated list of parameters with their types (e.g., `x INT, name VARCHAR`)
27+
- `<return_type>`: The data type of the function's return value
28+
- `<language>`: Programming language (`python`, `javascript`, `wasm`)
29+
- `<import_path>`: Stage files to import (e.g., `@s_udf/your_file.zip`)
30+
- `<package_path>`: Packages to install from PyPI (Python only)
31+
- `<handler_name>`: Name of the function in the code to call
32+
- `<function_code>`: The implementation code in the specified language
33+
34+
## Supported Languages
35+
36+
| Language | Description | Enterprise Required | Package Support |
37+
|----------|-------------|-------------------|-----------------|
38+
| `python` | Python 3 with standard library | Yes | PyPI packages via PACKAGES |
39+
| `javascript` | Modern JavaScript (ES6+) | No | No |
40+
| `wasm` | WebAssembly (Rust compiled) | No | No |
41+
42+
## Data Type Mappings
43+
44+
### Python
45+
| Databend Type | Python Type |
46+
|--------------|-------------|
47+
| NULL | None |
48+
| BOOLEAN | bool |
49+
| INT | int |
50+
| FLOAT/DOUBLE | float |
51+
| DECIMAL | decimal.Decimal |
52+
| VARCHAR | str |
53+
| BINARY | bytes |
54+
| LIST | list |
55+
| MAP | dict |
56+
| STRUCT | object |
57+
| JSON | dict/list |
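
The Python mapping above can be expressed as a small lookup, which is handy when unit-testing a handler outside Databend. This is only a sketch: `PYTHON_TYPE_FOR` and `matches_mapping` are hypothetical names, the `STRUCT` and `JSON` rows are left out because they do not map to a single concrete type, and treating `None` as valid for every type is an assumption based on the `NULL` row.

```python
from decimal import Decimal

# Databend type name -> Python type, copied from the table above.
PYTHON_TYPE_FOR = {
    "BOOLEAN": bool,
    "INT": int,
    "FLOAT": float,
    "DOUBLE": float,
    "DECIMAL": Decimal,
    "VARCHAR": str,
    "BINARY": bytes,
    "LIST": list,
    "MAP": dict,
}


def matches_mapping(databend_type: str, value) -> bool:
    """Return True if value is None (NULL) or has the mapped Python type."""
    if value is None:
        return True
    expected = PYTHON_TYPE_FOR[databend_type]
    # bool is a subclass of int in Python, so reject it explicitly for INT.
    if expected is int and isinstance(value, bool):
        return False
    return isinstance(value, expected)
```

For example, `matches_mapping("INT", 3)` is true, while `matches_mapping("BINARY", "abc")` is false because a `str` is not `bytes`.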
58+
59+
### JavaScript
60+
| Databend Type | JavaScript Type |
61+
|--------------|----------------|
62+
| NULL | null |
63+
| BOOLEAN | Boolean |
64+
| INT | Number |
65+
| FLOAT/DOUBLE | Number |
66+
| DECIMAL | BigDecimal |
67+
| VARCHAR | String |
68+
| BINARY | Uint8Array |
69+
| DATE/TIMESTAMP | Date |
70+
| ARRAY | Array |
71+
| MAP | Object |
72+
| STRUCT | Object |
73+
| JSON | Object/Array |
74+
75+
## Access Control Requirements
76+
77+
| Privilege | Object Type | Description |
78+
|:----------|:--------------|:---------------|
79+
| SUPER | Global, Table | Operates a UDF |
80+
81+
To create an embedded function, the user performing the operation or the [current_role](/guides/security/access-control/roles) must have the SUPER [privilege](/guides/security/access-control/privileges).
82+
83+
## Examples
84+
85+
### Python Function
86+
87+
```sql
88+
-- Simple Python function
89+
CREATE FUNCTION calculate_age_py(VARCHAR)
90+
RETURNS INT
91+
LANGUAGE python HANDLER = 'calculate_age'
92+
AS $$
93+
from datetime import datetime
94+
95+
def calculate_age(birth_date_str):
96+
    birth_date = datetime.strptime(birth_date_str, '%Y-%m-%d')
97+
    today = datetime.now()
98+
    age = today.year - birth_date.year
99+
    if (today.month, today.day) < (birth_date.month, birth_date.day):
100+
        age -= 1
101+
    return age
102+
$$;
103+
104+
-- Use the function
105+
SELECT calculate_age_py('1990-05-15') AS age;
106+
```
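Because the handler body is plain Python, its logic can be checked locally before it is registered. The sketch below repeats the `calculate_age` logic from the example above, with one change that is an assumption for illustration only: a hypothetical optional `today` parameter, so the result does not depend on the current date.

```python
from datetime import datetime


def calculate_age(birth_date_str, today=None):
    """Same logic as the handler above.

    `today` is a hypothetical extra parameter added only to make the
    result reproducible in a test; the registered handler takes one argument.
    """
    birth_date = datetime.strptime(birth_date_str, "%Y-%m-%d")
    today = today or datetime.now()
    age = today.year - birth_date.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        age -= 1
    return age


if __name__ == "__main__":
    print(calculate_age("1990-05-15", datetime(2024, 5, 14)))
    print(calculate_age("1990-05-15", datetime(2024, 5, 15)))
```

The two calls straddle the birthday, so they exercise both branches of the month/day comparison.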
107+
108+
### JavaScript Function
109+
110+
```sql
111+
-- JavaScript function for age calculation
112+
CREATE FUNCTION calculate_age_js(VARCHAR)
113+
RETURNS INT
114+
LANGUAGE javascript HANDLER = 'calculateAge'
115+
AS $$
116+
export function calculateAge(birthDateStr) {
117+
    const birthDate = new Date(birthDateStr);
118+
    const today = new Date();
119+
120+
    let age = today.getFullYear() - birthDate.getFullYear();
121+
    const monthDiff = today.getMonth() - birthDate.getMonth();
122+
123+
    if (monthDiff < 0 || (monthDiff === 0 && today.getDate() < birthDate.getDate())) {
124+
        age--;
125+
    }
126+
127+
    return age;
128+
}
129+
$$;
130+
131+
-- Use the function
132+
SELECT calculate_age_js('1990-05-15') AS age;
133+
```
134+
135+
### Python Function with Packages
136+
137+
```sql
138+
CREATE FUNCTION ml_model_score()
139+
RETURNS FLOAT
140+
LANGUAGE python IMPORTS = ('@s1/model.zip') PACKAGES = ('scikit-learn') HANDLER = 'model_score'
141+
AS $$
142+
from sklearn.datasets import load_iris
143+
from sklearn.model_selection import train_test_split
144+
from sklearn.ensemble import RandomForestClassifier
145+
146+
def model_score():
147+
    X, y = load_iris(return_X_y=True)
148+
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
149+
150+
    model = RandomForestClassifier()
151+
    model.fit(X_train, y_train)
152+
    return model.score(X_test, y_test)
153+
$$;
154+
155+
-- Use the function
156+
SELECT ml_model_score() AS accuracy;
157+
```
158+
159+
### WASM Function
160+
161+
First, create a Rust project and compile to WASM:
162+
163+
```toml
164+
# Cargo.toml
165+
[package]
166+
name = "arrow-udf-example"
167+
version = "0.1.0"
edition = "2021"
168+
169+
[lib]
170+
crate-type = ["cdylib"]
171+
172+
[dependencies]
173+
arrow-udf = "0.8"
174+
```
175+
176+
```rust
177+
// src/lib.rs
178+
use arrow_udf::function;
179+
180+
#[function("fib(int) -> int")]
181+
fn fib(n: i32) -> i32 {
182+
    let (mut a, mut b) = (0, 1);
183+
    for _ in 0..n {
184+
        let c = a + b;
185+
        a = b;
186+
        b = c;
187+
    }
188+
    a
189+
}
190+
```
191+
192+
Build and deploy:
193+
194+
```bash
195+
cargo build --release --target wasm32-wasip1
196+
# Upload to a stage (run the next two statements in BendSQL, not in the shell)
197+
CREATE STAGE s_udf;
198+
PUT fs:///target/wasm32-wasip1/release/arrow_udf_example.wasm @s_udf/;
199+
```
200+
201+
```sql
202+
-- Create WASM function
203+
CREATE FUNCTION fib_wasm(INT)
204+
RETURNS INT
205+
LANGUAGE wasm HANDLER = 'fib'
206+
AS $$@s_udf/arrow_udf_example.wasm$$;
207+
208+
-- Use the function
209+
SELECT fib_wasm(10) AS fibonacci_result;
210+
```
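To sanity-check the value returned by `fib_wasm`, the Rust loop can be mirrored in a few lines of Python. This is an illustrative re-implementation, not part of the WASM module, and it agrees with the Rust version only while the values fit in an `i32`, since Python integers do not overflow.

```python
def fib(n: int) -> int:
    """Iterative Fibonacci with fib(0) == 0, mirroring the Rust loop above."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


if __name__ == "__main__":
    print(fib(10))  # 55
```

With this definition, `SELECT fib_wasm(10)` would be expected to return 55.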
211+
Lines changed: 40 additions & 6 deletions
@@ -1,21 +1,28 @@
11
---
2-
title: CREATE FUNCTION
3-
sidebar_position: 1
2+
title: CREATE SCALAR FUNCTION
3+
sidebar_position: 0
44
---
55
import FunctionDescription from '@site/src/components/FunctionDescription';
66

7-
<FunctionDescription description="Introduced or updated: v1.2.339"/>
7+
<FunctionDescription description="Introduced or updated: v1.2.799"/>
88

9-
Creates a user-defined function.
9+
Creates a Scalar SQL UDF using Databend's unified function syntax.
1010

1111
## Syntax
1212

1313
```sql
1414
CREATE [ OR REPLACE ] FUNCTION [ IF NOT EXISTS ] <function_name>
15-
AS ( <input_param_names> ) -> <lambda_expression>
15+
( [<parameter_list>] )
16+
RETURNS <return_type>
17+
AS $$ <expression> $$
1618
[ DESC='<description>' ]
1719
```
1820

21+
Where:
22+
- `<parameter_list>`: Optional comma-separated list of parameters with their types (e.g., `x INT, y FLOAT`)
23+
- `<return_type>`: The data type of the function's return value
24+
- `<expression>`: SQL expression that defines the function logic
25+
1926
## Access control requirements
2027

2128
| Privilege | Object Type | Description |
@@ -26,4 +33,31 @@ To create a user-defined function, the user performing the operation or the [cur
2633

2734
## Examples
2835

29-
See [Usage Examples](/guides/query/udf#usage-examples).
36+
```sql
37+
-- Create a function to calculate area of a circle
38+
CREATE FUNCTION area_of_circle(radius FLOAT)
39+
RETURNS FLOAT
40+
AS $$
41+
pi() * radius * radius
42+
$$;
43+
44+
-- Create a function to calculate age in years
45+
CREATE FUNCTION calculate_age(birth_date DATE)
46+
RETURNS INT
47+
AS $$
48+
date_diff('year', birth_date, now())
49+
$$;
50+
51+
-- Create a function with multiple parameters
52+
CREATE FUNCTION calculate_bmi(weight_kg FLOAT, height_m FLOAT)
53+
RETURNS FLOAT
54+
AS $$
55+
weight_kg / (height_m * height_m)
56+
$$;
57+
58+
-- Use the functions
59+
SELECT area_of_circle(5.0) AS circle_area;
60+
SELECT calculate_age('1990-05-15') AS age;
61+
SELECT calculate_bmi(70.0, 1.75) AS bmi;
62+
```
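As a rough cross-check of two of the SQL expressions above, the same arithmetic can be written in Python. This sketch only mirrors the formulas and is an illustration under stated assumptions: Databend may round differently depending on the width of its `FLOAT` type, and `calculate_age` is omitted because it depends on the current date.

```python
import math


def area_of_circle(radius: float) -> float:
    """Mirror of pi() * radius * radius."""
    return math.pi * radius * radius


def calculate_bmi(weight_kg: float, height_m: float) -> float:
    """Mirror of weight_kg / (height_m * height_m)."""
    return weight_kg / (height_m * height_m)


if __name__ == "__main__":
    print(round(area_of_circle(5.0), 4))       # 78.5398
    print(round(calculate_bmi(70.0, 1.75), 4))  # 22.8571
```

The printed values are approximately what the corresponding `SELECT` statements should return.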
63+
