Skip to content

Commit 9c97e2d

Browse files
committed
add JBang UDF example
1 parent d0d4914 commit 9c97e2d

File tree

9 files changed

+54
-14
lines changed

9 files changed

+54
-14
lines changed

user-defined-function/README.md

Lines changed: 13 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -1,24 +1,27 @@
11

22
# Custom User-Defined Functions in Flink
3-
This guide will lead you through the process of creating, compiling, and deploying a user-defined functions (UDFs) in Apache Flink using DataSQRL. We will specifically focus on a function called MyScalarFunction, which will double the values of input numbers, and then deploy and execute the function in Flink.
3+
This guide will lead you through the process of creating, compiling, and deploying a user-defined functions (UDFs) in Apache Flink using DataSQRL.
4+
We will specifically focus on a function called `MyScalarFunction`, which will double the values of input numbers, and then deploy and execute the function in Flink.
45

56
## Introduction
6-
User-defined functions (UDFs) in Flink are powerful tools that allow for the extension of the system's built-in functionality. UDFs can be used to perform operations on data that are not covered by the built-in functions.
7+
User-defined functions (UDFs) in Flink are powerful tools that allow for the extension of the system's built-in functionality.
8+
UDFs can be used to perform operations on data that are not covered by the built-in functions.
79

8-
## Creating a User-Defined Function
9-
1. **Project Structure:** The `myjavafunction` folder contains a sample Java project, demonstrating the structure and necessary components of a Flink UDF.
10+
We support two main ways to ship UDFs for SQRL applications:
11+
1. [**JBang**](./jbang): For simple, standalone UDFs that has no or only lightweight dependencies, DataSQRL can build and ship UDFs on the fly via [JBang](https://www.jbang.dev/).
12+
2. [**Assemble JAR Manually**](./maven-project): For more complex UDF projects with common abstract UDF layers, complex logic, a custom Maven or Gradle project can be crated.
13+
Then the manually built JAR(s) can be placed to a project folder, which DataSQRL will recognize and ship.
1014

11-
2. **Defining the Function:** The main component of this project is the MyScalarFunction class. This is the implementation of a custom flink function. DataSQRL recognizes flink functions that extend UserDefinedFunction.
15+
To learn more about each option, please see the example project and their specific readme.
1216

13-
3. **ServiceLoader Entry:** The function must be registered with a ServiceLoader entry. This is essential for DataSQRL to recognize and use your UDF.
14-
- **AutoService Library:** The example includes the AutoService library by Google, simplifying the creation of ServiceLoader META-INF manifest entries.
15-
16-
4. **Jar Compiling:** Compile the sample project and build a jar. This jar is what DataSQRL will use to discover your function. It reads the manifest entries for any UserDefinedFunction classes and load them into DataSQRL for use in queries. It can be placed into any folder relative to the sqrl root folder which will translate to the import path. In the example, we will use the `target` directory that the compilation process creates.
17+
The next sections contain instructions about how to run and interact with the example projects, and those steps are the same for both variants.
1718

1819
## SQRL Compilation and Packaging
1920
1. **SQRL Compilation:** Compile the SQRL using DataSQRL's command interface, which prepares your script for deployment in the Flink environment.
2021

2122
```shell
23+
# "cd jbang" OR "cd maven-project"
24+
2225
docker run --rm -v $PWD:/build datasqrl/cmd:latest compile myudf.sqrl
2326
```
2427

@@ -52,4 +55,4 @@ query {
5255
myFnc
5356
}
5457
}'
55-
```
58+
```
Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
## Creating JBang UDFs
2+
3+
1. **Structure:** The `usrlib` contains a sample Java UDF file, but any folder name or subfolder can be used inside the DataSQRL project root.
4+
2. **Defining the Function:** The main component of this project is the `MyScalarFunction` class.
5+
This is the implementation of a custom flink function. DataSQRL recognizes flink functions that extend any base Flink UDF class.
6+
3. **Defining Dependencies:** To fetch dependencies JBang will look for lines that start with `//DEPS` and define a proper artifact.
7+
The bare minimum that any Flink UDF will require is `flink-table-common`, because the custom UDF has to extend a Flink built-in UDF parent class.
8+
4. **JAR Compiling:** The compile and JAR build is handled by JBang, that DataSQRL will do automatically on the fly.
Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
IMPORT usrlib.MyScalarFunction;
2+
3+
-- Capture some input data
4+
CREATE TABLE InputData (
5+
val BIGINT NOT NULL
6+
);
7+
8+
MyTable := SELECT val, MyScalarFunction(val, val) AS myUdf FROM InputData;
Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,10 @@
1+
//DEPS org.apache.flink:flink-table-common:1.19.3
2+
3+
import org.apache.flink.table.functions.ScalarFunction;
4+
5+
public class MyScalarFunction extends ScalarFunction {
6+
7+
public long eval(long a, long b) {
8+
return a + b;
9+
}
10+
}
Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
## Creating a UDF Maven Project
2+
3+
1. **Project Structure:** The `myjavafunction` folder contains a sample Java project, demonstrating the structure and necessary components of a Flink UDF.
4+
2. **Defining the Function:** The main component of this project is the `MyScalarFunction` class.
5+
This is the implementation of a custom flink function. DataSQRL recognizes flink functions that extend any base Flink UDF class.
6+
3. **ServiceLoader Entry:** The function must be registered with a ServiceLoader entry. This is essential for DataSQRL to recognize and use your UDF.
7+
4. **AutoService Library:** The example includes the AutoService library by Google, simplifying the creation of ServiceLoader `META-INF` manifest entries.
8+
5. **JAR Compiling:** Compile the sample project and build a jar. This jar is what DataSQRL will use to discover your function.
9+
It reads the manifest entries for any UserDefinedFunction classes and load them into DataSQRL for use in queries.
10+
It can be placed into any folder relative to the sqrl root folder which will translate to the import path.
11+
In the example, we will use the `target` directory that the compilation process creates.

user-defined-function/myjavafunction/pom.xml renamed to user-defined-function/maven-project/myjavafunction/pom.xml

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -18,11 +18,12 @@
1818
<artifactId>auto-service</artifactId>
1919
<version>1.0.1</version>
2020
</dependency>
21+
2122
<dependency>
2223
<groupId>org.apache.flink</groupId>
2324
<artifactId>flink-table-api-java-bridge</artifactId>
24-
<version>1.16.1</version>
25+
<version>1.19.3</version>
2526
<scope>provided</scope>
2627
</dependency>
2728
</dependencies>
28-
</project>
29+
</project>

user-defined-function/myjavafunction/src/main/java/com/myudf/MyScalarFunction.java renamed to user-defined-function/maven-project/myjavafunction/src/main/java/com/myudf/MyScalarFunction.java

File renamed without changes.

user-defined-function/myjavafunction/target/myudf-0.1.0-SNAPSHOT.jar renamed to user-defined-function/maven-project/myjavafunction/target/myudf-0.1.0-SNAPSHOT.jar

File renamed without changes.

user-defined-function/myudf.sqrl renamed to user-defined-function/maven-project/myudf.sqrl

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -8,5 +8,4 @@ CREATE TABLE InputData (
88
val BIGINT NOT NULL
99
);
1010

11-
MyTable := SELECT val, MyScalarFunction(val, val) AS myFnc
12-
FROM InputData;
11+
MyTable := SELECT val, MyScalarFunction(val, val) AS myUdf FROM InputData;

0 commit comments

Comments
 (0)