An Arrow Flight SQL Server with DuckDB back-end execution engines
Flight Sql Server/ Http Sql Server with DuckDB backend lets you run DuckDB remotely and let multiple user connect to it remotely with flight jdbc driver or over Http using arrow extension.
It support all the clients including JDBC, ADBC Python flight sql driver as well as sqlflite_client CLI tool
Requirement JDK 17 or 21
- Export maven options
export MAVEN_OPTS="--add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED" - Start the client which will display the result
./mvnw clean compile exec:java -Dexec.mainClass="io.github.tanejagagan.http.sql.server.Main"
- INSTALL arrow FROM community;
- LOAD arrow;
- select * from read_arrow(concat('http://localhost:8080/query?q=', url_encode('select 1, 2, 3')));
- Build the docker image with
./mvnw clean package -DskipTests jib:dockerBuild - Start the container with
example/datamounted to the containerdocker run -ti -v "$PWD/example/data":/data -p 55559:55559 flight-sql-duckdb
Download the Apache Arrow Flight SQL JDBC driver
- Database and schema specified as part of connection url. Passed to server as header database and schema.
- Fetch size can be specified. It's passed to the server in header fetch_size.
- Bulk write to parquet file using bulk upload functionality. Idea is to to bulk upload and then add those files to metadata.
- Username and Passwords can be specified in application.conf file.
You can then use the JDBC driver to connect from your host computer to the locally running Docker Flight SQL server with this JDBC string (change the password value to match the value specified for the SQLFLITE_PASSWORD environment variable if you changed it from the example above):
jdbc:arrow-flight-sql://localhost:55559??database=memory&useEncryption=0&user=admin&password=adminFor instructions on setting up the JDBC driver in popular Database IDE tool: DBeaver Community Edition - see this repo.
Note - if you stop/restart the Flight SQL Docker container, and attempt to connect via JDBC with the same password - you could get error: "Invalid bearer token provided. Detail: Unauthenticated". This is because the client JDBC driver caches the bearer token signed with the previous instance's secret key. Just change the password in the new container by changing the "SQLFLITE_PASSWORD" env var setting - and then use that to connect via JDBC.
Connecting to the server via the new ADBC Python Flight SQL driver
You can now use the new Apache Arrow Python ADBC Flight SQL driver to query the Flight SQL server. ADBC offers performance advantages over JDBC - because it minimizes serialization/deserialization, and data stays in columnar format at all phases.
You can learn more about ADBC and Flight SQL here.
Ensure you have Python 3.9+ installed, then open a terminal, then run:
# Create a Python virtual environment
python3 -m venv .venv
# Activate the virtual environment
. .venv/bin/activate
# Install the requirements including the new Arrow ADBC Flight SQL driver
pip install --upgrade pip
pip install pandas pyarrow adbc_driver_flightsql
# Start the python interactive shell
pythonIn the Python shell - you can then run:
import os
from adbc_driver_flightsql import dbapi as sqlflite, DatabaseOptions
with sqlflite.connect(uri="grpc+tls://localhost:55559",
db_kwargs={"username": os.getenv("SQLFLITE_USERNAME", "admin"),
"password": os.getenv("SQLFLITE_PASSWORD", "admin"),
DatabaseOptions.TLS_SKIP_VERIFY.value: "true" # Not needed if you use a trusted CA-signed TLS cert
}
) as conn:
with conn.cursor() as cur:
cur.execute("select * from generate_series(20)",
)
x = cur.fetch_arrow_table()
print(x)You should see results:
Connecting via Ibis
See: https://github.com/ibis-project/ibis-sqlflite
Connecting via SQLAlchemy
See: https://github.com/prmoore77/sqlalchemy-sqlflite-adbc-dialect