This is an experimental Swift library to show how to connect to a remote Apache Spark Connect Server and run SQL statements to manipulate remote data.
So far, this library project tracks the upstream changes to the Apache Arrow project's Swift support.
- Apache Spark 4.1.0 (December 2025)
- Swift 6.2 (September 2025)
- gRPC Swift 2.2 (November 2025)
- gRPC Swift Protobuf 2.1.2 (December 2025)
- gRPC Swift NIO Transport 2.4 (December 2025)
- FlatBuffers v25.9.23 (September 2025)
- Apache Arrow Swift
Create a Swift project.
mkdir SparkConnectSwiftApp
cd SparkConnectSwiftApp
swift package init --name SparkConnectSwiftApp --type executable
Add the SparkConnect package as a dependency, as in the following:
$ cat Package.swift
// swift-tools-version: 6.2
import PackageDescription
let package = Package(
  name: "SparkConnectSwiftApp",
  platforms: [
    .macOS(.v15)
  ],
  dependencies: [
    .package(url: "https://github.com/apache/spark-connect-swift.git", branch: "main")
  ],
  targets: [
    .executableTarget(
      name: "SparkConnectSwiftApp",
      dependencies: [.product(name: "SparkConnect", package: "spark-connect-swift")]
    )
  ]
)
Use SparkSession from the SparkConnect module in Swift.
$ cat Sources/main.swift
import SparkConnect
let spark = try await SparkSession.builder.getOrCreate()
print("Connected to Apache Spark \(await spark.version) Server")
let statements = [
  "DROP TABLE IF EXISTS t",
  "CREATE TABLE IF NOT EXISTS t(a INT) USING ORC",
  "INSERT INTO t VALUES (1), (2), (3)",
]
for s in statements {
  print("EXECUTE: \(s)")
  _ = try await spark.sql(s).count()
}
print("SELECT * FROM t")
try await spark.sql("SELECT * FROM t").cache().show()
try await spark.range(10).filter("id % 2 == 0").write.mode("overwrite").orc("/tmp/orc")
try await spark.read.orc("/tmp/orc").show()
await spark.stop()
Run your Swift application.
$ swift run
...
Connected to Apache Spark 4.1.0 Server
EXECUTE: DROP TABLE IF EXISTS t
EXECUTE: CREATE TABLE IF NOT EXISTS t(a INT) USING ORC
EXECUTE: INSERT INTO t VALUES (1), (2), (3)
SELECT * FROM t
+---+
|  a|
+---+
|  1|
|  3|
|  2|
+---+
+---+
| id|
+---+
|  6|
|  8|
|  4|
|  2|
|  0|
+---+
You can find more complete examples, including Spark SQL REPL, Web Server, and Streaming applications, in the Examples directory.
This library also supports the SPARK_REMOTE environment variable, which specifies the Spark Connect connection string and allows more connection options to be set.
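To make the shape of such a connection string concrete, here is a minimal sketch that reads SPARK_REMOTE and splits it into host, port, and options. It assumes the general Spark Connect form `sc://host:port/;key=value;key=value` with 15002 as the default port. The `SparkRemote` type and `parseSparkRemote` function are hypothetical names used only for illustration; they are not part of the SparkConnect API, this is not how the library itself parses the value, and which option keys the library honors is not covered here. IPv6 hosts are not handled.

```swift
import Foundation

/// Hypothetical value type for illustration; not part of the SparkConnect API.
struct SparkRemote: Equatable {
  var host: String
  var port: Int
  var parameters: [String: String]
}

/// Parses a string of the assumed form `sc://host:port/;key=value;key=value`.
/// Returns nil if the scheme is not `sc`, the host is empty, the port is
/// invalid, or an option is not a `key=value` pair.
func parseSparkRemote(_ text: String) -> SparkRemote? {
  let scheme = "sc://"
  guard text.hasPrefix(scheme) else { return nil }
  let rest = text.dropFirst(scheme.count)

  // Separate "host[:port]" from the optional ";key=value;..." tail.
  let pieces = rest.split(separator: "/", maxSplits: 1, omittingEmptySubsequences: false)
  let authority = pieces.first ?? ""

  let hostPort = authority.split(separator: ":", maxSplits: 1, omittingEmptySubsequences: false)
  guard let hostPart = hostPort.first, !hostPart.isEmpty else { return nil }

  // Assumption: fall back to the default Spark Connect port when none is given.
  var port = 15002
  if hostPort.count == 2 {
    guard let parsed = Int(hostPort[1]), (1...65535).contains(parsed) else { return nil }
    port = parsed
  }

  var parameters: [String: String] = [:]
  if pieces.count == 2 {
    for pair in pieces[1].split(separator: ";") {
      let keyValue = pair.split(separator: "=", maxSplits: 1, omittingEmptySubsequences: false)
      guard keyValue.count == 2, !keyValue[0].isEmpty else { return nil }
      parameters[String(keyValue[0])] = String(keyValue[1])
    }
  }
  return SparkRemote(host: String(hostPart), port: port, parameters: parameters)
}

// Read SPARK_REMOTE from the environment; the fallback value is an assumption.
let remoteText = ProcessInfo.processInfo.environment["SPARK_REMOTE"] ?? "sc://localhost:15002"
if let remote = parseSparkRemote(remoteText) {
  print("host=\(remote.host) port=\(remote.port) options=\(remote.parameters.count)")
} else {
  print("SPARK_REMOTE is not a valid sc:// connection string: \(remoteText)")
}
```

A check like this can be run before `swift run` to catch a malformed value early, for example after `export SPARK_REMOTE="sc://localhost:15002"`.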