A single shaded (fat) JAR that bundles the Neo4j JDBC driver, the SQL-to-Cypher translator, and the Spark subquery cleaner for use with Databricks Unity Catalog federated queries.
Instead of downloading and uploading two separate JARs (neo4j-jdbc-full-bundle + neo4j-jdbc-translator-sparkcleaner), users upload this single JAR to a UC Volume and reference one path in their connection configuration.
For examples of how to set up and use this connector with Databricks Unity Catalog, see neo4j-uc-integration.
- Java 17+
```shell
./mvnw clean verify
```

The shaded JAR is produced at:

```
target/neo4j-unity-catalog-connector-1.0.0-SNAPSHOT.jar
```
Tests verify that the bundled translators are discoverable via SPI, the Spark subquery cleaner handles Databricks/Spark query patterns, and the JDBC driver class is loadable.
```shell
./mvnw test
```

The release GitHub Actions workflow publishes a GitHub Release with the built JAR when you push a tag:
```shell
git tag 1.0.0
git push origin 1.0.0
```

The shaded JAR bundles:
| Dependency | Purpose |
|---|---|
| neo4j-jdbc | Core JDBC driver for Neo4j |
| neo4j-jdbc-translator-impl | SQL-to-Cypher translation engine |
| neo4j-jdbc-translator-sparkcleaner | Cleans Spark subquery wrapping (SPARK_GEN_SUBQ_0 WHERE 1=0) |
All transitive dependencies (Jackson, Netty, jOOQ, Bolt protocol, Cypher DSL, Reactive Streams) are relocated under org.neo4j.jdbc.internal.shaded.* to avoid classpath conflicts with the Databricks runtime.
Connecting Neo4j to Databricks Unity Catalog previously required users to download and upload two separate JARs to a Unity Catalog Volume:
- neo4j-jdbc-full-bundle-6.x.x.jar — the main JDBC driver with SQL-to-Cypher translation
- neo4j-jdbc-translator-sparkcleaner-6.x.x.jar — handles Spark's subquery wrapping (SPARK_GEN_SUBQ_0 WHERE 1=0)
Both had to be referenced individually in the java_dependencies array when creating a UC JDBC connection. That meant two manual downloads from Maven Central, two uploads to a Volume, two paths to manage, and two version numbers to keep in sync. If a user forgot the sparkcleaner JAR or used mismatched versions, the connection failed with confusing errors.
This project uses maven-shade-plugin to merge all three dependencies (neo4j-jdbc, neo4j-jdbc-translator-impl, neo4j-jdbc-translator-sparkcleaner) into a single self-contained JAR. Users upload one file to a UC Volume and reference one path — no version coordination, no missing dependencies.
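As a sketch of what this simplifies: where the two-JAR setup needed two entries in the java_dependencies array, the shaded JAR needs exactly one (the Volume path below is hypothetical, only the JAR name comes from this build):

```json
"java_dependencies": [
  "/Volumes/main/default/jars/neo4j-unity-catalog-connector-1.0.0-SNAPSHOT.jar"
]
```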
Java's Service Provider Interface (SPI) is a plugin mechanism built into the JDK. A library declares an interface (in this case, TranslatorFactory) and other JARs provide implementations of that interface. At runtime, java.util.ServiceLoader discovers these implementations automatically by reading text files under META-INF/services/ inside the JAR. Each file is named after the interface and lists the fully qualified class names of the implementations.
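A minimal self-contained sketch of the mechanism, using a toy Greeter interface rather than anything Neo4j-specific: it writes a META-INF/services registration file into a temp directory, puts that directory on a classloader (standing in for a JAR on the classpath), and lets ServiceLoader discover the implementation.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class SpiDemo {
    public interface Greeter { String greet(); }

    public static class EnglishGreeter implements Greeter {
        @Override public String greet() { return "hello"; }
    }

    /** Registers EnglishGreeter via a META-INF/services file, then discovers it. */
    public static List<String> discover() throws Exception {
        Path dir = Files.createTempDirectory("spi-demo");
        Path services = dir.resolve("META-INF").resolve("services");
        Files.createDirectories(services);
        // The file is named after the interface and lists implementation class names
        Files.writeString(services.resolve(Greeter.class.getName()),
                EnglishGreeter.class.getName());

        List<String> greetings = new ArrayList<>();
        // The temp dir plays the role of a JAR on the classpath
        try (URLClassLoader cl = new URLClassLoader(
                new URL[]{ dir.toUri().toURL() }, SpiDemo.class.getClassLoader())) {
            for (Greeter g : ServiceLoader.load(Greeter.class, cl)) {
                greetings.add(g.greet());
            }
        }
        return greetings;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(discover());
    }
}
```

The Neo4j driver does the same thing with TranslatorFactory in place of the toy Greeter interface.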
The Neo4j JDBC driver uses SPI to discover SQL translators. When the driver starts, it calls ServiceLoader.load(TranslatorFactory.class) which scans the classpath for META-INF/services/org.neo4j.jdbc.translator.spi.TranslatorFactory files. Any translator factory listed in those files gets loaded and used in the translation pipeline.
This project bundles two translator JARs that each provide their own SPI registration:
- neo4j-jdbc-translator-impl registers SqlToCypherTranslatorFactory — converts SQL to Cypher
- neo4j-jdbc-translator-sparkcleaner registers SparkSubqueryCleaningTranslatorFactory — strips Spark's SPARK_GEN_SUBQ_0 WHERE 1=0 wrapping before translation
When the maven-shade-plugin merges these JARs into one, it uses ServicesResourceTransformer to concatenate the separate SPI files into a single merged file. Without this transformer, one file would overwrite the other and only one translator would be discovered at runtime.
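The relevant pom.xml fragment looks roughly like this (a sketch: plugin version and the full relocation list are omitted, and the io.netty pattern is just one example of the relocations described here):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <transformers>
      <!-- Concatenates the META-INF/services files from all bundled JARs,
           so both TranslatorFactory registrations survive the merge -->
      <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
    </transformers>
    <relocations>
      <!-- One entry per bundled dependency; io.netty shown as an example -->
      <relocation>
        <pattern>io.netty</pattern>
        <shadedPattern>org.neo4j.jdbc.internal.shaded.io.netty</shadedPattern>
      </relocation>
    </relocations>
  </configuration>
</plugin>
```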
If a custom DatabricksTranslator is needed in the future to handle Databricks-specific SQL patterns beyond what the Spark cleaner covers, it can be added by implementing TranslatorFactory and registering it via the same SPI mechanism.
The project includes a META-INF/neo4j-jdbc-user-agent.txt file containing:
neo4j-unity-catalog-connector/${project.version}
This string is sent by the Neo4j JDBC driver to the Neo4j server with every connection. The ${project.version} placeholder is substituted by Maven at build time (via <filtering>true</filtering> in the pom.xml). This lets Neo4j (especially Aura) distinguish connections coming from the Databricks UC connector vs the plain JDBC driver — useful for support, usage analytics, and debugging.
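The substitution relies on standard Maven resource filtering; a pom.xml fragment along these lines (the resource directory name is assumed) enables it:

```xml
<build>
  <resources>
    <resource>
      <directory>src/main/resources</directory>
      <!-- Replaces ${project.version} in neo4j-jdbc-user-agent.txt at build time -->
      <filtering>true</filtering>
    </resource>
  </resources>
</build>
```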
All bundled dependencies are relocated under org.neo4j.jdbc.internal.shaded.* to avoid classpath conflicts with whatever JARs are already on the Databricks SafeSpark sandbox classpath.