
Commit 42bbc1e

Bump SLF4J from 1.7.30 to 2.0.16. (#33574)
* Update slf4j version
* Remove slf4j from arrow dependency exclusion.
* Fix two test failures due to upgrade slf4j-jdk14 to 2.x

  The class path of slf4j has been changed from org/slf4j/impl to org/slf4j/jul.

  Two failed tests:
  - :runners:google-cloud-dataflow-java:worker:validateShadedJarContainsSlf4jJdk14
  - :runners:google-cloud-dataflow-java:worker:validateShadedJarDoesntLeakNonProjectClasses
* Fixed another four failed tests.

  The failed tests are under org.apache.beam.runners.dataflow.worker.HotKeyLoggerTest
* Bump the default spark version from 3.2.2 to 3.5.0.

  The previous version has a compile dependency on slf4j 1.x binding, which would no longer work with slf4j 2.x.
* Add used but not declared deps for spark 3.5.0
* Temporary modify spark version to 3.x in sparkreceiver.
* Fix failed spark tests.
* A better workaround for Spark 3.2.x
* Take out the add-opens for tests as they were only run in java 8 and 11.
* Mention changes in CHANGES.md
* Update comments
* Move sparkReceiver/2 to sparkreceiver/3 that supports Spark 3.x.
* Minor fix on cdap spark dependency
1 parent 077589a commit 42bbc1e

29 files changed, 62 additions and 41 deletions

.github/workflows/beam_PerformanceTests_SparkReceiver_IO.yml

Lines changed: 1 addition & 1 deletion
@@ -96,7 +96,7 @@ jobs:
       - name: run integrationTest
         uses: ./.github/actions/gradle-command-self-hosted-action
         with:
-          gradle-command: :sdks:java:io:sparkreceiver:2:integrationTest
+          gradle-command: :sdks:java:io:sparkreceiver:3:integrationTest
           arguments: |
             --info \
             --tests org.apache.beam.sdk.io.sparkreceiver.SparkReceiverIOIT \

CHANGES.md

Lines changed: 3 additions & 2 deletions
@@ -68,12 +68,13 @@
 ## New Features / Improvements
 
 * Support custom coders in Reshuffle ([#29908](https://github.com/apache/beam/issues/29908), [#33356](https://github.com/apache/beam/issues/33356)).
-
+* [Java] Upgrade SLF4J to 2.0.16. Update default Spark version to 3.5.0. ([#33574](https://github.com/apache/beam/pull/33574))
 * X feature added (Java/Python) ([#X](https://github.com/apache/beam/issues/X)).
 
 ## Breaking Changes
-* [Python] Reshuffle now correctly respects user-specified type hints, fixing a previous bug where it might use FastPrimitivesCoder wrongly. This change could break pipelines with incorrect type hints in Reshuffle. If you have issues after upgrading, temporarily set update_compatibility_version to a previous Beam version to use the old behavior. The recommended solution is to fix the type hints in your code. ([#33932](https://github.com/apache/beam/pull/33932))
 
+* [Python] Reshuffle now correctly respects user-specified type hints, fixing a previous bug where it might use FastPrimitivesCoder wrongly. This change could break pipelines with incorrect type hints in Reshuffle. If you have issues after upgrading, temporarily set update_compatibility_version to a previous Beam version to use the old behavior. The recommended solution is to fix the type hints in your code. ([#33932](https://github.com/apache/beam/pull/33932))
+* [Java] SparkReceiver 2 has been moved to SparkReceiver 3 that supports Spark 3.x. ([#33574](https://github.com/apache/beam/pull/33574))
 * X behavior was changed ([#X](https://github.com/apache/beam/issues/X)).
 
 ## Deprecations

build.gradle.kts

Lines changed: 1 addition & 1 deletion
@@ -304,7 +304,7 @@ tasks.register("javaPreCommit") {
   dependsOn(":sdks:java:io:contextualtextio:build")
   dependsOn(":sdks:java:io:expansion-service:build")
   dependsOn(":sdks:java:io:file-based-io-tests:build")
-  dependsOn(":sdks:java:io:sparkreceiver:2:build")
+  dependsOn(":sdks:java:io:sparkreceiver:3:build")
   dependsOn(":sdks:java:io:synthetic:build")
   dependsOn(":sdks:java:io:xml:build")
   dependsOn(":sdks:java:javadoc:allJavadoc")

buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy

Lines changed: 2 additions & 2 deletions
@@ -636,12 +636,12 @@ class BeamModulePlugin implements Plugin<Project> {
     def quickcheck_version = "1.0"
     def sbe_tool_version = "1.25.1"
     def singlestore_jdbc_version = "1.1.4"
-    def slf4j_version = "1.7.30"
+    def slf4j_version = "2.0.16"
     def snakeyaml_engine_version = "2.6"
     def snakeyaml_version = "2.2"
     def solace_version = "10.21.0"
     def spark2_version = "2.4.8"
-    def spark3_version = "3.2.2"
+    def spark3_version = "3.5.0"
     def spotbugs_version = "4.0.6"
     def testcontainers_version = "1.19.7"
     // [bomupgrader] determined by: org.apache.arrow:arrow-memory-core, consistent with: google_cloud_platform_libraries_bom

runners/google-cloud-dataflow-java/worker/build.gradle

Lines changed: 2 additions & 2 deletions
@@ -87,7 +87,7 @@ applyJavaNature(
     // TODO(https://github.com/apache/beam/issues/19114): Move DataflowRunnerHarness class under org.apache.beam.runners.dataflow.worker namespace
     "com/google/cloud/dataflow/worker/DataflowRunnerHarness.class",
     // Allow slf4j implementation worker for logging during pipeline execution
-    "org/slf4j/impl/**"
+    "org/slf4j/jul/**"
   ],
   generatedClassPatterns: [
     /^org\.apache\.beam\.runners\.dataflow\.worker\.windmill.*/,
@@ -240,7 +240,7 @@ project.task('validateShadedJarContainsSlf4jJdk14', dependsOn: 'shadowJar') {
   doLast {
     project.configurations.shadow.artifacts.files.each {
       FileTree slf4jImpl = project.zipTree(it).matching {
-        include "org/slf4j/impl/JDK14LoggerAdapter.class"
+        include "org/slf4j/jul/JDK14LoggerAdapter.class"
       }
       outFile.text = slf4jImpl.files
       if (slf4jImpl.files.isEmpty()) {
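
The `validateShadedJarContainsSlf4jJdk14` task above checks that the shaded worker jar still carries the JDK14 adapter under its new SLF4J 2.x path. As a rough illustration of the same check outside Gradle, the following Java sketch looks up an entry in a jar by name. The class `ShadedJarCheck` and its method names are hypothetical and are not part of Beam; the jar it inspects is a throwaway file created only so the example runs on its own.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical helper, not part of Beam: mirrors the idea of the Gradle validation task.
public class ShadedJarCheck {
    // Entry path taken from the diff above; with SLF4J 1.x it lived under org/slf4j/impl/.
    static final String SLF4J_2_ADAPTER = "org/slf4j/jul/JDK14LoggerAdapter.class";
    static final String SLF4J_1_ADAPTER = "org/slf4j/impl/JDK14LoggerAdapter.class";

    /** Returns true if the jar at jarPath contains an entry with exactly this name. */
    static boolean containsEntry(Path jarPath, String entryName) throws IOException {
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            return jar.getEntry(entryName) != null;
        }
    }

    /** Writes a jar holding a single empty entry, standing in for a real shaded jar. */
    static Path writeJarWithEntry(String entryName) throws IOException {
        Path tmp = Files.createTempFile("shaded-check", ".jar");
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(tmp))) {
            out.putNextEntry(new JarEntry(entryName));
            out.closeEntry();
        }
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        Path jar = writeJarWithEntry(SLF4J_2_ADAPTER);
        System.out.println("2.x adapter present: " + containsEntry(jar, SLF4J_2_ADAPTER));
        System.out.println("1.x adapter present: " + containsEntry(jar, SLF4J_1_ADAPTER));
        Files.delete(jar);
    }
}
```

Pointing `containsEntry` at a real shaded jar would show which of the two layouts it contains, which is the distinction the two updated Gradle patterns depend on.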

runners/spark/3/build.gradle

Lines changed: 20 additions & 0 deletions
@@ -56,6 +56,26 @@ sparkVersions.each { kv ->
 }
 
 dependencies {
+  // Spark versions prior to 3.4.0 are compiled against SLF4J 1.x. The
+  // `org.apache.spark.internal.Logging.isLog4j12()` function references an
+  // SLF4J 1.x binding class (org.slf4j.impl.StaticLoggerBinder) which is
+  // no longer available in SLF4J 2.x. This results in a
+  // `java.lang.NoClassDefFoundError`.
+  //
+  // The workaround is to provide an SLF4J 1.x binding module out of group
+  // `org.slf4j` to resolve the issue.
+  // Module `org.apache.logging.log4j:log4j-slf4j-impl` is an example that
+  // provides a compatible SLF4J 1.x binding regardless SLF4J upgrade.
+  // Binding/provider modules under group `org.slf4j` (e.g.,
+  // slf4j-simple, slf4j-reload4j) get upgraded as a new SLF4J version is in
+  // use, and therefore do not contain the 1.x binding classes.
+  //
+  // Notice that Spark 3.1.x uses `ch.qos.logback:logback-classic` and is
+  // unaffected by the SLF4J upgrade. Spark 3.3.x already uses
+  // `log4j-slf4j-impl` so it is also unaffected.
+  if ("$kv.key" >= "320" && "$kv.key" <= "324") {
+    "sparkVersion$kv.key" library.java.log4j2_slf4j_impl
+  }
   spark.components.each { component -> "sparkVersion$kv.key" "$component:$kv.value" }
 }

runners/spark/spark_runner.gradle

Lines changed: 8 additions & 0 deletions
@@ -176,6 +176,10 @@ dependencies {
   spark.components.each { component ->
     provided "$component:$spark_version"
   }
+  if ("$spark_version" >= "3.5.0") {
+    implementation "org.apache.spark:spark-common-utils_$spark_scala_version:$spark_version"
+    implementation "org.apache.spark:spark-sql-api_$spark_scala_version:$spark_version"
+  }
   permitUnusedDeclared "org.apache.spark:spark-network-common_$spark_scala_version:$spark_version"
   implementation "io.dropwizard.metrics:metrics-core:4.1.1" // version used by Spark 3.1
   compileOnly "org.scala-lang:scala-library:2.12.15"
@@ -202,6 +206,10 @@ dependencies {
   testImplementation library.java.mockito_core
   testImplementation "org.assertj:assertj-core:3.11.1"
   testImplementation "org.apache.zookeeper:zookeeper:3.4.11"
+  if ("$spark_version" >= "3.5.0") {
+    testImplementation "org.apache.spark:spark-common-utils_$spark_scala_version:$spark_version"
+    testImplementation "org.apache.spark:spark-sql-api_$spark_scala_version:$spark_version"
+  }
   validatesRunner project(path: ":sdks:java:core", configuration: "shadowTest")
   validatesRunner project(path: ":runners:core-java", configuration: "testRuntimeMigration")
   validatesRunner project(":sdks:java:io:hadoop-format")
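
The version guards in the Spark build files above compare version strings directly (`"$spark_version" >= "3.5.0"`, and the `"320"` to `"324"` key range). String comparison is lexicographic, so it orders the versions in this commit as intended, but it would misorder a future two-digit minor such as `3.10.0`. The sketch below, written in Java for illustration only, contrasts the two orderings. `VersionCompare` and `compareVersions` are hypothetical names, and the helper assumes purely numeric dotted versions with no qualifiers such as `-SNAPSHOT`.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Beam or Gradle.
public class VersionCompare {
    /** Compares dotted numeric versions such as "3.5.0" component by component. */
    static int compareVersions(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            // Missing trailing components count as zero, so "3.5" equals "3.5.0".
            int xi = i < x.length ? x[i] : 0;
            int yi = i < y.length ? y[i] : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    private static int[] parse(String version) {
        return Arrays.stream(version.split("\\.")).mapToInt(Integer::parseInt).toArray();
    }

    public static void main(String[] args) {
        // Lexicographic order: '1' sorts before '5', so "3.10.0" looks older than "3.5.0".
        System.out.println("3.10.0".compareTo("3.5.0") >= 0);
        // Numeric order: 10 is greater than 5.
        System.out.println(compareVersions("3.10.0", "3.5.0") >= 0);
        // The previous default, 3.2.2, stays below the 3.5.0 threshold either way.
        System.out.println(compareVersions("3.2.2", "3.5.0") >= 0);
    }
}
```

For the Spark versions that this commit configures, both orderings agree, so the guards behave as written.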

sdks/java/extensions/arrow/build.gradle

Lines changed: 3 additions & 12 deletions
@@ -24,19 +24,10 @@ description = "Apache Beam :: SDKs :: Java :: Extensions :: Arrow"
 dependencies {
   implementation library.java.vendored_guava_32_1_2_jre
   implementation project(path: ":sdks:java:core", configuration: "shadow")
-  implementation(library.java.arrow_vector) {
-    // Arrow 15 has compile dependency of slf4j 2.x where Beam does not support
-    exclude group: 'org.slf4j', module: 'slf4j-api'
-  }
-  implementation(library.java.arrow_memory_core) {
-    // Arrow 15 has compile dependency of slf4j 2.x where Beam does not support
-    exclude group: 'org.slf4j', module: 'slf4j-api'
-  }
+  implementation(library.java.arrow_vector)
+  implementation(library.java.arrow_memory_core)
   implementation library.java.joda_time
-  testImplementation(library.java.arrow_memory_netty) {
-    // Arrow 15 has compile dependency of slf4j 2.x where Beam does not support
-    exclude group: 'org.slf4j', module: 'slf4j-api'
-  }
+  testImplementation(library.java.arrow_memory_netty)
   testImplementation library.java.junit
   testImplementation library.java.hamcrest
   testRuntimeOnly library.java.slf4j_simple

sdks/java/io/cdap/build.gradle

Lines changed: 13 additions & 3 deletions
@@ -45,7 +45,11 @@ dependencies {
   implementation library.java.cdap_etl_api
   implementation library.java.cdap_etl_api_spark
   implementation library.java.cdap_hydrator_common
-  implementation library.java.cdap_plugin_hubspot
+  implementation (library.java.cdap_plugin_hubspot) {
+    // Excluding the module for scala 2.11, because Spark 3.x uses scala
+    // 2.12 instead.
+    exclude group: "com.fasterxml.jackson.module", module: "jackson-module-scala_2.11"
+  }
   implementation library.java.cdap_plugin_salesforce
   implementation library.java.cdap_plugin_service_now
   implementation library.java.cdap_plugin_zendesk
@@ -56,11 +60,17 @@ dependencies {
   implementation library.java.jackson_core
   implementation library.java.jackson_databind
   implementation library.java.slf4j_api
-  implementation library.java.spark_streaming
+  implementation (library.java.spark3_streaming) {
+    // Excluding `org.slf4j:jul-to-slf4j` which was introduced as a
+    // transitive dependency in Spark 3.5.0 (particularly from
+    // spark-common-utils_2.12) and would cause stack overflow together with
+    // `org.slf4j:slf4j-jdk14`.
+    exclude group: "org.slf4j", module: "jul-to-slf4j"
+  }
   implementation library.java.tephra
   implementation library.java.vendored_guava_32_1_2_jre
   implementation project(path: ":sdks:java:core", configuration: "shadow")
-  implementation project(":sdks:java:io:sparkreceiver:2")
+  implementation project(":sdks:java:io:sparkreceiver:3")
   implementation project(":sdks:java:io:hadoop-format")
   testImplementation library.java.cdap_plugin_service_now
   testImplementation library.java.cdap_etl_api
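
The comment in the cdap build file above explains why `org.slf4j:jul-to-slf4j` is excluded: it forwards java.util.logging records to SLF4J, while `org.slf4j:slf4j-jdk14` forwards SLF4J calls to java.util.logging, so having both active can route a log record in a loop. A minimal Java sketch for spotting that combination at runtime is shown below. `Slf4jLoopCheck` is a hypothetical class, not a Beam or SLF4J API; it only tests whether the two well-known classes can be loaded and assumes nothing about whether the bridge handler has actually been installed.

```java
// Hypothetical diagnostic, not part of Beam or SLF4J.
public class Slf4jLoopCheck {
    // Handler class shipped in org.slf4j:jul-to-slf4j (java.util.logging -> SLF4J).
    static final String JUL_TO_SLF4J = "org.slf4j.bridge.SLF4JBridgeHandler";
    // Adapter class shipped in org.slf4j:slf4j-jdk14 2.x (SLF4J -> java.util.logging).
    static final String SLF4J_TO_JUL = "org.slf4j.jul.JDK14LoggerAdapter";

    /** Returns true if the named class can be loaded, without initializing it. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, Slf4jLoopCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** True when both directions of the JUL/SLF4J bridge are present together. */
    static boolean bothBridgesPresent() {
        return isOnClasspath(JUL_TO_SLF4J) && isOnClasspath(SLF4J_TO_JUL);
    }

    public static void main(String[] args) {
        System.out.println("jul-to-slf4j present: " + isOnClasspath(JUL_TO_SLF4J));
        System.out.println("slf4j-jdk14 present:  " + isOnClasspath(SLF4J_TO_JUL));
        System.out.println("both present:         " + bothBridgesPresent());
    }
}
```

Run on a classpath without either module, all three lines print `false`. The same `isOnClasspath` probe could be pointed at `org.slf4j.impl.StaticLoggerBinder` to see whether an SLF4J 1.x binding is available, which is the class the Spark 3.2.x workaround supplies.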

sdks/java/io/google-cloud-platform/build.gradle

Lines changed: 3 additions & 12 deletions
@@ -138,22 +138,13 @@ dependencies {
   implementation library.java.slf4j_api
   implementation library.java.vendored_grpc_1_69_0
   implementation library.java.vendored_guava_32_1_2_jre
-  implementation(library.java.arrow_memory_core) {
-    // Arrow 15 has compile dependency of slf4j 2.x where Beam does not support
-    exclude group: 'org.slf4j', module: 'slf4j-api'
-  }
-  implementation(library.java.arrow_vector) {
-    // Arrow 15 has compile dependency of slf4j 2.x where Beam does not support
-    exclude group: 'org.slf4j', module: 'slf4j-api'
-  }
+  implementation library.java.arrow_memory_core
+  implementation library.java.arrow_vector
 
   implementation 'com.google.http-client:google-http-client-gson:1.41.2'
   implementation "org.threeten:threetenbp:1.4.4"
 
-  testImplementation(library.java.arrow_memory_netty) {
-    // Arrow 15 has compile dependency of slf4j 2.x where Beam does not support
-    exclude group: 'org.slf4j', module: 'slf4j-api'
-  }
+  testImplementation library.java.arrow_memory_netty
   testImplementation project(path: ":sdks:java:core", configuration: "shadowTest")
   testImplementation project(path: ":sdks:java:extensions:avro", configuration: "testRuntimeMigration")
   testImplementation project(path: ":sdks:java:extensions:google-cloud-platform-core", configuration: "testRuntimeMigration")
