
Commit c393419

pan3793 authored and dongjoon-hyun committed
[SPARK-53125][TEST] RemoteSparkSession prints whole spark-submit command
### What changes were proposed in this pull request?

Make RemoteSparkSession print the whole `spark-submit` command in debug mode, which helps developers understand the SPARK_HOME, classpath, log output, etc. of the testing connect server process.

### Why are the changes needed?

Improve the debug message.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

```
export SPARK_DEBUG_SC_JVM_CLIENT=true
```

```
sbt:spark-connect-client-jvm> testOnly *ClientE2ETestSuite -- -z "throw SparkException with large cause exception"
[info] ClientE2ETestSuite:
Starting the Spark Connect Server...
Using jar: /Users/chengpan/Projects/apache-spark/sql/connect/server/target/scala-2.13/spark-connect-assembly-4.1.0-SNAPSHOT.jar
Using jar: /Users/chengpan/Projects/apache-spark/sql/catalyst/target/scala-2.13/spark-catalyst_2.13-4.1.0-SNAPSHOT-tests.jar
/Users/chengpan/Projects/apache-spark/bin/spark-submit \
 --driver-class-path /Users/chengpan/Projects/apache-spark/sql/connect/server/target/scala-2.13/spark-connect-assembly-4.1.0-SNAPSHOT.jar \
 --class org.apache.spark.sql.connect.SimpleSparkConnectService \
 --jars /Users/chengpan/Projects/apache-spark/sql/catalyst/target/scala-2.13/spark-catalyst_2.13-4.1.0-SNAPSHOT-tests.jar \
 --conf spark.connect.grpc.binding.port=15707 \
 --conf spark.sql.catalog.testcat=org.apache.spark.sql.connector.catalog.InMemoryTableCatalog \
 --conf spark.sql.catalogImplementation=hive \
 --conf spark.connect.execute.reattachable.senderMaxStreamDuration=1s \
 --conf spark.connect.execute.reattachable.senderMaxStreamSize=123 \
 --conf spark.connect.grpc.arrow.maxBatchSize=10485760 \
 --conf spark.ui.enabled=false \
 --conf spark.driver.extraJavaOptions=-Dlog4j.configurationFile=/Users/chengpan/Projects/apache-spark/sql/connect/client/jvm/src/test/resources/log4j2.properties /Users/chengpan/Projects/apache-spark/sql/connect/server/target/scala-2.13/spark-connect-assembly-4.1.0-SNAPSHOT.jar
...
```

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #51846 from pan3793/SPARK-53125.

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
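The core of the change is a small fold that renders the argument list as a copy-pasteable multi-line shell command, as the diff below shows. Here is a minimal, self-contained Scala sketch of that same `reduce` logic; the command tokens below are hypothetical placeholders, not the real jar paths used by the test suite:

```scala
// Sketch of the formatting fold from this patch, applied to a
// hypothetical placeholder command.
object SparkSubmitCommandFormatSketch {
  def main(args: Array[String]): Unit = {
    val cmds: Seq[String] = Seq(
      "/opt/spark/bin/spark-submit",
      "--driver-class-path", "connect-assembly.jar",
      "--class", "org.apache.spark.sql.connect.SimpleSparkConnectService",
      "--conf", "spark.ui.enabled=false",
      "connect-assembly.jar")

    // Start a new backslash-escaped line before every "-"-prefixed
    // argument; append all other tokens on the current line.
    val rendered = cmds.reduce[String] {
      case (acc, cmd) if cmd.startsWith("-") => acc + " \\\n " + cmd
      case (acc, cmd) => acc + " " + cmd
    }
    println(rendered)
  }
}
```

Running this prints each `-`-prefixed option on its own backslash-continued line, with positional arguments (such as the trailing assembly jar) appended inline, matching the output shown above.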
1 parent 4cc0b51 commit c393419

File tree

1 file changed (+9, -2 lines)


sql/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/test/RemoteSparkSession.scala

Lines changed: 9 additions & 2 deletions
```diff
@@ -71,15 +71,22 @@ object SparkConnectServerUtils {
       findJar("sql/catalyst", "spark-catalyst", "spark-catalyst", test = true).getCanonicalPath
 
     val command = Seq.newBuilder[String]
-    command += "bin/spark-submit"
+    command += s"$sparkHome/bin/spark-submit"
     command += "--driver-class-path" += connectJar
     command += "--class" += "org.apache.spark.sql.connect.SimpleSparkConnectService"
     command += "--jars" += catalystTestJar
     command += "--conf" += s"spark.connect.grpc.binding.port=$port"
     command ++= testConfigs
     command ++= debugConfigs
     command += connectJar
-    val builder = new ProcessBuilder(command.result(): _*)
+    val cmds = command.result()
+    debug {
+      cmds.reduce[String] {
+        case (acc, cmd) if cmd startsWith "-" => acc + " \\\n " + cmd
+        case (acc, cmd) => acc + " " + cmd
+      }
+    }
+    val builder = new ProcessBuilder(cmds: _*)
     builder.directory(new File(sparkHome))
     val environment = builder.environment()
     environment.remove("SPARK_DIST_CLASSPATH")
```