
Commit b4edafa

Authored by jsnowacki and committed by HyukjinKwon
[SPARK-22495] Fix setup of SPARK_HOME variable on Windows
## What changes were proposed in this pull request?

This fixes how `SPARK_HOME` is resolved on Windows. While the previous version worked with the built release download, the directory layout changed slightly for the PySpark `pip` or `conda` install. This was reflected in the Linux scripts in `bin`, but not in the Windows `cmd` files.

The first fix improves how the `jars` directory is found, as this was stopping the Windows `pip/conda` install from working; JARs were not found during Session/Context setup.

The second fix adds a `find-spark-home.cmd` script which, like the Linux version, uses the `find_spark_home.py` script to resolve `SPARK_HOME`. It is based on the `find-spark-home` bash script, though some operations are done in a different order due to the limitations of the `cmd` script language. If the `SPARK_HOME` environment variable is already set, the Python script `find_spark_home.py` is not run. The process can fail if Python is not installed, but this code path is mostly taken when PySpark was installed via `pip/conda`, in which case some Python is present on the system.

## How was this patch tested?

Tested on a local installation.

Author: Jakub Nowacki <[email protected]>

Closes #19370 from jsnowacki/fix_spark_cmds.
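The resolution order the fix implements can be sketched in Python. This is illustrative only: the real `find_spark_home.py` also probes pip/conda install locations, which this sketch omits.

```python
import os
import sys

def find_spark_home(env):
    """Sketch of the resolution order: an explicitly set SPARK_HOME
    always wins; otherwise fall back to the parent of the launcher
    script's directory, mirroring the cmd fallback
    `set SPARK_HOME=%~dp0..`. (Illustrative only -- the real
    find_spark_home.py also probes pip/conda install locations.)"""
    if env.get("SPARK_HOME"):
        return env["SPARK_HOME"]
    script_dir = os.path.dirname(os.path.abspath(sys.argv[0]))
    return os.path.abspath(os.path.join(script_dir, os.pardir))
```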
1 parent 1edb317 · commit b4edafa

File tree

7 files changed: +70 −5 lines changed

appveyor.yml

Lines changed: 1 addition & 0 deletions

```diff
@@ -33,6 +33,7 @@ only_commits:
   - core/src/main/scala/org/apache/spark/api/r/
   - mllib/src/main/scala/org/apache/spark/ml/r/
   - core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
+  - bin/*.cmd

 cache:
   - C:\Users\appveyor\.m2
```
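The new `bin/*.cmd` trigger path can be checked with Python's `fnmatch` (an illustrative check with hypothetical file paths; AppVeyor's own glob matcher may differ in detail):

```python
from fnmatch import fnmatch

# Hypothetical changed paths; only the Windows launcher scripts should
# match the new `bin/*.cmd` build trigger.
changed = [
    "bin/pyspark2.cmd",
    "bin/find-spark-home.cmd",
    "bin/spark-shell",
    "core/pom.xml",
]
matched = [p for p in changed if fnmatch(p, "bin/*.cmd")]
```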

bin/find-spark-home.cmd

Lines changed: 60 additions & 0 deletions

```bat
@echo off

rem
rem Licensed to the Apache Software Foundation (ASF) under one or more
rem contributor license agreements. See the NOTICE file distributed with
rem this work for additional information regarding copyright ownership.
rem The ASF licenses this file to You under the Apache License, Version 2.0
rem (the "License"); you may not use this file except in compliance with
rem the License. You may obtain a copy of the License at
rem
rem    http://www.apache.org/licenses/LICENSE-2.0
rem
rem Unless required by applicable law or agreed to in writing, software
rem distributed under the License is distributed on an "AS IS" BASIS,
rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
rem See the License for the specific language governing permissions and
rem limitations under the License.
rem

rem Path to the Python script that finds SPARK_HOME
set FIND_SPARK_HOME_PYTHON_SCRIPT=%~dp0find_spark_home.py

rem Default to the standard python interpreter unless told otherwise
set PYTHON_RUNNER=python
rem If PYSPARK_DRIVER_PYTHON is set, it overrides the python version
if not "x%PYSPARK_DRIVER_PYTHON%"=="x" (
  set PYTHON_RUNNER=%PYSPARK_DRIVER_PYTHON%
)
rem If PYSPARK_PYTHON is set, it overrides the python version
if not "x%PYSPARK_PYTHON%"=="x" (
  set PYTHON_RUNNER=%PYSPARK_PYTHON%
)

rem If no Python is installed, fall back to the script's parent dir as SPARK_HOME
where %PYTHON_RUNNER% > nul 2>&1
if %ERRORLEVEL% neq 0 (
  if not exist %PYTHON_RUNNER% (
    if "x%SPARK_HOME%"=="x" (
      echo Missing Python executable '%PYTHON_RUNNER%', defaulting to '%~dp0..' for SPARK_HOME ^
environment variable. Please install Python or specify the correct Python executable in ^
PYSPARK_DRIVER_PYTHON or PYSPARK_PYTHON environment variable to detect SPARK_HOME safely.
      set SPARK_HOME=%~dp0..
    )
  )
)

rem Only attempt to find SPARK_HOME if it is not set.
if "x%SPARK_HOME%"=="x" (
  if not exist "%FIND_SPARK_HOME_PYTHON_SCRIPT%" (
    rem If we are not in the same directory as find_spark_home.py, we are not pip
    rem installed, so we don't need to search the different Python directories for
    rem a Spark installation.
    rem Note that, if the user has pip installed PySpark but is directly calling
    rem pyspark-shell or spark-submit in another directory, we want to use that
    rem version of PySpark rather than the pip installed version of PySpark.
    set SPARK_HOME=%~dp0..
  ) else (
    rem We are pip installed; use the Python script to resolve a reasonable SPARK_HOME
    for /f "delims=" %%i in ('%PYTHON_RUNNER% %FIND_SPARK_HOME_PYTHON_SCRIPT%') do set SPARK_HOME=%%i
  )
)
```
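The interpreter-selection precedence in the script above can be mirrored in Python. Note that because `PYSPARK_PYTHON` is checked last, it wins over `PYSPARK_DRIVER_PYTHON` when both are set (a sketch, not part of the patch):

```python
def pick_python_runner(env):
    """Mirror find-spark-home.cmd's interpreter selection: start from
    plain `python`, let PYSPARK_DRIVER_PYTHON override it, then let
    PYSPARK_PYTHON override that (so it wins when both are set)."""
    runner = "python"
    if env.get("PYSPARK_DRIVER_PYTHON"):
        runner = env["PYSPARK_DRIVER_PYTHON"]
    if env.get("PYSPARK_PYTHON"):
        runner = env["PYSPARK_PYTHON"]
    return runner
```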

bin/pyspark2.cmd

Lines changed: 1 addition & 1 deletion

```diff
@@ -18,7 +18,7 @@ rem limitations under the License.
 rem

 rem Figure out where the Spark framework is installed
-set SPARK_HOME=%~dp0..
+call "%~dp0find-spark-home.cmd"

 call "%SPARK_HOME%\bin\load-spark-env.cmd"
 set _SPARK_CMD_USAGE=Usage: bin\pyspark.cmd [options]
```
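The wrappers use `call` rather than launching `find-spark-home.cmd` as a separate process, so the `SPARK_HOME` it sets survives into the caller. A POSIX-shell analogue of that distinction (hypothetical file names; cmd's `call` behaves like sourcing here):

```shell
tmpdir=$(mktemp -d)
# Hypothetical helper that sets a variable, like find-spark-home.cmd does.
printf 'SPARK_HOME=/opt/spark\n' > "$tmpdir/find-home.sh"

sh "$tmpdir/find-home.sh"        # child process: the assignment is lost on return
echo "after child:  '${SPARK_HOME:-}'"

. "$tmpdir/find-home.sh"         # sourcing (the cmd `call` analogue): it sticks
echo "after source: '$SPARK_HOME'"

rm -rf "$tmpdir"
```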

bin/run-example.cmd

Lines changed: 3 additions & 1 deletion

```diff
@@ -17,7 +17,9 @@ rem See the License for the specific language governing permissions and
 rem limitations under the License.
 rem

-set SPARK_HOME=%~dp0..
+rem Figure out where the Spark framework is installed
+call "%~dp0find-spark-home.cmd"
+
 set _SPARK_CMD_USAGE=Usage: ./bin/run-example [options] example-class [example args]

 rem The outermost quotes are used to prevent Windows command line parse error
```

bin/spark-class2.cmd

Lines changed: 1 addition & 1 deletion

```diff
@@ -18,7 +18,7 @@ rem limitations under the License.
 rem

 rem Figure out where the Spark framework is installed
-set SPARK_HOME=%~dp0..
+call "%~dp0find-spark-home.cmd"

 call "%SPARK_HOME%\bin\load-spark-env.cmd"

```
bin/spark-shell2.cmd

Lines changed: 3 additions & 1 deletion

```diff
@@ -17,7 +17,9 @@ rem See the License for the specific language governing permissions and
 rem limitations under the License.
 rem

-set SPARK_HOME=%~dp0..
+rem Figure out where the Spark framework is installed
+call "%~dp0find-spark-home.cmd"
+
 set _SPARK_CMD_USAGE=Usage: .\bin\spark-shell.cmd [options]

 rem SPARK-4161: scala does not assume use of the java classpath,
```

bin/sparkR2.cmd

Lines changed: 1 addition & 1 deletion

```diff
@@ -18,7 +18,7 @@ rem limitations under the License.
 rem

 rem Figure out where the Spark framework is installed
-set SPARK_HOME=%~dp0..
+call "%~dp0find-spark-home.cmd"

 call "%SPARK_HOME%\bin\load-spark-env.cmd"

```
