pyrasterframes/src/main/python/docs/getting-started.pymd
Lines changed: 7 additions & 44 deletions
@@ -59,30 +59,20 @@ You can also use RasterFrames in the following environments:
See [RasterFrames Notebook README](https://github.com/locationtech/rasterframes/blob/develop/rf-notebook/README.md) for instructions on building the Docker image for this Jupyter notebook server.
-### `pyspark` shell or app
+### `pyspark` shell
-You can use RasterFrames in a `pyspark` shell or when submitting a `pyspark` app via a Python script. To set up the `pyspark` environment, prepare your call with the appropriate `--master` and other `--conf` arguments for your cluster manager and environment. To these you will add the PyRasterFrames assembly JAR and the Python source zip.
+You can use RasterFrames in a `pyspark` shell. To set up the `pyspark` environment, prepare your call with the appropriate `--master` and other `--conf` arguments for your cluster manager and environment. For RasterFrames support, you also need to pass arguments pointing to the required Java dependencies. You will also need the Python source zip, even if you have installed the package with pip. The source zip is available at: https://repo1.maven.org/maven2/org/locationtech/rasterframes/pyrasterframes_2.11/${VERSION}/pyrasterframes_2.11-${VERSION}-python.zip
-You can either [build](https://github.com/locationtech/rasterframes/blob/develop/README.md) the artifacts or download them:
-* The assembly JAR is embedded in the wheel file published on PyPI. Download the wheel from https://pypi.org/project/pyrasterframes/#files
-* The wheel file is just a [zip file with .whl extension](https://www.python.org/dev/peps/pep-0427/); you can extract the assembly JAR with a command like this: `unzip -j $PYRF_WHEEL $(zipinfo -1 $PYRF_WHEEL | grep jar)`
-
-
-#### Shell
-
-The `pyspark` shell command will look something like this, replacing the `--jars` argument with the assembly JAR and the `--py-files` argument with the source zip (not the wheel). To submit a script, add a .py file as the final argument.
+The `pyspark` shell command will look something like this.
Then in the `pyspark` shell, import the module and call `withRasterFrames` on the SparkSession.
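The download URL pattern and the `--master`, `--conf`, and `--py-files` arguments described above can be sketched in Python. This is a minimal illustration, not the project's own tooling: `source_zip_url` and `pyspark_command` are hypothetical helper names, `X.Y.Z` is a placeholder version, and the arguments pointing to the Java dependencies are deliberately left to the caller because this excerpt does not spell them out.

```python
import shlex

# Maven Central URL pattern quoted in the text above, with ${VERSION}
# turned into a Python format field.
SOURCE_ZIP_URL = (
    "https://repo1.maven.org/maven2/org/locationtech/rasterframes/"
    "pyrasterframes_2.11/{version}/pyrasterframes_2.11-{version}-python.zip"
)


def source_zip_url(version):
    """Return the download URL of the Python source zip for a release."""
    return SOURCE_ZIP_URL.format(version=version)


def pyspark_command(master, py_zip, java_dep_args=(), conf=None):
    """Assemble a `pyspark` argument list (hypothetical helper).

    java_dep_args is passed through untouched: the flags that point to the
    Java dependencies are an assumption left to the caller.
    """
    args = ["pyspark", "--master", master, "--py-files", py_zip]
    args.extend(java_dep_args)
    # Emit one --conf key=value pair per setting, in a stable order.
    for key, value in sorted((conf or {}).items()):
        args.extend(["--conf", "{}={}".format(key, value)])
    return args


if __name__ == "__main__":
    version = "X.Y.Z"  # placeholder, substitute a real release version
    print(source_zip_url(version))
    command = pyspark_command(
        "local[*]",
        "pyrasterframes_2.11-{}-python.zip".format(version),
        conf={"spark.executor.memory": "2g"},
    )
    # Quote each argument so the line can be pasted into a shell.
    print(" ".join(shlex.quote(arg) for arg in command))
```

Building the command as a list keeps each argument separate, so values such as `local[*]` are only quoted at the point where the command is printed for a shell.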
@@ -104,33 +94,6 @@ SparkSession available as 'spark'.
Now you have the configured SparkSession with RasterFrames enabled.
-#### Submitting Apps
-
-Prepare the call to `spark-submit` in much the same way as using the `pyspark` shell. In the Python script you submit, you will use the SparkSession builder pattern and add some RasterFrames extras to it. You have more flexibility in setting up configurations in either your script or in the `spark-submit` call.
GDAL provides a wide variety of drivers to read data from many different raster formats. If GDAL is installed in the environment, RasterFrames will be able to @ref:[read](raster-read.md) those formats. If you are using the @ref:[Jupyter Notebook image](getting-started.md#jupyter-notebook), GDAL is already installed for you. Otherwise follow the instructions below.
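Before installing anything, it can help to check whether GDAL is already present in the environment. The sketch below assumes that finding GDAL's `gdalinfo` command-line tool on the `PATH` is a reasonable first indicator. It does not confirm that RasterFrames can load GDAL's native libraries, and `gdal_cli_available` is a hypothetical helper name.

```python
import shutil


def gdal_cli_available(tool="gdalinfo"):
    """Return True if the named GDAL command-line tool is found on PATH."""
    return shutil.which(tool) is not None


if __name__ == "__main__":
    if gdal_cli_available():
        print("GDAL command-line tools found on PATH")
    else:
        print("GDAL command-line tools not found on PATH")
```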