pyrasterframes/src/main/python/docs/getting-started.pymd
Lines changed: 50 additions & 10 deletions
@@ -3,7 +3,7 @@
3
3
There are @ref:[several ways](getting-started.md#other-options) to use RasterFrames, and @ref:[several languages](languages.md) with which you can use it. Let's start with the simplest: the Python shell. To get started you will need:
4
4
5
5
1. [Python](https://www.python.org/) installed. Version 3.6 or greater is recommended.
6
-
1. `pip` or `pip3` (recommended) installed. If you are using Python 3, `pip3` may already be installed.
6
+
1. [`pip`](https://pip.pypa.io/en/stable/installing/) installed. If you are using Python 3, `pip` may already be installed.
7
7
1. Java [JDK 8](https://openjdk.java.net/install/index.html) installed on your system and `java` on your system `PATH` or `JAVA_HOME` pointing to a Java installation.
This example is extended in the [getting started Jupyter notebook](https://nbviewer.jupyter.org/github/locationtech/rasterframes/blob/develop/rf-notebook/src/main/notebooks/Getting%20Started.ipynb).
37
36
37
+
## Next Steps
38
+
38
39
To understand more about how and why RasterFrames represents Earth observation in DataFrames, read about the @ref:[core concepts](concepts.md) and the project @ref:[description](description.md). For more hands-on examples, see the chapters about @ref:[reading](raster-io.md) and @ref:[processing](raster-processing.md) with RasterFrames.
39
40
40
41
## Other Options
@@ -60,35 +61,74 @@ See [RasterFrames Notebook README](https://github.com/locationtech/rasterframes/
60
61
61
62
### `pyspark` shell or app
62
63
63
-
To initialize RasterFrames in a `pyspark` shell, prepare to call pyspark with the appropriate `--master` and other `--conf` arguments for your cluster manager and environment. To these you will add the PyRasterFrames assembly JAR and the python source zip.
64
+
You can use RasterFrames in a `pyspark` shell or when submitting a `pyspark` app via a Python script. To set up the `pyspark` environment, prepare your call with the appropriate `--master` and other `--conf` arguments for your cluster manager and environment. To these, add the PyRasterFrames assembly JAR and the Python source zip.
64
65
65
66
You can either [build](https://github.com/locationtech/rasterframes/blob/develop/README.md) the artifacts or download them:
* The assembly JAR is embedded in the wheel file published on PyPI. Download the wheel from https://pypi.org/project/pyrasterframes/#files
71
+
* The wheel file is just a [zip file with .whl extension](https://www.python.org/dev/peps/pep-0427/); you can extract the assembly JAR with a command like this: `unzip -j $PYRF_WHEEL $(zipinfo -1 $PYRF_WHEEL | grep jar)`
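
The `unzip` one-liner above can also be approximated in Python with the standard library `zipfile` module, which may help on systems without `unzip` or `zipinfo`. This is a minimal sketch, not part of RasterFrames: the helper name `extract_jars` is hypothetical, and it matches members ending in `.jar`, which is slightly stricter than `grep jar`.

```python
import zipfile
from pathlib import Path


def extract_jars(wheel_path, dest_dir="."):
    """Copy every .jar member of a wheel into dest_dir, dropping archive subdirectories.

    Hypothetical helper: mirrors `unzip -j`, which discards the paths stored in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(wheel_path) as wheel:
        for member in wheel.namelist():
            if member.endswith(".jar"):
                # Keep only the file name, not the directory inside the wheel.
                target = dest / Path(member).name
                target.write_bytes(wheel.read(member))
                extracted.append(target)
    return extracted
```

Calling `extract_jars(path_to_wheel)` with the path of the downloaded wheel returns the paths of the extracted JAR files, which can then be passed to `--jars`.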
72
+
73
+
74
+
#### Shell
69
75
76
+
The `pyspark` shell command will look something like this, with the `--jars` argument set to the assembly JAR and the `--py-files` argument set to the source zip (not the wheel). To submit a script, add a .py file as the final argument.
You now have a configured SparkSession with RasterFrames enabled.
89
106
90
-
```python, echo=False
91
-
spark.stop()
107
+
#### Submitting Apps
108
+
109
+
Prepare the call to `spark-submit` in much the same way as for the `pyspark` shell. In the Python script you submit, you will use the SparkSession builder pattern and add some RasterFrames extras to it. You have more flexibility in setting up configurations, which can go in either your script or the `spark-submit` call.
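
As a rough illustration of how such a call might be assembled, the sketch below builds a `spark-submit` argument list in Python without running it. The helper `spark_submit_command` and all file names are hypothetical placeholders, and the `--conf` entry is only an example; substitute the artifacts and settings for your own environment.

```python
import shlex


def spark_submit_command(script, assembly_jar, python_zip, master="local[*]", conf=None):
    """Build the argument list for a spark-submit call; nothing is executed here."""
    cmd = ["spark-submit", "--master", master]
    # Each configuration entry becomes its own `--conf key=value` pair.
    for key, value in (conf or {}).items():
        cmd += ["--conf", "{}={}".format(key, value)]
    # The assembly JAR and the Python source zip (not the wheel), then the script last.
    cmd += ["--jars", assembly_jar, "--py-files", python_zip, script]
    return cmd


# Placeholder file names, assumed for illustration only.
cmd = spark_submit_command(
    "my_app.py",
    "pyrasterframes-assembly.jar",
    "pyrasterframes-python.zip",
    conf={"spark.serializer": "org.apache.spark.serializer.KryoSerializer"},
)
# Quote each argument so the printed line can be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the command as a list makes it straightforward to pass to `subprocess.run` later, and settings placed in `conf` here could equally be set on the SparkSession builder inside the script.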