Commit 76e494d

Merge pull request #241 from s22s/docs/artifact-links
Add prospective links to getting started docs
2 parents 371e4c8 + b975828 commit 76e494d

4 files changed: 212 additions and 879 deletions

pyrasterframes/README.md

Lines changed: 7 additions & 1 deletion
@@ -153,7 +153,13 @@ sbt 'pySetup test --addopts "-k test_tile_creation"'
153153
Or to build a specific document:
154154

155155
```bash
156-
sbt 'pySetup pweave -f docs/raster-io.pymd'
156+
sbt 'pySetup pweave -s docs/raster-io.pymd'
157+
```
158+
159+
Or to build a specific document with a desired output format:
160+
161+
```bash
162+
sbt 'pySetup pweave -f notebook -s docs/numpy-pandas.pymd'
157163
```
158164

159165
*Note: You may need to run `sbt pyrasterframes/package` at least once for certain `pySetup` commands to work.*

pyrasterframes/src/main/python/docs/getting-started.pymd

Lines changed: 38 additions & 20 deletions
@@ -1,6 +1,10 @@
11
# Getting Started
22

3-
There are @ref:[several ways](getting-started.md#other-options) to use RasterFrames, and @ref:[several languages](languages.md) with which you can use it. Let's start with the simplest: the Python shell. Python 3.6 or greater is recommended.
3+
There are @ref:[several ways](getting-started.md#other-options) to use RasterFrames, and @ref:[several languages](languages.md) with which you can use it. Let's start with the simplest: the Python shell. To get started you will need:
4+
5+
1. [Python](https://www.python.org/) installed. Version 3.6 or greater is recommended.
6+
1. [`pip`](https://pip.pypa.io/en/stable/installing/) installed. If you are using Python 3, `pip` may already be installed.
7+
1. Java [JDK 8](https://openjdk.java.net/install/index.html) installed on your system, with either `java` on your system `PATH` or `JAVA_HOME` pointing to a Java installation.
48

59
## pip install pyrasterframes
610

@@ -26,13 +30,13 @@ df = spark.read.raster('https://modis-pds.s3.amazonaws.com/MCD43A4.006/11/08/201
2630

2731
# Add 3 element-wise, show some rows of the dataframe
2832
df.select(rf_local_add(df.proj_raster, lit(3))).show(5, False)
29-
3033
```
3134

3235
This example is extended in the [getting started Jupyter notebook](https://nbviewer.jupyter.org/github/locationtech/rasterframes/blob/develop/rf-notebook/src/main/notebooks/Getting%20Started.ipynb).
3336

34-
To understand more about how and why RasterFrames represents Earth observation in DataFrames, read the project @ref:[description](description.md). For more hands-on examples, see the chapters about @ref:[reading](raster-io.md) and @ref:[processing](raster-processing.md) with RasterFrames.
37+
## Next Steps
3538

39+
To understand more about how and why RasterFrames represents Earth observation in DataFrames, read about the @ref:[core concepts](concepts.md) and the project @ref:[description](description.md). For more hands-on examples, see the chapters about @ref:[reading](raster-io.md) and @ref:[processing](raster-processing.md) with RasterFrames.
3640

3741
## Other Options
3842

@@ -43,39 +47,53 @@ You can also use RasterFrames in the following environments:
4347

4448
### Jupyter Notebook
4549

46-
**TODO** User facing quick instructions e.g. how to pull and run docker hub hosted container
50+
RasterFrames provides a Docker image for a Jupyter notebook server whose default kernel is already set up for running RasterFrames. To use it:
51+
52+
1. Install [docker](https://docs.docker.com/install/)
53+
1. Pull the image: `docker pull s22s/rasterframes-notebook`
54+
1. Run a container with the image, for example:
55+
56+
docker run -p 8808:8888 -p 44040:4040 -v /path/to/notebooks:/home/notebooks s22s/rasterframes-notebook:latest
4757

48-
See [RasterFrames Notebook README](https://github.com/locationtech/rasterframes/blob/develop/rf-notebook/README.md) for instructions on running a Jupyter notebook server within a Docker container that has a fully set up environment.
58+
1. In a browser, open `localhost:8808`, the host port that the example above maps to the notebook server.
4959

50-
### `pyspark` shell or app
60+
See [RasterFrames Notebook README](https://github.com/locationtech/rasterframes/blob/develop/rf-notebook/README.md) for instructions on building the Docker image for this Jupyter notebook server.
5161

52-
To initialize RasterFrames in a `pyspark` shell, prepare to call pyspark with the appropriate `--master` and other `--conf` arguments for your cluster manager and environment. To these you will add the PyRasterFrames assembly JAR and the python source zip.
62+
### `pyspark` shell
5363

54-
**TODO** how to build or download those artifacts.
64+
You can use RasterFrames in a `pyspark` shell. To set up the `pyspark` environment, prepare your call with the appropriate `--master` and other `--conf` arguments for your cluster manager and environment. For RasterFrames support, you also need to pass arguments pointing to the various Java dependencies. You will need the Python source zip as well, even if you have installed the package with `pip`. You can download the source zip from https://repo1.maven.org/maven2/org/locationtech/rasterframes/pyrasterframes_2.11/${VERSION}/pyrasterframes_2.11-${VERSION}-python.zip, substituting the release you want for `${VERSION}`.
65+
66+
The `pyspark` shell command will look something like this. The Kryo `--conf` settings improve serialization performance.
5567

5668
```bash
5769
pyspark \
58-
--conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
70+
--master local[*] \
71+
--py-files pyrasterframes_2.11-${VERSION}-python.zip \
72+
--packages org.locationtech.rasterframes:rasterframes_2.11:${VERSION},org.locationtech.rasterframes:pyrasterframes_2.11:${VERSION},org.locationtech.rasterframes:rasterframes-datasource_2.11:${VERSION} \
73+
--conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
5974
--conf spark.kryo.registrator=org.locationtech.rasterframes.util.RFKryoRegistrator \
60-
--conf spark.kryoserializer.buffer.max=500m \
61-
--jars pyrasterframes/target/scala-2.11/pyrasterframes-assembly-${VERSION}.jar \
62-
--py-files pyrasterframes/target/scala-2.11/pyrasterframes-python-${VERSION}.zip
75+
--conf spark.kryoserializer.buffer.max=500m
6376
```
6477

65-
Then in the pyspark shell, import the module and call `withRasterFrames` on the SparkSession.
78+
Then in the `pyspark` shell, import the module and call `withRasterFrames` on the SparkSession.
6679

6780
```python, evaluate=False
68-
import pyrasterframes
69-
spark = spark.withRasterFrames()
70-
df = spark.read.raster('https://landsat-pds.s3.amazonaws.com/c1/L8/158/072/LC08_L1TP_158072_20180515_20180604_01_T1/LC08_L1TP_158072_20180515_20180604_01_T1_B5.TIF')
81+
Welcome to
82+
____ __
83+
/ __/__ ___ _____/ /__
84+
_\ \/ _ \/ _ `/ __/ '_/
85+
/__ / .__/\_,_/_/ /_/\_\ version 2.3.2
86+
/_/
87+
88+
Using Python version 3.7.3 (default, Mar 27 2019 15:43:19)
89+
SparkSession available as 'spark'.
90+
>>> import pyrasterframes
91+
>>> spark = spark.withRasterFrames()
92+
>>> df = spark.read.raster('https://landsat-pds.s3.amazonaws.com/c1/L8/158/072/LC08_L1TP_158072_20180515_20180604_01_T1/LC08_L1TP_158072_20180515_20180604_01_T1_B5.TIF')
7193
```
7294

7395
Now you have the configured SparkSession with RasterFrames enabled.
7496

75-
```python, echo=False
76-
spark.stop()
77-
```
78-
7997
## Installing GDAL
8098

8199
GDAL provides a wide variety of drivers to read data from many different raster formats. If GDAL is installed in the environment, RasterFrames will be able to @ref:[read](raster-read.md) those formats. If you are using the @ref:[Jupyter Notebook image](getting-started.md#jupyter-notebook), GDAL is already installed for you. Otherwise follow the instructions below. Version 2.4.1 or greater is required.
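
Review note on the `pyspark` shell hunk above: the new command repeats `${VERSION}` in the zip file name and in three Maven coordinates, which is easy to get out of sync when typed by hand. The following is a minimal sketch of how the arguments could be generated from a single version string. It is not part of this commit or of pyrasterframes: the function names `python_zip_name`, `python_zip_url`, and `pyspark_args` are hypothetical, and the `X.Y.Z` version is a placeholder. The coordinates, URL pattern, and `--conf` values are copied from the diff.

```python
import shlex

# Values taken from the documentation added in this commit.
MAVEN_BASE = "https://repo1.maven.org/maven2/org/locationtech/rasterframes"
GROUP = "org.locationtech.rasterframes"
SCALA_VERSION = "2.11"
ARTIFACTS = ("rasterframes", "pyrasterframes", "rasterframes-datasource")


def python_zip_name(version):
    """File name of the Python source zip (hypothetical helper)."""
    return f"pyrasterframes_{SCALA_VERSION}-{version}-python.zip"


def python_zip_url(version):
    """Download URL of the Python source zip, following the pattern in the docs."""
    return (
        f"{MAVEN_BASE}/pyrasterframes_{SCALA_VERSION}/{version}/"
        f"{python_zip_name(version)}"
    )


def pyspark_args(version, master="local[*]"):
    """Argument list for the pyspark shell command shown in the docs."""
    packages = ",".join(
        f"{GROUP}:{artifact}_{SCALA_VERSION}:{version}" for artifact in ARTIFACTS
    )
    return [
        "pyspark",
        "--master", master,
        "--py-files", python_zip_name(version),
        "--packages", packages,
        "--conf", "spark.serializer=org.apache.spark.serializer.KryoSerializer",
        "--conf", "spark.kryo.registrator=org.locationtech.rasterframes.util.RFKryoRegistrator",
        "--conf", "spark.kryoserializer.buffer.max=500m",
    ]


if __name__ == "__main__":
    version = "X.Y.Z"  # placeholder: substitute a real release
    print(python_zip_url(version))
    # Quote each argument so the printed line can be pasted into a shell.
    print(" ".join(shlex.quote(arg) for arg in pyspark_args(version)))
```

Printing a quoted single-line command also sidesteps line-continuation mistakes, since no trailing backslashes are needed.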

rf-notebook/src/main/notebooks/Getting Started.ipynb

Lines changed: 66 additions & 68 deletions
Large diffs are not rendered by default.

rf-notebook/src/main/notebooks/pretty_rendering_rf_types.tile.ipynb

Lines changed: 101 additions & 790 deletions
Large diffs are not rendered by default.
