-In this code block we are using the @ref:[`rf_tile_mean`](reference.md#rf-tile-mean) function to compute the tile aggregate mean of cells in each row of column `tile`. The mean of each tile is computed separately, so the first mean is 1.0 and the second mean is 3.0. Notice that the number of rows in the DataFrame is the same before and after the aggregation.
+In this code block, we are using the @ref:[`rf_tile_mean`](reference.md#rf-tile-mean) function to compute the tile aggregate mean of cells in each row of column `tile`. The mean of each tile is computed separately, so the first mean is 1.0 and the second mean is 3.0. Notice that the number of rows in the DataFrame is the same before and after the aggregation.
-In this code block we are using the @ref:[`rf_agg_mean`](reference.md#rf-agg-mean) function to compute the DataFrame aggregate, which averages 25 values of 1.0 and 25 values of 3.0, across the fifty cells in two rows. Note that only a single row is returned since the average is computed over the full DataFrame.
+In this code block, we are using the @ref:[`rf_agg_mean`](reference.md#rf-agg-mean) function to compute the DataFrame aggregate, which averages 25 values of 1.0 and 25 values of 3.0, across the fifty cells in two rows. Note that only a single row is returned since the average is computed over the full DataFrame.
-In this code block we are using the @ref:[`rf_agg_local_mean`](reference.md#rf-agg-local-mean) function to compute the element-wise local aggregate mean across the two rows. In this example it is computing the mean of one value of 1.0 and one value of 3.0 to arrive at the element-wise mean, but doing so twenty-five times, one for each position in the `tile`.
+In this code block, we are using the @ref:[`rf_agg_local_mean`](reference.md#rf-agg-local-mean) function to compute the element-wise local aggregate mean across the two rows. In this example it is computing the mean of one value of 1.0 and one value of 3.0 to arrive at the element-wise mean, but doing so twenty-five times, one for each position in the `tile`.
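The three kinds of aggregate described in these paragraphs can be contrasted with a small NumPy sketch. It does not call RasterFrames: the two 5x5 arrays only stand in for the `tile` column of a two-row DataFrame, and all variable names are illustrative assumptions.

```python
import numpy as np

# Two 5x5 "tiles", one filled with 1.0 and one with 3.0 (stand-ins for two DataFrame rows).
tiles = np.stack([np.full((5, 5), 1.0), np.full((5, 5), 3.0)])

# Tile aggregate (analogous to rf_tile_mean): one scalar per tile, row count unchanged.
tile_means = tiles.mean(axis=(1, 2))

# DataFrame aggregate (analogous to rf_agg_mean): one scalar over all fifty cells.
overall_mean = tiles.mean()

# Local aggregate (analogous to rf_agg_local_mean): element-wise mean across rows,
# which yields a single 5x5 tile.
local_mean = tiles.mean(axis=0)

print(tile_means)        # [1. 3.]
print(overall_mean)      # 2.0
print(local_mean.shape)  # (5, 5)
```

The axis argument is what distinguishes the three cases: reducing within each tile, over everything, or across tiles at each cell position.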
To compute an element-wise local aggregate, tiles need to have the same dimensions, as in the example below, where both tiles have 5 rows and 5 columns. If we tried to compute an element-wise local aggregate over the DataFrame without equal tile dimensions, we would get a runtime error.
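The dimension requirement can be illustrated outside of Spark with a hedged NumPy analogue. The helper `local_mean` below is hypothetical and is not part of RasterFrames; it only mimics the idea that an element-wise aggregate needs every tile to have the same shape.

```python
import numpy as np

def local_mean(tiles):
    """Element-wise mean across tiles; raises ValueError if shapes differ."""
    shapes = {t.shape for t in tiles}
    if len(shapes) != 1:
        raise ValueError(f"tiles must share dimensions, got {sorted(shapes)}")
    return np.mean(np.stack(tiles), axis=0)

# Equal dimensions: works and returns a 5x5 result.
same = local_mean([np.ones((5, 5)), np.full((5, 5), 3.0)])

# Unequal dimensions: the check fails, analogous to the runtime error described above.
try:
    local_mean([np.ones((5, 5)), np.ones((4, 5))])
    mismatch_failed = False
except ValueError:
    mismatch_failed = True

print(same.shape, mismatch_failed)
```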
1. Pull the image: `docker pull s22s/rasterframes-notebook`
1. Run a container with the image, for example:
-
-docker run -p 8808:8888 -p 44040:4040 -v /path/to/notebooks:/home/notebooks rasterframes-notebook:latest
-
+`docker run -p 8808:8888 -p 44040:4040 -v /path/to/notebooks:/home/notebooks rasterframes-notebook:latest`
1. In a browser, open `localhost:8808` in the example above.
See [RasterFrames Notebook README](https://github.com/locationtech/rasterframes/blob/develop/rf-notebook/README.md) for instructions on building the Docker image for this Jupyter notebook server.
@@ -94,7 +92,11 @@ SparkSession available as 'spark'.
Now you have the configured SparkSession with RasterFrames enabled.
-## Installing GDAL
+```python, echo=False
+spark.stop()
+```
+
+## Installing GDAL
GDAL provides a wide variety of drivers to read data from many different raster formats. If GDAL is installed in the environment, RasterFrames will be able to @ref:[read](raster-read.md) those formats. If you are using the @ref:[Jupyter Notebook image](getting-started.md#jupyter-notebook), GDAL is already installed for you. Otherwise follow the instructions below. Version 2.4.1 or greater is required.
@@ -111,7 +113,7 @@ brew install gdal
Using [`apt-get`](https://wiki.debian.org/Apt):
```bash
sudo apt-get update
sudo apt-get install gdal-bin
```
@@ -133,4 +135,4 @@ from pyrasterframes.utils import gdal_version
```python
from pyrasterframes.utils import gdal_version
print(gdal_version())
```
This will print out something like "GDAL x.y.z, released 20yy/mm/dd". If it reports `not available`, then GDAL isn't installed in a place where the RasterFrames runtime was able to find it. Please [file an issue](https://github.com/locationtech/rasterframes/issues) to get help resolving it.
pyrasterframes/src/main/python/docs/local-algebra.pymd (2 additions & 2 deletions)
@@ -51,7 +51,7 @@ RasterFrames provides a wide variety of local map algebra functions. There are s
* A function on a Tile and a scalar is a binary operation; example: @ref:[rf_local_less](reference.md#rf-local-less); or
* A function on many Tiles is an n-ary operation; example: @ref:[rf_agg_local_min](reference.md#rf-agg-local-min)
-We can express the normalized difference with a combination of `rf_local_divide`, `rf_local_subtract`, and `rf_local_add`. Since the normalized difference is so common there is a convenience method `rf_normalized_difference` which we use in this example. We will append a new column to the DataFrame, which will apply the map alegbra function to each row.
+We can express the normalized difference with a combination of `rf_local_divide`, `rf_local_subtract`, and `rf_local_add`. Since the normalized difference is so common, there is a convenience method `rf_normalized_difference`, which we use in this example. We will append a new column to the DataFrame, which will apply the map algebra function to each row.
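As a rough illustration of what the normalized difference computes per cell, here is a NumPy sketch that combines a subtraction, an addition, and a division. It is not the RasterFrames implementation; the function name, the band names `nir` and `red`, and the sample values are assumptions made for the example.

```python
import numpy as np

def normalized_difference(a, b):
    """Cell-wise (a - b) / (a + b), mirroring the divide/subtract/add combination."""
    a = np.asarray(a, dtype="float64")
    b = np.asarray(b, dtype="float64")
    return (a - b) / (a + b)

# Hypothetical 2x2 tiles for two bands.
nir = np.array([[0.5, 0.6], [0.8, 0.4]])
red = np.array([[0.1, 0.2], [0.2, 0.4]])

nd = normalized_difference(nir, red)
print(nd)
```

This sketch does not guard against cells where `a + b` is zero; real data would need a policy for that case.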
pyrasterframes/src/main/python/docs/nodata-handling.pymd (32 additions & 28 deletions)
@@ -2,9 +2,9 @@
## What is NoData?
-In raster operations, the preservation and correct processing of missing observations is very important. In [most dataframes and scientific computing](https://www.oreilly.com/learning/handling-missing-data), the idea of missing data is expressed as a `null` or `NaN` value. A great deal of raster data is stored for space efficiency. This typically leads to use of integral values and a "sentinel" value to represent missing observations. This sentinel value varies across data products and is usually called the "NoData" value.
+In raster operations, the preservation and correct processing of missing observations is very important. In [most DataFrames and scientific computing](https://www.oreilly.com/learning/handling-missing-data), the idea of missing data is expressed as a `null` or `NaN` value. A great deal of raster data is stored for space efficiency. This typically leads to use of integral values and a "sentinel" value to represent missing observations. This sentinel value varies across data products and is usually called the "NoData" value.
RasterFrames provides a variety of functions to inspect and manage NoData within `tile`s.
-In this case, the minimum value of 0 is designated as the NoData value. For integralvalued cell types, the NoData is typically zero, the maximum, or the minimum value for the underlying data type. The NoData value can also be a user-defined value. In that case the value is designated with a `ud`.
+In this case, the minimum value of 0 is designated as the NoData value. For integral-valued cell types, the NoData is typically zero, the maximum, or the minimum value for the underlying data type. The NoData value can also be a user-defined value. In that case the value is designated with a `ud`.
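To make the idea of a sentinel concrete, the following NumPy sketch lists the zero, minimum, and maximum values for two integral data types and then treats the minimum of an unsigned 8-bit array as NoData. It only illustrates the concept; it does not use RasterFrames cell types, and the sample tile is made up.

```python
import numpy as np

# Candidate sentinels for two integral types: zero, the minimum, or the maximum.
candidates = {}
for name in ("uint8", "int16"):
    info = np.iinfo(name)
    candidates[name] = {"zero": 0, "min": int(info.min), "max": int(info.max)}

# Treat the minimum of an unsigned 8-bit tile (0) as NoData.
tile = np.array([[0, 12], [7, 0]], dtype="uint8")
valid = np.ma.masked_equal(tile, candidates["uint8"]["min"])

# Statistics then skip the sentinel cells.
valid_mean = valid.mean()
print(candidates)
print(valid.count(), valid_mean)
```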
-Let's continue the example above with Sentinel-2 data. Band 2 is blue and has no defined NoData. The quality information is in a separate file called the scene classification (SCL), which delineates areas of missing data and probable clouds. For much more information on that, see the [Sentinel-2 algorithm overview](https://earth.esa.int/web/sentinel/technical-guides/sentinel-2-msi/level-2a/algorithm). Figure 3 tells us how to interpret the scene classification. For this example, we will exclude NoData, defective pixels, probable clouds, and cirrus clouds: values 0, 1, 8, 9, and 10.
+Let's continue the example above with Sentinel-2 data. Band 2 is blue and has no defined NoData. The quality information is in a separate file called the scene classification (SCL), which delineates areas of missing data and probable clouds. For more information on that, see the [Sentinel-2 algorithm overview](https://earth.esa.int/web/sentinel/technical-guides/sentinel-2-msi/level-2a/algorithm). Figure 3 tells us how to interpret the scene classification. For this example, we will exclude NoData, defective pixels, probable clouds, and cirrus clouds: values 0, 1, 8, 9, and 10.
-The first step is to create a catalog with our band of interest and the SCL band. We read the data from the catalog and now the blue band and SCL tiles are aligned across rows.
+The first step is to create a catalog with our band of interest and the SCL band. We read the data from the catalog, so the blue band and SCL tiles are aligned across rows.
-Drawing on @ref:[local map algebra](local-algebra.md) techniques, we will create a new tile column containing our indicator of unwanted pixels, as defined above.
+Drawing on @ref:[local map algebra](local-algebra.md) techniques, we will create new tile columns that are indicators of unwanted pixels, as defined above. Since the mask column is bit type, the addition is equivalent to a logical or, so the true values are 1.
-Now we will use the @ref:[`rf_mask_by_value`](reference.md#rf-mask-by-value) to designate the cloudy and other unwanted pixels as NoData in the blue column. Because there is not a NoData already defined, we will choose one. Note that in this particular example the minimum value is greater than zero, so we can use 0 as the NoData value.
+Because there is not a NoData already defined, we will choose one. In this particular example, the minimum value is greater than zero, so we can use 0 as the NoData value.
-Convert the cell type and apply the mask. Since the mask column is bit type, the addition done above was equivalent to a logical or. So the true values are 1.
+Now we will use the @ref:[`rf_mask_by_value`](reference.md#rf-mask-by-value) to designate the cloudy and other unwanted pixels as NoData in the blue column by converting the cell type and applying the mask.
-We can verify that the number of NoData cells in the resulting `blue_masked` column matches the total of the bit-type `mask` tile.
+We can verify that the number of NoData cells in the resulting `blue_masked` column matches the total of the bit-type `mask` tile to ensure our logic is correct.
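The masking workflow described in this hunk (build an indicator of unwanted SCL values, choose 0 as NoData, apply the mask, then compare counts) can be sketched with plain NumPy. This is an analogue under stated assumptions, not the RasterFrames calls: the 3x3 `blue` and `scl` arrays are invented, and a boolean array stands in for the bit-type mask tile.

```python
import numpy as np

# Hypothetical 3x3 blue band and scene classification (SCL) tiles.
blue = np.array([[120, 340, 560], [210, 80, 95], [400, 410, 33]], dtype="uint16")
scl = np.array([[4, 8, 5], [0, 4, 9], [10, 6, 1]], dtype="uint8")

# SCL values to exclude: NoData, defective pixels, probable clouds, cirrus.
unwanted = [0, 1, 8, 9, 10]

# One indicator per unwanted value, combined with a logical OR
# (the equivalent of adding bit-type tiles).
mask = np.zeros(scl.shape, dtype=bool)
for value in unwanted:
    mask |= (scl == value)

# 0 is only safe as the NoData sentinel because no valid blue cell equals 0.
nodata = 0
if blue.min() <= nodata:
    raise ValueError("0 collides with valid data; choose another NoData value")

blue_masked = np.where(mask, nodata, blue)

# Verification: the NoData count should match the total of the mask.
nodata_cells = int((blue_masked == nodata).sum())
mask_total = int(mask.sum())
print(nodata_cells, mask_total)
```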
Let's now explore how the presence of NoData affects @ref:[local map algebra](local-algebra.md) operations. To demonstrate the behaviour, let's create two tiles. One tile will have values of 0 and 1, and the other will have values of just 0.
```python
@@ -168,7 +172,7 @@ print('y')
display(y)
```
Now, let's create a new column from `x` with the value of 1 changed to NoData. Then, we will add this new column with NoData to the `y` column. As shown below, the result of the sum also has NoData (represented in white). In general for local algebra operations, Data + NoData = NoData.
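The rule Data + NoData = NoData behaves much like mask propagation in NumPy masked arrays. The sketch below uses `numpy.ma` as a stand-in for NoData handling; it is an analogy rather than RasterFrames code, and the 2x2 arrays are made up for brevity.

```python
import numpy as np

# x has values of 0 and 1; y has values of just 0.
x = np.array([[0, 1], [1, 0]])
y = np.zeros((2, 2), dtype=int)

# Turn every cell of x equal to 1 into "NoData" by masking it.
x_nodata = np.ma.masked_equal(x, 1)

# Adding a masked cell to a valid cell yields a masked cell.
total = x_nodata + y

print(total.mask)
print(total.filled(-1))  # masked cells shown as -1 for display only
```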
RasterFrames supports having Tile columns with multiple cell types in a single DataFrame. It is important to understand how these different cell types interact.
Let's first create a RasterFrame that has columns of `float` and `int` cell type.
```python
x = Tile((np.ones((100, 100))*2).astype('float'))
```
@@ -248,9 +252,9 @@ When performing a local operation between tile columns with cell types `int` and
Let's try adding the tile columns with different NoData values. When there is an inconsistent NoData value in the two columns, the NoData value of the right-hand side of the sum is kept. In this case, this means the result has a NoData value of 1.