You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Masking is a common operation in raster processing. It is setting certain cells to the @ref:[NoData value](nodata-handling.md). This is usually done to remove low-quality observations from the raster processing. Another related use case is to @ref:["clip"](masking.md#clipping) a raster to a given polygon.
17
+
18
+
In this section we will demonstrate two common schemes for masking. In Sentinel 2, there is a separate classification raster that defines low quality areas. In Landsat 8, several quality factors are measured and the indications are packed into a single integer, which we have to unpack.
17
19
18
-
## Masking Example
20
+
## Masking Sentinel 2
19
21
20
22
Let's demonstrate masking with a pair of bands of Sentinel-2 data. The measurement bands we will use, blue and green, have no defined NoData. They share quality information from a separate file called the scene classification (SCL), which delineates areas of missing data and probable clouds. For more information on this, see the [Sentinel-2 algorithm overview](https://earth.esa.int/web/sentinel/technical-guides/sentinel-2-msi/level-2a/algorithm). Figure 3 tells us how to interpret the scene classification. For this example, we will exclude NoData, defective pixels, probable clouds, and cirrus clouds: values 0, 1, 8, 9, and 10.
Because there is not a NoData already defined for the blue band, we must choose one. In this particular example, the minimum value is greater than zero, so we can use 0 as the NoData value. We will construct a new `CellType` object to represent this.
46
48
@@ -59,7 +61,7 @@ We next convert the blue band to this cell type.
Now we set cells of our `blue` column to NoData for all locations where the `scl` tile is in our set of undesirable values. This is the actual _masking_ operation.
65
67
@@ -89,7 +91,7 @@ And the original SCL data. The bright yellow is a cloudy region in the original
89
91
display(sample[1])
90
92
```
91
93
92
-
## Transferring Mask
94
+
### Transferring Mask
93
95
94
96
We can now apply the same mask from the blue column to the green column. Note here we have supressed the step of explicitly checking what a "safe" NoData value for the green band should be.
We will work with the Landsat scene [here](https://landsat-pds.s3.us-west-2.amazonaws.com/c1/L8/153/075/LC08_L1TP_153075_20190718_20190731_01_T1/index.html). For simplicity, we will just use two of the seven 30m bands. The quality mask for all bands is all contained in the `BQA` band.
Masking is described [on the Landsat Missions page](https://www.usgs.gov/land-resources/nli/landsat/landsat-collection-1-level-1-quality-assessment-band). It is pretty dense. Focus for this data set is the Collection 1 Level-1 for Landsat 8.
121
+
122
+
There are several inter-related factors to consider. In this exercise we will mask away the following.
123
+
124
+
* Designated Fill = yes
125
+
* Cloud = yes
126
+
* Cloud Shadow Confidence = Medium or High
127
+
* Cirrus Confidence = Medium or High
128
+
129
+
Note that you should consider your application and do your own exploratory analysis to determine the most appropriate mask!
130
+
131
+
According to the information on the Landsat site this translates to masking by bit values in the BQA according to the following table.
| Cloud shadow conf. | med / hi | 7-8 | 10, 11 (2, 3) |
138
+
| Cirrus conf. | med / hi | 11-12 | 10, 11 (2, 3) |
139
+
140
+
141
+
In this case, we will use the value of 0 as the NoData in the band data. Inspecting the associated [MTL txt file](https://landsat-pds.s3.us-west-2.amazonaws.com/c1/L8/153/075/LC08_L1TP_153075_20190718_20190731_01_T1/LC08_L1TP_153075_20190718_20190731_01_T1_MTL.txt) By inspection, we can discover that the minimum value in the band will be 1, thus allowing our use of 0 for NoData.
142
+
143
+
The code chunk below works through each of the rows in the table above. The first expression sets the cell type to have the selected NoData. The @ref:[`rf_mask_by_bit`](reference.md#rf-mask-by-bit) and @ref:[`rf_mask_by_bits`](reference.md#rf-mask-by-bits) functions extract the selected bit or bits from the `mask` cells and compare them to the provided values.
144
+
145
+
```python, build_l8_mask
146
+
l8_df = l8_df.withColumn('data_masked', # set to cell type that has a nodata
Clipping is the use of a polygon to determine the areas to mask in a raster. Typically the areas inside a polygon are retained and the cells outside are set to NoData. Given a geometry column on our DataFrame, we have to carry out three basic steps. First we have to ensure the vector geometry is correctly projected to the same @ref:[CRS](concepts.md#coordinate-reference-system-crs) as the raster. We'll continue with our example creating a simple polygon. Buffering a point will create an approximate circle.
168
+
Clipping is the use of a polygon to determine the areas to mask in a raster. Typically the areas inside a polygon are retained and the cells outside are set to NoData. Given a geometry column on our DataFrame, we have to carry out three basic steps. First we have to ensure the vector geometry is correctly projected to the same @ref:[CRS](concepts.md#coordinate-reference-system-crs) as the raster. We'll continue with our Sentinel 2 example, creating a simple polygon. Buffering a point will create an approximate circle.
0 commit comments