Commit 464721a

updated masking tutorial
1 parent e9a8cac commit 464721a

1 file changed

+129
-94
lines changed

docs/source/howto/masks.rst

Lines changed: 129 additions & 94 deletions
@@ -9,21 +9,15 @@ This tutorial explains how to generate and use processing masks in the FORCE Hig
99

1010
.. admonition:: Info
1111

12-
*This tutorial uses FORCE v. 3.0*
13-
14-
15-
.. important::
16-
17-
EDIT: Part of this tutorial needs updating: Option 2.
18-
12+
*This tutorial uses FORCE v. 3.7.6*
1913

2014

2115
What are processing masks?
2216
--------------------------
2317

2418
In the FORCE Higher Level Processing System, processing masks can be used to restrict processing and analysis to certain pixels of interest.
2519
The masks need to be in datacube format, i.e. they need to be raster images in the same grid as all the other data.
26-
The masks can - but dont need to - be in the same directory as the other data.
20+
The masks can - but don't need to - be in the same directory as the other data.
2721
The masks should be binary images.
2822
The pixels that have a mask value of 0 will be skipped.
2923

@@ -40,49 +34,71 @@ What is the advantage of using processing masks?
4034
As an example, when computing a tree species classification, you can speed up processing substantially if you provide a forest mask.
4135
- Processing masks decrease data volume substantially.
4236
- In the processed products, the pixels of no interest have a nodata value.
43-
As all FORCE output is compressed (unless you choose to output in ENVI format; I dont recommend to do this), the compression kicks in nicely if you have used processing masks.
37+
As all FORCE output is compressed (unless you choose to output in ENVI format, which I don't recommend), the compression kicks in nicely if you have used processing masks.
4438
You can easily decrease data volume by several factors.
4539
- Processing masks facilitate analyzing the processed data.
4640
- In the processed products, the pixels of no interest have a nodata value.
47-
Thus, you dont need to sort the pixels on your own, e.g. computing confusion matrices and classification accuracy is more straightforward to implement.
41+
Thus, you don't need to sort the pixels on your own; for example, computing confusion matrices and classification accuracy becomes more straightforward to implement.
4842

4943

5044
Generate processing masks
5145
-------------------------
5246

53-
Option 1: from shapefile to mask
54-
""""""""""""""""""""""""""""""""
47+
Option 1: from vector data to mask
48+
""""""""""""""""""""""""""""""""""
5549

56-
FORCE comes with a program to generate processing masks from a shapefile:
50+
FORCE comes with a program to generate processing masks from vector data (e.g. shapefile or geopackage):
5751

5852
.. code-block:: bash
5953
60-
force-cube
54+
force-cube -h
55+
56+
Usage: force-cube [-hvirsantobj] input-file(s)
57+
58+
optional:
59+
-h = show this help
60+
-v = show version
61+
-i = show program's purpose
62+
-r = resampling method
63+
any GDAL resampling method for raster data, e.g. cubic (default)
64+
is ignored for vector data
65+
-s = pixel resolution of cubed data, defaults to 10
66+
-a = optional attribute name for vector data. force-cube will burn these values
67+
into the output raster. default: no attribute is used; a binary mask
68+
with geometry presence (1) or absence (0) is generated
69+
-l = layer name for vector data (default: basename of input, without extension)
70+
-n = output nodate value (defaults to 255)
71+
-t = output data type (defaults to Byte; see GDAL for datatypes;
72+
but note that FORCE HLPS only understands Int16 and Byte types correctly)
73+
-o = output directory: the directory where you want to store the cubes
74+
defaults to current directory
75+
'datacube-definition.prj' needs to exist in there
76+
-b = basename of output file (without extension)
77+
defaults to the basename of the input-file
78+
cannot be used when multiple input files are given
79+
-j = number of jobs, defaults to 'as many as possible'
80+
81+
mandatory:
82+
input-file(s) = the file(s) you want to cube
83+
84+
-----
85+
see https://force-eo.readthedocs.io/en/latest/components/auxilliary/cube.html
6186
62-
Usage: force-cube input-file output-dir resample resolution
63-
input-file: the file you want to cube
64-
output-dir: the directory you want to store the cubes;
65-
datacube-definition.prj needs to exist in there
66-
resample: resampling method
67-
(1) any GDAL resampling method for raster data, e.g. cubic
68-
(2) rasterize for vector data
69-
resolution: the resolution of the cubed data
7087
7188
7289
``force-cube`` imports raster or vector data into the datacube format needed by FORCE.
7390
The output directory needs to contain a copy of the datacube definition (see datacube tutorial).
7491
75-
The ``rasterize`` resampling option rasterizes polygon vector geometries.
76-
It burns the occurence of the geometry into a raster image, i.e. it assigns the value *1* to all cells that are covered by a geometry, *0* if not.
92+
If used with vector data, the tool rasterizes the polygon vector geometries.
93+
By default, it burns the occurrence of the geometry into a raster image, i.e. it assigns the value *1* to all cells that are covered by a geometry, and *0* to all others.
7794
The resulting masks are compressed GeoTiff images.
7895
Do not worry about data volume when converting from vector to raster data, because the compression rate is extremely high.
7996
8097
In the following example, we generate a processing mask for the administrative area of Vienna, Austria.
8198
82-
8399
.. code-block:: bash
84100
85-
force-cube vienna.shp /data/Dagobah/edc/misc/mask rasterize 10
101+
force-cube -o /data/europe/mask vienna.shp
86102
87103
0...10...20...30...40...50...60...70...80...90...100 - done.
88104
0...10...20...30...40...50...60...70...80...90...100 - done.
@@ -94,70 +110,88 @@ In this example, Vienna is covered by four tiles, a cubed GeoTiff was generated
94110
95111
.. code-block:: bash
96112
97-
ls /data/Dagobah/edc/misc/mask/X*/vienna.tif
113+
ls /data/europe/mask/X*/vienna.tif
98114
99-
/data/Dagobah/edc/misc/mask/X0077_Y0058/vienna.tif
100-
/data/Dagobah/edc/misc/mask/X0077_Y0059/vienna.tif
101-
/data/Dagobah/edc/misc/mask/X0078_Y0058/vienna.tif
102-
/data/Dagobah/edc/misc/mask/X0078_Y0059/vienna.tif
115+
/data/europe/mask/X0077_Y0058/vienna.tif
116+
/data/europe/mask/X0077_Y0059/vienna.tif
117+
/data/europe/mask/X0078_Y0058/vienna.tif
118+
/data/europe/mask/X0078_Y0059/vienna.tif
103119
104120
105121
For speedy visualization, build overviews and pyramids:
106122
107123
.. code-block:: bash
108124
109-
force-mosaic /data/Dagobah/edc/misc/mask
110-
force-pyramid /data/Dagobah/edc/misc/mask/mosaic/vienna.vrt
125+
force-pyramid /data/europe/mask/X*/*.tif
126+
force-mosaic /data/europe/mask
127+
128+
computing pyramids for vienna.tif
129+
0...10...20...30...40...50...60...70...80...90...100 - done.
130+
computing pyramids for vienna.tif
131+
0...10...20...30...40...50...60...70...80...90...100 - done.
132+
computing pyramids for vienna.tif
133+
0...10...20...30...40...50...60...70...80...90...100 - done.
134+
computing pyramids for vienna.tif
135+
0...10...20...30...40...50...60...70...80...90...100 - done.
111136
112137
mosaicking vienna.tif
113138
4 chips found.
114139
115-
computing pyramids for vienna.vrt
116-
0...10...20...30...40...50...60...70...80...90...100 - done.
117-
118140
119141
.. figure:: img/tutorial-mask-vector.jpg
120142
121143
*Mask of Vienna generated from a shapefile, overlaid with the processing grid in green.*
122144
123145
124-
Option 2: from raster to mask
125-
"""""""""""""""""""""""""""""
126-
127-
As of now, FORCE does not come with a handy tool to generate masks from a raster image with continuous values (this is on my to-do list though).
128-
However, you can follow this recipe to accomplish this.
129-
130-
.. important::
131-
132-
EDIT: This tool already exists for a while, ``force-procmask``. This part of the tutorial needs updating.
133-
146+
Option 2: from raster data to mask
147+
""""""""""""""""""""""""""""""""""
134148
135-
In the example given below, our input image is a multiband continuous fields dataset, which gives the percentages of built-up land (urban), high vegetation (trees), and low vegetation (grass, agriculture).
136-
Point 1) may be skipped if the data are already in datacube format, which is the case in this example.
149+
FORCE comes with a program to generate processing masks from a raster image with continuous values:
137150
138-
1. If the data are not already in the datacube format, use ``force-cube`` to import the data (see the usage above).
139-
Use a raster resampling option to trigger the raster import, e.g. ``cubic`` (bc it's all about cubes, eh?).
151+
.. code-block:: bash
140152
141-
2. Go to the parent directory of the cubed images (this is important for the next point), and generate a list with the filenames:
153+
force-procmask -h
142154
143-
.. code-block:: bash
155+
Usage: force-procmask [-sldobj] input-basename calc-expr
144156
145-
cd /data/Jakku/germany-LC/pred
146-
ls X*/CONFIELD_MLP.tif > files.txt
157+
optional:
158+
-s = pixel resolution of cubed data, defaults to 10
159+
-l = input-layer: band number in case of multi-band input rasters,
160+
defaults to 1
161+
-d = input directory: the datacube directory
162+
defaults to current directory
163+
'datacube-definition.prj' needs to exist in there
164+
-o = output directory: the directory where you want to store the cubes
165+
defaults to current directory
166+
-b = basename of output file (without extension)
167+
defaults to the basename of the input-file,
168+
appended by '_procmask'
169+
-j = number of jobs, defaults to 'as many as possible'
147170
171+
Positional arguments:
172+
- input-basename: basename of input data
173+
- calc-expr: Calculation in gdalnumeric syntax, e.g. 'A>2500'"
174+
The input variable is 'A'
175+
For details about GDAL expressions, see
176+
https://gdal.org/programs/gdal_calc.html
148177
149-
In this example, the image covers 597 tiles:
178+
-----
179+
see https://force-eo.readthedocs.io/en/latest/components/auxilliary/procmask.html
150180
151-
.. code-block:: bash
152181
153-
wc -l files.txt
182+
In the example given below, our input image is a multiband continuous fields dataset,
183+
which gives the percentages of built-up land (urban), high vegetation (trees), and low vegetation (grass, agriculture).
154184
155-
597 files.txt
185+
.. note::
186+
If the data are not already in the datacube format, use ``force-cube`` to import the data (see the usage above).
187+
Use a raster resampling option to trigger the raster import, e.g. ``cubic`` (because it's all about cubes, eh?).
156188
189+
In our case, the data are already in datacube format, covering 597 tiles:
157190

158191
.. code-block:: bash
159192
160-
head files.txt
193+
cd /data/europe/pred
194+
ls X*/*.tif | head
161195
162196
X0052_Y0045/CONFIELD_MLP.tif
163197
X0052_Y0046/CONFIELD_MLP.tif
@@ -171,44 +205,38 @@ In this example, the image covers 597 tiles:
171205
X0053_Y0045/CONFIELD_MLP.tif
172206
173207
174-
3. Generate the masks using a command similar to the example below.
175-
The 1st part of the command uses the list from point 2), and parallely calls the command in parentheses ``"..."``.
176-
The curly braces ``{//}`` replace the input image with its dirname, i.e. with the tile ID.
177-
A directory for the tile is generated if it is not already existing.
178-
The ``gdal_calc.py`` command handles simple raster algebra.
179-
The ``-A`` and ``--A_band`` options specify the image and band on which to operate the calculation specified by ``--calc`` (in our input image, the tree percentage is in band 2).
180-
A binary image (= mask) will be generated, wherein all pixels larger than 3000 (i.e. 30%) are set to *1*.
181-
The ``--creation-option`` parameters are options that specify compression etc.
182-
The blocksize parameters should best reflect the blocksize used for the datacube (see datacube tutorial).
183-
*As said before, a tool for this will likely be implemented in a not-so-far future version of FORCE.*
208+
We generate the masks using ``force-procmask``, which internally uses ``gdal_calc.py`` for executing the raster algebra.
209+
Thus, the arithmetic expression must be given in gdalnumeric syntax, e.g. 'A>3000'.
210+
``A`` refers to our input image.
211+
If this is a multiband file, the desired band can be specified with the ``-l`` option
212+
(if not given, the first band is used).
213+
In our example input image, the tree percentage is in band 2 and the percentage values are scaled by 100 (i.e. 100% = 10000).
214+
To generate a mask with tree cover > 30%, we use the following:
184215

185216
.. code-block:: bash
186217
187-
parallel -a files.txt "mkdir -p /data/Dagobah/edc/misc/mask/{//}; gdal_calc.py -A {} --A_band=2 --outfile=/data/Dagobah/edc/misc/mask/{//}/forest-mask.tif --calc='(A>3000)' --NoDataValue=255 --type=Byte --format=GTiff --creation-option='COMPRESS=LZW' --creation-option='PREDICTOR=2' --creation-option='NUM_THREADS=ALL_CPUS' --creation-option='BIGTIFF=YES' --creation-option='BLOCKXSIZE=3000' --creation-option='BLOCKYSIZE=300'"
188-
189-
0 .. 10 .. 20 .. 30 .. 40 .. 50 .. 60 .. 70 .. 80 .. 90 .. 100 - Done
190-
0 .. 10 .. 20 .. 30 .. 40 .. 50 .. 60 .. 70 .. 80 .. 90 .. 100 - Done
191-
0 .. 10 .. 20 .. 30 .. 40 .. 50 .. 60 .. 70 .. 80 .. 90 .. 100 - Done
192-
0 .. 10 .. 20 .. 30 .. 40 .. 50 .. 60 .. 70 .. 80 .. 90 .. 100 - Done
193-
0 .. 10 .. 20 .. 30 .. 40 .. 50 .. 60 .. 70 .. 80 .. 90 .. 100 - Done
194-
0 .. 10 .. 20 .. 30 .. 40 .. 50 .. 60 .. 70 .. 80 .. 90 .. 100 - Done
195-
0 .. 10 .. 20 .. 30 .. 40 .. 50 .. 60 .. 70 .. 80 .. 90 .. 100 - Done
196-
0 .. 10 .. 20 .. 30 .. 40 .. 50 .. 60 .. 70 .. 80 .. 90 .. 100 - Done
197-
0 .. 10 .. 20 .. 30 .. 40 .. 50 .. 60 .. 70 .. 80 .. 90 .. 100 - Done
198-
0 .. 10 .. 20 .. 30 .. 40 .. 50 .. 60 .. 70 .. 80 .. 90 .. 100 - Done
199-
0 .. 10 .. 20 .. 30 .. 40 .. 50 .. 60 .. 70 .. 80 .. 90 .. 100 - Done
200-
0 .. 10 .. 20 .. 30 .. 40 .. 50 .. 60 .. 70 .. 80 .. 90 .. 100 - Done
201-
0 .. 10 .. 20 .. 30 .. 40 .. 50 .. 60 .. 70 .. 80 .. 90 .. 100 - Done
202-
0 .. 10 .. 20 .. 30 .. 40 .. 50 .. 60 .. 70 .. 80 .. 90 .. 100 - Done
203-
0 .. 10 .. 20 .. 30 .. 40 .. 50 .. 60 .. 70 .. 80 .. 90 .. 100 - Done
204-
...
218+
cd /data/europe/pred
219+
220+
force-procmask \
221+
-o /data/europe/mask \
222+
-b forest-mask \
223+
-l 2 \
224+
CONFIELD_MLP.tif \
225+
'A>3000'
226+
227+
228+
Computers / CPU cores / Max jobs to run
229+
1:local / 80 / 597
230+
231+
Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
232+
ETA: 0s Left: 0 AVG: 0.00s local:0/597/100%/0.1s
205233
206234
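
The threshold in the expression follows from the scaling of the input data (100% = 10000).
As a minimal sketch, the scaled threshold for a given percentage can be derived in the shell before it is passed to ``force-procmask``; the variable names are only illustrative and are not FORCE parameters:

.. code-block:: bash

    # illustrative variables, not FORCE parameters
    percent=30   # desired tree cover threshold in percent
    scale=100    # scaling factor of the input data (100% = 10000)

    # integer arithmetic: 30 * 100 = 3000
    threshold=$((percent * scale))

    # build the gdalnumeric expression for the calc-expr argument
    expr="A>${threshold}"
    echo "$expr"

This prints ``A>3000``, the expression used in the call above.
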
207235
We now have one cubed mask for each input image in the mask directory:
208236

209237
.. code-block:: bash
210238
211-
ls /data/Dagobah/edc/misc/mask/X*/forest-mask.tif | wc -l
239+
ls /data/europe/mask/X*/forest-mask.tif | wc -l
212240
213241
597
214242
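
To confirm that every input tile received a mask, the two counts can be compared.
This is a minimal sketch, assuming the directories from the example above; the function name ``count_tiles`` is hypothetical and not part of FORCE:

.. code-block:: bash

    # count_tiles DIR BASENAME: number of tiles in DIR that contain BASENAME
    # (hypothetical helper, not a FORCE program)
    count_tiles() {
        local n=0 f
        for f in "$1"/X*/"$2"; do
            # an unmatched pattern stays literal and does not exist
            if [ -e "$f" ]; then
                n=$((n + 1))
            fi
        done
        echo "$n"
    }

    n_in=$(count_tiles /data/europe/pred CONFIELD_MLP.tif)
    n_out=$(count_tiles /data/europe/mask forest-mask.tif)

    if [ "$n_in" -eq "$n_out" ]; then
        echo "ok: $n_out masks for $n_in input tiles"
    else
        echo "mismatch: $n_out masks for $n_in input tiles"
    fi
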
@@ -217,14 +245,21 @@ For speedy visualization, build overviews and pyramids:
217245

218246
.. code-block:: bash
219247
220-
force-mosaic /data/Dagobah/edc/misc/mask
221-
force-pyramid /data/Dagobah/edc/misc/mask/mosaic/forest-mask.vrt
248+
force-pyramid /data/europe/mask/X*/forest-mask.tif
249+
force-mosaic /data/europe/mask
250+
251+
computing pyramids for forest-mask.tif
252+
0...10...20...30...40...50...60...70...80...90...100 - done.
253+
computing pyramids for forest-mask.tif
254+
0...10...20...30...40...50...60...70...80...90...100 - done.
255+
computing pyramids for forest-mask.tif
256+
0...10...20...30...40...50...60...70...80...90...100 - done.
257+
computing pyramids for forest-mask.tif
258+
0...10...20...30...40...50...60...70...80...90...100 - done.
259+
...
222260
223261
mosaicking forest-mask.tif
224262
597 chips found.
225-
226-
computing pyramids for forest-mask.vrt
227-
0...10...20...30...40...50...60...70...80...90...100 - done.
228263
229264
230265
.. figure:: img/tutorial-mask-raster.jpg
@@ -241,7 +276,7 @@ To use the Vienna mask from above:
241276

242277
.. code-block:: bash
243278
244-
DIR_MASK = /data/Dagobah/edc/misc/mask
279+
DIR_MASK = /data/europe/mask
245280
BASE_MASK = vienna.tif
246281
247282
