Skip to content

Commit d253b70

Browse files
authored
read/write/resample: improved support for GDAL complex file (#1000)
+ `readfile`: - `read_gdal()`: add `.cos` to the know list (for TerraSAR-X SLC data) - `read_binary_file()`: support returning complex data using `datasetName=complex`. - `read_binary_file()`: move `cpx_band` checking to expand its applicability from isce2 to generic files, including GDAL complex files. + `resample`: adjust size calculation in block-by-block IO for complex data type + `save_gdal`: - refactor `array2raster()` to `write_gdal()`, to be consistent with `utils.writefile.write()` in terms of input args. - use `readfile.DATA_TYPE_NUMPY2GDAL` to support non-float32 types, such as complex64 + docs: - update Yunjun's contact email from Gmail to Outlook - README: simplify styles for Yunjun et al. (2019) - CONTRIBUTING: add links to the to-do list and roadmap
1 parent adad64f commit d253b70

File tree

10 files changed

+110
-87
lines changed

10 files changed

+110
-87
lines changed

docs/CODE_OF_CONDUCT.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -55,7 +55,7 @@ further defined and clarified by project maintainers.
5555
## Enforcement
5656

5757
Instances of abusive, harassing, or otherwise unacceptable behavior may be
58-
reported by contacting the project team at `yunjunzgeo at gmail dot com` or `hersh.fattahi at gmail dot com`. All
58+
reported by contacting the project team at `yunjunz at outlook dot com` or `hersh.fattahi at gmail dot com`. All
5959
complaints will be reviewed and investigated and will result in a response that
6060
is deemed necessary and appropriate to the circumstances. The project team is
6161
obligated to maintain confidentiality with regard to the reporter of an incident.

docs/CONTRIBUTING.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -108,3 +108,7 @@ It takes about 15 mins to finish.
108108
## Things you should NOT do ##
109109

110110
(For anyone with push rights to github.com/insarlab/MintPy) Never modify a commit or the history of anything that has been committed to the `main` branch.
111+
112+
## Looking for ideas to contribute? ##
113+
114+
Feel ready and look for something to try? Check our [to-do list](https://github.com/insarlab/MintPy/wiki/To-do-list) and [roadmap](https://github.com/insarlab/MintPy/wiki/version-2-roadmap) for the next major version 2.0!

docs/README.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@
66
[![Version](https://img.shields.io/github/v/release/insarlab/MintPy?color=yellow&label=version&style=flat-square)](https://github.com/insarlab/MintPy/releases)
77
[![Forum](https://img.shields.io/badge/forum-Google%20Groups-orange.svg?style=flat-square)](https://groups.google.com/g/mintpy)
88
[![License](https://img.shields.io/badge/license-GPLv3+-blue.svg?style=flat-square)](https://github.com/insarlab/MintPy/blob/main/LICENSE)
9-
[![Citation](https://img.shields.io/badge/DOI-10.1016%2Fj.cageo.2019.104331-blue?style=flat-square)](https://doi.org/10.1016/j.cageo.2019.104331)
9+
[![Citation](https://img.shields.io/badge/doi-10.1016%2Fj.cageo.2019.104331-blue?style=flat-square)](https://doi.org/10.1016/j.cageo.2019.104331)
1010

1111
## MintPy ##
1212

@@ -110,6 +110,6 @@ _This disclaimer was adapted from the [MetPy project](https://github.com/Unidata
110110

111111
### 6. Citing this work ###
112112

113-
Yunjun, Z., Fattahi, H., and Amelung, F. (2019), Small baseline InSAR time series analysis: Unwrapping error correction and noise reduction, _Computers & Geosciences_, _133_, 104331, doi:[10.1016/j.cageo.2019.104331](https://doi.org/10.1016/j.cageo.2019.104331), [arXiv](https://eartharxiv.org/9sz6m/), [data](https://doi.org/10.5281/zenodo.3464190), [notebooks](https://github.com/geodesymiami/Yunjun_et_al-2019-MintPy).
113+
Yunjun, Z., Fattahi, H., and Amelung, F. (2019), Small baseline InSAR time series analysis: Unwrapping error correction and noise reduction, _Computers & Geosciences_, _133_, 104331. [ [doi](https://doi.org/10.1016/j.cageo.2019.104331) \| [arxiv](https://doi.org/10.31223/osf.io/9sz6m) \| [data](https://doi.org/10.5281/zenodo.3464190) \| [notebook](https://github.com/geodesymiami/Yunjun_et_al-2019-MintPy) ]
114114

115115
In addition to the above, we recommend that you cite the original publications that describe the algorithms used in your specific analysis. They are noted briefly in the [default template file](../mintpy/defaults/smallbaselineApp.cfg) and listed in the [references.md file](./references.md).

docs/_config.yml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -5,5 +5,5 @@ google_analytics: UA-104225904-1
55
theme: jekyll-theme-cayman
66
author:
77
name: Zhang Yunjun, Heresh Fattahi
8-
url: https://yunjunzhang.wordpress.com
9-
author_email: yunjunzgeo@gmail.com
8+
url: https://yunjunz.github.io/
9+
author_email: yunjunz@outlook.com

docs/dask.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,7 @@ Most computations in MintPy are operated in either a pixel-by-pixel or a epoch-b
55
+ **local cluster:** on a single machine (laptop or computing node) with multiple CPU cores, suitable for laptops, local cluster/stations and distributed High Performance Cluster (HPC). No job scheduler is required.
66
+ **non-local cluster:** on a distributed HPC with job scheduler installed, including PBS, LSF and SLURM.
77

8-
Below is a brief description of the required options and recommended best practices for each cluster/scheduler.
8+
[Here](https://github.com/2gotgrossman/dask-rsmas-presentation) is an entry-level presentation on parallel computing using Dask by David Grossman. Below we brief describe for each cluster/scheduler the required options and recommended best practices.
99

1010
## 1. local cluster ##
1111

setup.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -29,7 +29,7 @@
2929
url="https://github.com/insarlab/MintPy",
3030
download_url=(f"https://github.com/insarlab/MintPy/archive/v{version}.tar.gz"),
3131
author="Zhang Yunjun, Heresh Fattahi",
32-
author_email="yunjunzgeo@gmail.com",
32+
author_email="yunjunz@outlook.com",
3333
license="GPL-3.0-or-later",
3434
license_files=("LICENSE",),
3535

src/mintpy/cli/save_gdal.py

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -15,6 +15,7 @@
1515
save_gdal.py geo/geo_timeseries_ERA5_demErr.h5 -d 20200505_20200517 --of ENVI
1616
save_gdal.py geo/geo_ifgramStack.h5 -d unwrapPhase-20101120_20110220 --of ISCE
1717
save_gdal.py geo/geo_ifgramStack.h5 -d coherence-20101120_20110220 --of ISCE
18+
save_gdal.py geo_20230225.slc
1819
"""
1920

2021

src/mintpy/objects/resample.py

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -252,18 +252,22 @@ def get_num_box(src_file=None, max_memory=0, scale_fac=3.0):
252252
# auto split into list of boxes if max_memory > 0
253253
if src_file and os.path.isfile(src_file) and max_memory > 0:
254254
# get max dataset shape
255+
atr = readfile.read_attribute(src_file)
255256
fext = os.path.splitext(src_file)[1]
256257
if fext in ['.h5', '.he5']:
257258
with h5py.File(src_file, 'r') as f:
258259
ds_shapes = [f[i].shape for i in f.keys()
259260
if isinstance(f[i], h5py.Dataset)]
260261
max_ds_size = max(np.prod(i) for i in ds_shapes)
261262
else:
262-
atr = readfile.read_attribute(src_file)
263263
max_ds_size = int(atr['LENGTH']) * int(atr['WIDTH'])
264264

265+
# scale the size for complex arrays
266+
if atr.get('DATA_TYPE', 'float32').startswith('complex'):
267+
max_ds_size *= 6 if atr['DATA_TYPE'].endswith('128') else 3
268+
265269
# calc num_box
266-
num_box = int(np.ceil((max_ds_size * 4 * 4) / (max_memory * 1024**3)))
270+
num_box = int(np.ceil((max_ds_size * 4 * 6) / (max_memory * 1024**3)))
267271

268272
return num_box
269273

src/mintpy/save_gdal.py

Lines changed: 54 additions & 49 deletions
Original file line numberDiff line numberDiff line change
@@ -24,37 +24,66 @@
2424

2525

2626
##############################################################################
27-
def array2raster(array, out_file, transform, epsg, out_fmt='GTiff'):
28-
"""Write 2D matrix into gdal raster.
29-
30-
Parameters: array - 2D np.ndarray
31-
out_file - str, output file path
32-
transform - tuple(float), geotransform to connect image- to the geo-coordinates.
33-
https://gdal.org/tutorials/geotransforms_tut.html
34-
epsg - int, EPSG code for the Spatial Reference System (SRS)
35-
out_fmt - str, gdal driver name
36-
Returns: out_file - str, output file path
27+
def write_gdal(data, meta, out_file, out_fmt='GTiff'):
28+
"""Write 2D matrix into GDAL raster.
29+
30+
Parameters: data - 2D np.ndarray
31+
meta - dict, metadata used to calculate the transform / epsg info:
32+
X/Y_FIRST/STEP
33+
EPSG / UTM_ZONE
34+
out_file - str, output file path
35+
out_fmt - str, output GDAL file format (driver name)
36+
https://gdal.org/drivers/raster/index.html
37+
Returns: out_file - str, output file path
3738
"""
3839

40+
# prepare geotransform to connect image- to the geo-coordinates.
41+
# link: https://gdal.org/tutorials/geotransforms_tut.html
42+
transform = (
43+
float(meta['X_FIRST']), float(meta['X_STEP']), 0,
44+
float(meta['Y_FIRST']), 0, float(meta['Y_STEP']),
45+
)
46+
47+
# prepare EPSG code for the Spatial Reference System (SRS)
48+
if 'EPSG' in meta.keys():
49+
epsg = int(meta['EPSG'])
50+
elif 'UTM_ZONE' in meta.keys():
51+
epsg = int(ut.utm_zone2epsg_code(meta['UTM_ZONE']))
52+
else:
53+
epsg = 4326
54+
msg = 'No EPSG or UTM_ZONE metadata found! '
55+
msg += 'Assume EPSG = 4326 (WGS84) and continue.'
56+
warnings.warn(msg)
57+
58+
# write file
3959
driver = gdal.GetDriverByName(out_fmt)
4060
print(f'initiate GDAL driver: {driver.LongName}')
4161

42-
rows, cols = array.shape
43-
print('create raster band')
44-
print(f'raster row / column number: {rows}, {cols}')
45-
print(f'raster transform info: {transform}')
46-
raster = driver.Create(out_file, cols, rows, 1, gdal.GDT_Float32)
47-
raster.SetGeoTransform(transform)
62+
rows, cols = data.shape
63+
dtype = readfile.DATA_TYPE_NUMPY2GDAL[str(data.dtype)]
64+
print('create raster band:')
65+
print(f' raster row / column number: {rows}, {cols}')
66+
print(f' raster data type: {dtype} ({data.dtype})')
67+
raster = driver.Create(
68+
out_file,
69+
xsize=cols,
70+
ysize=rows,
71+
bands=1,
72+
eType=dtype,
73+
)
4874

49-
print('write data to raster band')
50-
band = raster.GetRasterBand(1)
51-
band.WriteArray(array)
75+
print(f'set transform info: {transform}')
76+
raster.SetGeoTransform(transform)
5277

5378
print(f'set projection as: EPSG {epsg}')
5479
srs = osr.SpatialReference()
5580
srs.ImportFromEPSG(epsg)
5681
raster.SetProjection(srs.ExportToWkt())
5782

83+
print('write data to raster band')
84+
band = raster.GetRasterBand(1)
85+
band.WriteArray(data)
86+
5887
band.FlushCache()
5988
print(f'finished writing to {out_file}')
6089

@@ -74,44 +103,20 @@ def save_gdal(inps):
74103

75104
ds_name = inps.dset if inps.dset else 'data'
76105
print(f'read {ds_name} from file: {inps.file}')
77-
data, atr = readfile.read(inps.file, datasetName=inps.dset)
106+
data, meta = readfile.read(inps.file, datasetName=inps.dset)
78107

79108
if ftype == 'timeseries' and inps.ref_date:
80109
print(f'read {inps.ref_date} from file: {inps.file}')
81110
data -= readfile.read(inps.file, datasetName=inps.ref_date)[0]
82111

83-
## prepare the output
112+
## write file
84113
# output file name
85114
if not inps.outfile:
86115
fbase = pp.auto_figure_title(inps.file, inps.dset, vars(inps))
87-
inps.outfile = fbase + GDAL_DRIVER2EXT.get(inps.out_format, '')
88-
else:
89-
inps.outfile = os.path.abspath(inps.outfile)
90-
91-
# geotransform
92-
transform = (
93-
float(atr['X_FIRST']), float(atr['X_STEP']), 0,
94-
float(atr['Y_FIRST']), 0, float(atr['Y_STEP']),
95-
)
116+
fext = GDAL_DRIVER2EXT.get(inps.out_format, '')
117+
inps.outfile = fbase + fext
118+
inps.outfile = os.path.abspath(inps.outfile)
96119

97-
# epsg
98-
if 'EPSG' in atr.keys():
99-
epsg = int(atr['EPSG'])
100-
elif 'UTM_ZONE' in atr.keys():
101-
epsg = int(ut.utm_zone2epsg_code(atr['UTM_ZONE']))
102-
else:
103-
epsg = 4326
104-
msg = 'No EPSG or UTM_ZONE metadata found! '
105-
msg += 'Assume EPSG = 4326 (WGS84) and continue.'
106-
warnings.warn(msg)
107-
108-
## write gdal raster
109-
array2raster(
110-
data,
111-
out_file=inps.outfile,
112-
transform=transform,
113-
epsg=epsg,
114-
out_fmt=inps.out_format,
115-
)
120+
write_gdal(data, meta, out_file=inps.outfile, out_fmt=inps.out_format)
116121

117122
return

src/mintpy/utils/readfile.py

Lines changed: 38 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -119,6 +119,7 @@
119119
}
120120

121121
# 2 - GDAL
122+
# link: https://gdal.org/api/raster_c_api.html#_CPPv412GDALDataType
122123
DATA_TYPE_GDAL2NUMPY = {
123124
1 : 'uint8',
124125
2 : 'uint16',
@@ -131,21 +132,25 @@
131132
9 : 'cint32', # for translation purpose only, as numpy does not support complex int
132133
10: 'complex64',
133134
11: 'complex128',
135+
12: 'uint64',
136+
13: 'int64',
134137
}
135138

136139
DATA_TYPE_NUMPY2GDAL = {
137-
"uint8" : 1,
138-
"int8" : 1,
139-
"uint16" : 2,
140-
"int16" : 3,
141-
"uint32" : 4,
142-
"int32" : 5,
143-
"float32" : 6,
144-
"float64" : 7,
145-
"cint16" : 8, # for translation purpose only, as numpy does not support complex int
146-
"cint32" : 9, # for translation purpose only, as numpy does not support complex int
147-
"complex64" : 10,
148-
"complex128": 11,
140+
"uint8" : 1, # GDT_Byte
141+
"int8" : 1, # GDT_Int8 (GDAL >= 3.7)
142+
"uint16" : 2, # GDT_UInt16
143+
"int16" : 3, # GDT_Int16
144+
"uint32" : 4, # GDT_UInt32
145+
"int32" : 5, # GDT_Int32
146+
"float32" : 6, # GDT_Float32
147+
"float64" : 7, # GDT_Float64
148+
"cint16" : 8, # GDT_CInt16, for translation purpose only, as numpy does not support complex int
149+
"cint32" : 9, # GDT_CInt32, for translation purpose only, as numpy does not support complex int
150+
"complex64" : 10, # GDT_CFloat32
151+
"complex128": 11, # GDT_CFloat64
152+
"uint64" : 12, # GDT_UInt64 (GDAL >= 3.5)
153+
"int64" : 13, # GDT_Int64 (GDAL >= 3.5)
149154
}
150155

151156
# 3 - ISCE
@@ -173,7 +178,8 @@
173178
}
174179

175180
# single file (data + attributes) supported by GDAL
176-
GDAL_FILE_EXTS = ['.tiff', '.tif', '.grd']
181+
# .cos file - TerraSAR-X complex SAR data (https://gdal.org/drivers/raster/cosar.html)
182+
GDAL_FILE_EXTS = ['.tiff', '.tif', '.grd', '.cos']
177183

178184
ENVI_BAND_INTERLEAVE = {
179185
'BAND' : 'BSQ',
@@ -515,7 +521,20 @@ def read_binary_file(fname, datasetName=None, box=None, xstep=1, ystep=1):
515521

516522
# default data to read
517523
band = 1
518-
cpx_band = 'phase'
524+
if datasetName:
525+
if datasetName.startswith(('mag', 'amp')):
526+
cpx_band = 'magnitude'
527+
elif datasetName in ['phase', 'angle']:
528+
cpx_band = 'phase'
529+
elif datasetName.lower() == 'real':
530+
cpx_band = 'real'
531+
elif datasetName.lower().startswith('imag'):
532+
cpx_band = 'imag'
533+
else:
534+
cpx_band = 'complex'
535+
else:
536+
# use phase as default value, since it's the most common one.
537+
cpx_band = 'phase'
519538

520539
# ISCE
521540
if processor in ['isce']:
@@ -553,18 +572,6 @@ def read_binary_file(fname, datasetName=None, box=None, xstep=1, ystep=1):
553572
band = 2
554573
elif datasetName.lower() == 'band3':
555574
band = 3
556-
557-
elif datasetName.startswith(('mag', 'amp')):
558-
cpx_band = 'magnitude'
559-
elif datasetName in ['phase', 'angle']:
560-
cpx_band = 'phase'
561-
elif datasetName.lower() == 'real':
562-
cpx_band = 'real'
563-
elif datasetName.lower().startswith('imag'):
564-
cpx_band = 'imag'
565-
elif datasetName.startswith(('cpx', 'complex')):
566-
cpx_band = 'complex'
567-
568575
else:
569576
# flexible band list
570577
ds_list = get_slice_list(fname)
@@ -766,6 +773,8 @@ def get_hdf5_2d_dataset(name, obj):
766773
# Binary Files
767774
else:
768775
num_band = int(atr.get('BANDS', '1'))
776+
dtype = atr.get('DATA_TYPE', 'float32')
777+
769778
if fext in ['.trans', '.utm_to_rdc']:
770779
# roipac / gamma lookup table
771780
slice_list = ['rangeCoord', 'azimuthCoord']
@@ -777,7 +786,7 @@ def get_hdf5_2d_dataset(name, obj):
777786
elif fext in ['.unw', '.ion']:
778787
slice_list = ['magnitude', 'phase']
779788

780-
elif fext in ['.int', '.slc']:
789+
elif fext in ['.int', '.slc'] or (dtype.startswith('c') and num_band == 1):
781790
if no_complex:
782791
slice_list = ['magnitude', 'phase']
783792
else:
@@ -1979,8 +1988,8 @@ def read_gdal(fname, box=None, band=1, cpx_band='phase', xstep=1, ystep=1):
19791988
data = data.imag
19801989
elif cpx_band.startswith('pha'):
19811990
data = np.angle(data)
1982-
elif cpx_band.startswith('mag'):
1983-
data = np.absolute(data)
1991+
elif cpx_band.startswith(('mag', 'amp')):
1992+
data = np.abs(data)
19841993
elif cpx_band.startswith(('cpx', 'complex')):
19851994
pass
19861995
else:

0 commit comments

Comments
 (0)