Commit ebb2d79

Merge pull request #74 from nsidc/41-support-other-ilvis2-lat-lon-elev-itrf-transforms
41 support other ilvis2 lat lon elev itrf transforms
2 parents 1f80121 + 620421e commit ebb2d79

File tree

12 files changed

+234
-31
lines changed

CHANGELOG.md

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,17 @@
1+
# v1.1.0
2+
3+
- Support selecting an alternative lat/lon/elev triplet as the primary lat/lon/elev
4+
fields for ILVIS2 data. By default, the `low_mode` coordinates are used, which
5+
represent the center of the lowest detected mode within the waveform. This
6+
replicates the behavior of the valkyrie service this code is based on. Users
7+
may now choose between the following coordinate sets: `low_mode` (the
8+
default), `high_mode`, `centroid` (ILVIS2 v1 only), and `highest_signal`
9+
(ILVIS2 v2 only).
10+
- Update documentation around ILVIS2 datasets and their multiple sets of
11+
lat/lon/elev fields.
12+
- Update data model for `IceflowDataFrame` to include `dataset`, which gives the
13+
dataset short name and version as a string (e.g., ILVIS2 v2 is "ILVIS2v2").
14+
115
# v1.0.0
216

317
- Update `transform_itrf` function to be more flexible. Both forward and reverse

docs/altimetry-data-overview.md

Lines changed: 36 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -91,7 +91,7 @@ pulses are used to determine elevation data.
9191
```{note}
9292
9393
We recommend using the [_icepyx_](https://github.com/icesat2py/icepyx)
94-
library to access and interact with ICESat-2 data. Learn more about using `icepyx` with `icelfow` in the [Using iceflow with icepyx to Generate an Elevation Timeseries](notebooks/iceflow-with-icepyx) Jupyter notebook.
94+
library to access and interact with ICESat-2 data. Learn more about using `icepyx` with `iceflow` in the [Using iceflow with icepyx to Generate an Elevation Timeseries](notebooks/iceflow-with-icepyx) Jupyter notebook.
9595
9696
```
9797

@@ -117,7 +117,8 @@ further mission information and documentation for each data set:
117117
| [BLATM L1B](https://nsidc.org/data/BLATM1B) | South: N:-53, S: -90, E:180, W:-180 <br> North: N:90, S: 60, E:180, W:-180 | 23 Jun. 1993 - 30 Oct. 2008 | Pre-IceBridge | ATM |
118118
| [ILATM L1B V1](https://nsidc.org/data/ILATM1B/versions/1) | South: N:-53, S: -90, E:180, W:-180 <br> North: N:90, S: 60, E:180, W:-180 | 31 Mar. 2009 - 8 Nov. 2012 <br> (updated 2013) | IceBridge | ATM |
119119
| [ILATM L1B V2](https://nsidc.org/data/ILATM1B/versions/2) | South: N:-53, S: -90, E:180, W:-180 <br> North: N:90, S: 60, E:180, W:-180 | 20 Mar. 2013 - 16 May 2019 <br> (updated 2020) | IceBridge | ATM |
120-
| [ILVIS2](https://nsidc.org/data/ILVIS2) | North: N:90, S: 60, E:180, W:-180 | 25 Aug. 2017 - 20 Sept. 2017 | IceBridge | ALTIMETERS, LASERS, LVIS |
120+
| [ILVIS2 v1](https://nsidc.org/data/ilvis2/versions/1) | South: N:-53, S: -90, E:180, W:-180 <br> North: N:90, S: 60, E:180, W:-180 | 14 Apr. 2009 - 31 Oct. 2015 | IceBridge | ALTIMETERS, LASERS, LVIS |
121+
| [ILVIS2 v2](https://nsidc.org/data/ilvis2/versions/2) | North: N:90, S: 60, E:180, W:-180 | 25 Aug. 2017 - 20 Sept. 2017 | IceBridge | ALTIMETERS, LASERS, LVIS |
121122
| [GLAH06](https://nsidc.org/data/GLAH06/) | Global: N:86, S: -86, E:180, W:-180 | 20 Feb. 2003 - 11 Oct. 2009 | ICESat/GLAS | ALTIMETERS, CD, GLAS, GPS, <br> GPS Receiver, LA, PC |
122123

123124
---
@@ -129,6 +130,36 @@ guides or contact NSIDC user services at [email protected]
129130
130131
```
131132

133+
### ILVIS2 data
134+
135+
ILVIS2 data contain multiple sets of latitude/longitude/elevation values.
136+
137+
- `GLAT`/`GLON`/`ZG` represent the center of the lowest detected mode within the waveform.
138+
- `HLAT`/`HLON`/`ZH` represent the center of the highest detected mode within
139+
the waveform. Both of these sets of lat/lon/elev are available across v1 and
140+
v2 ILVIS2 data.
141+
142+
ILVIS2 v1 data:
143+
144+
- `CLAT`/`CLON`/`ZC` represent the centroid of the corresponding LVIS Level-1B
145+
waveform.
146+
147+
ILVIS2 v2 data:
148+
149+
- `TLAT`/`TLON`/`ZT` represent the highest detected signal.
150+
151+
By default, `iceflow` will use `GLAT`/`GLON`/`ZG` as the primary
152+
latitude/longitude/elevation fields in `IceflowDataFrame`s. Use the
153+
`ilvis2_coordinate_set` kwarg on `read_iceflow_datafile(s)` or
154+
`make_iceflow_parquet` to select a different primary set of
155+
latitude/longitude/elevation fields. Alternatively, manually set the fields:
156+
157+
```
158+
# TLAT/TLON/ZT are only available in ILVIS2v2 data:
159+
sel_ilvis2v2 = data.dataset == "ILVIS2v2"
160+
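# .to_numpy() stops pandas from aligning on the mismatched column labels, which would assign NaN.
data.loc[sel_ilvis2v2, ["latitude", "longitude", "elevation"]] = data.loc[sel_ilvis2v2, ["TLAT", "TLON", "ZT"]].to_numpy()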
161+
```
162+
132163
## Challenges
133164

134165
The wealth of data from these missions presents an opportunity to study the
@@ -185,6 +216,9 @@ ICESat-2.
185216
- [OpenAltimetry](https://openaltimetry.earthdatacloud.nasa.gov/data/): Advanced
186217
discovery, processing, and visualization services for ICESat and ICESat-2
187218
altimeter data
219+
- [icepyx](https://icepyx.readthedocs.io/en/latest/): icepyx is both a software
220+
library and a community composed of ICESat-2 data users, developers, and the
221+
scientific community.
188222
- [ITS_LIVE](https://its-live.jpl.nasa.gov/): A NASA MEaSUREs project to provide
189223
automated, low latency, global glacier flow and elevation change data sets.
190224

docs/getting-started.md

Lines changed: 16 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -101,14 +101,15 @@ from nsidc.iceflow import make_iceflow_parquet
101101
parquet_path = make_iceflow_parquet(
102102
data_dir=Path("/path/to/data/dir/"),
103103
target_itrf="ITRF2014",
104+
ilvis2_coordinate_set="low_mode",
104105
)
105106
df = dd.read_parquet(parquet_path)
106107
```
107108

108109
Note that `make_iceflow_parquet` creates a parquet datastore for the data in the
109110
provided `data_dir` with the data transformed into a common
110-
[ITRF](https://itrf.ign.fr/) to facilitate analysis. Only datetime, lat, lon,
111-
and elevation fields are preserved in the parquet datastore.
111+
[ITRF](https://itrf.ign.fr/) to facilitate analysis. Only datetime, latitude,
112+
longitude, elevation, and dataset fields are preserved in the parquet datastore.
112113

113114
To access and analyze the full data record in the source files, use
114115
[`read_iceflow_datafiles`](nsidc.iceflow.read_iceflow_datafiles):
@@ -117,7 +118,10 @@ To access and analyze the full data record in the source files, use
117118
from nsidc.iceflow import read_iceflow_datafiles
118119
119120
# Read all of the data in the source files - not just lat/lon/elev.
120-
df = read_iceflow_datafiles(downloaded_files)
121+
df = read_iceflow_datafiles(
122+
downloaded_files,
123+
ilvis2_coordinate_set="low_mode",
124+
)
121125
122126
# Optional: transform lat/lon/elev to common ITRF:
123127
from nsidc.iceflow import transform_itrf
@@ -130,3 +134,12 @@ df = transform_itrf(
130134
Note that `read_iceflow_datafiles` reads all of the data from the given
131135
filepaths. This could be a large amount of data, and could cause your program to
132136
crash if physical memory limits are exceeded.
137+
138+
#### Special considerations for ILVIS2 data
139+
140+
Users of ILVIS2 data should be aware that ILVIS2 data contains multiple sets of
141+
lat/lon/elev that may be of interest. By default, the `low_mode` set is used as
142+
the primary set of latitude/longitude/elevation used by `iceflow`.
143+
144+
See [ILVIS2 data](./altimetry-data-overview.md#ilvis2-data) for more
145+
information.

src/nsidc/iceflow/api.py

Lines changed: 6 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@
88
import dask.dataframe as dd
99
from loguru import logger
1010

11+
from nsidc.iceflow.data.ilvis2 import ILVIS2_DEFAULT_COORDINATE_SET
1112
from nsidc.iceflow.data.read import read_iceflow_datafiles
1213
from nsidc.iceflow.data.supported_datasets import ALL_SUPPORTED_DATASETS
1314
from nsidc.iceflow.itrf.converter import transform_itrf
@@ -19,6 +20,7 @@ def make_iceflow_parquet(
1920
target_itrf: str,
2021
overwrite: bool = False,
2122
target_epoch: str | None = None,
23+
ilvis2_coordinate_set=ILVIS2_DEFAULT_COORDINATE_SET,
2224
) -> Path:
2325
"""Create a parquet dataset containing the lat/lon/elev data in `data_dir`.
2426
@@ -52,7 +54,10 @@ def make_iceflow_parquet(
5254
]
5355
for subdir in all_subdirs:
5456
iceflow_filepaths = [path for path in subdir.iterdir() if path.is_file()]
55-
iceflow_df = read_iceflow_datafiles(iceflow_filepaths)
57+
iceflow_df = read_iceflow_datafiles(
58+
iceflow_filepaths,
59+
ilvis2_coordinate_set=ilvis2_coordinate_set,
60+
)
5661

5762
iceflow_df = transform_itrf(
5863
data=iceflow_df,
@@ -61,8 +66,6 @@ def make_iceflow_parquet(
6166
)
6267

6368
# Add a string col w/ dataset name and version.
64-
short_name, version = subdir.name.split("_")
65-
iceflow_df["dataset"] = [f"{short_name}v{version}"] * len(iceflow_df.latitude)
6669
common_columns = ["latitude", "longitude", "elevation", "dataset"]
6770
common_dask_df = dd.from_pandas(iceflow_df[common_columns])
6871
if parquet_subdir.exists():

src/nsidc/iceflow/data/atm1b.py

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -440,15 +440,19 @@ def atm1b_data(filepath: Path) -> ATM1BDataFrame:
440440
if year >= 2013:
441441
file_date = _ilatm1b_date(filename)
442442
data = _ilatm1bv2_data(filepath, file_date)
443+
dataset = "ILATM1Bv2"
443444
elif year >= 2009:
444445
file_date = _ilatm1b_date(filename)
445446
data = _atm1b_qfit_data(filepath, file_date)
447+
dataset = "ILATM1Bv1"
446448
else:
447449
file_date = _blatm1bv1_date(filename)
448450
data = _atm1b_qfit_data(filepath, file_date)
451+
dataset = "BLATM1Bv1"
449452

450453
itrf = extract_itrf(filepath)
451454
data["ITRF"] = itrf
455+
data["dataset"] = dataset
452456

453457
data = data.set_index("utc_datetime")
454458
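
The year-based branching above, which now also assigns the `dataset` label, can be summarized as a small pure function. This is only an illustrative sketch: `atm1b_dataset_for_year` is a hypothetical helper name that does not appear in the commit, and the thresholds are copied from the `if year >= 2013` / `elif year >= 2009` branches shown in the hunk.

```python
def atm1b_dataset_for_year(year: int) -> str:
    """Return the dataset label for an ATM1B file from a given year.

    Hypothetical helper mirroring the branches in `atm1b_data`:
    2013 and later -> ILATM1B v2, 2009-2012 -> ILATM1B v1,
    anything earlier -> BLATM1B v1 (pre-IceBridge).
    """
    if year >= 2013:
        return "ILATM1Bv2"
    if year >= 2009:
        return "ILATM1Bv1"
    return "BLATM1Bv1"


if __name__ == "__main__":
    for sample_year in (1993, 2009, 2012, 2013, 2019):
        print(sample_year, atm1b_dataset_for_year(sample_year))
```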

src/nsidc/iceflow/data/glah06.py

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -253,6 +253,7 @@ def glah06_data(filepath: Path) -> GLAH06DataFrame:
253253
df["latitude"] = df["d_lat"]
254254
df["longitude"] = df["d_lon"]
255255
df["elevation"] = df["d_elev"]
256+
df["dataset"] = "GLAH06v034"
256257

257258
# We index the data by utc datetime.
258259
df = df.set_index("utc_datetime")

src/nsidc/iceflow/data/ilvis2.py

Lines changed: 55 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -4,13 +4,47 @@
44
import re
55
from collections import namedtuple
66
from pathlib import Path
7+
from typing import Literal
78

89
import numpy as np
910
import pandas as pd
1011
import pandera as pa
1112

1213
from nsidc.iceflow.data.models import ILVIS2DataFrame
1314

15+
ILVIS2_COORDINATE_SETS = Literal["low_mode", "high_mode", "centroid", "highest_signal"]
16+
ILVIS2_COORDINATE_SET_MAPPING: dict[ILVIS2_COORDINATE_SETS, dict[str, str]] = {
17+
# The center of the lowest detected mode within the waveform. Available in
18+
# both v1 and v2.
19+
"low_mode": {
20+
"latitude": "GLAT",
21+
"longitude": "GLON",
22+
"elevation": "ZG",
23+
},
24+
# The center of the highest detected mode within the waveform. Available in
25+
# both v1 and v2.
26+
"high_mode": {
27+
"latitude": "HLAT",
28+
"longitude": "HLON",
29+
"elevation": "ZH",
30+
},
31+
# Centroid of the corresponding LVIS Level-1B waveform. v1 only.
32+
"centroid": {
33+
"latitude": "CLAT",
34+
"longitude": "CLON",
35+
"elevation": "ZC",
36+
},
37+
# Highest detected signal. v2 only.
38+
"highest_signal": {
39+
"latitude": "TLAT",
40+
"longitude": "TLON",
41+
"elevation": "ZT",
42+
},
43+
}
44+
# In the valkyrie service that this code is based on, only the low_mode was
45+
# exposed to users. This default replicates that behavior.
46+
ILVIS2_DEFAULT_COORDINATE_SET: ILVIS2_COORDINATE_SETS = "low_mode"
47+
1448
Field = namedtuple("Field", ["name", "type", "scale_factor"])
1549

1650
"""
@@ -160,7 +194,10 @@ def _ilvis2_data(filepath: Path, file_date: dt.date, fields) -> pd.DataFrame:
160194

161195

162196
@pa.check_types()
163-
def ilvis2_data(filepath: Path) -> ILVIS2DataFrame:
197+
def ilvis2_data(
198+
filepath: Path,
199+
coordinate_set: ILVIS2_COORDINATE_SETS = ILVIS2_DEFAULT_COORDINATE_SET,
200+
) -> ILVIS2DataFrame:
164201
"""Return the ilvis2 data given a filepath.
165202
166203
Parameters
@@ -187,34 +224,39 @@ def ilvis2_data(filepath: Path) -> ILVIS2DataFrame:
187224

188225
if year < 2017:
189226
# This corresponds to ILVIS v1
227+
dataset = "ILVIS2v1"
190228
the_fields = ILVIS2_V104_FIELDS
191229
# The user guide indicates ILVIS2 v1 data uses ITRF2000 as a reference frame:
192230
# https://nsidc.org/sites/default/files/documents/user-guide/ilvis2-v001-userguide.pdf
193231
itrf_str = "ITRF2000"
232+
# Ensure that the highest_signal coordinate set has not been selected.
233+
if coordinate_set == "highest_signal":
234+
raise ValueError(
235+
"ILVIS coordinate set 'highest_signal' is only available in v2 data."
236+
)
194237
else:
195238
# This corresponds to ILVIS v2
239+
dataset = "ILVIS2v2"
196240
the_fields = ILVIS2_V202b_FIELDS
197241
# The user guide indicates ILVIS2 v2 data uses ITRF2008 as a reference frame:
198242
# https://nsidc.org/sites/default/files/documents/user-guide/ilvis2-v002-userguide.pdf
199243
itrf_str = "ITRF2008"
244+
# Ensure that the centroid coordinate set has not been selected.
245+
if coordinate_set == "centroid":
246+
raise ValueError(
247+
"ILVIS coordinate set 'centroid' is only available in v1 data."
248+
)
200249

201250
file_date = _file_date(filename)
202251

203252
data = _ilvis2_data(filepath, file_date, the_fields)
204253
data["ITRF"] = itrf_str
254+
data["dataset"] = dataset
205255

206-
# TODO: this data does not have a single set of latitude, longitude, and
207-
# elevation fields. Instead, it has e.g., "CLON" and "GLON" and "HLON". In
208-
# the original `valkyrie` service code, it looks like "GLON", "GLAT", and
209-
# "ZG" cols were used as for the points stored in the valkyrie database and
210-
# transformed by the ITRF transformation service. Ideally, we support
211-
# consistent transformation of the ITRF across all lat/lon/elev
212-
# fields. E.g., a user may be more interested in looking at the "CLON",
213-
# "CLAT", and "ZC" fields instead.
214-
# For now, we will replicate the behavior of `valkyrie`:
215-
data["latitude"] = data["GLAT"]
216-
data["longitude"] = data["GLON"]
217-
data["elevation"] = data["ZG"]
256+
coordinate_fields = ILVIS2_COORDINATE_SET_MAPPING[coordinate_set]
257+
data["latitude"] = data[coordinate_fields["latitude"]]
258+
data["longitude"] = data[coordinate_fields["longitude"]]
259+
data["elevation"] = data[coordinate_fields["elevation"]]
218260

219261
data = data.set_index("utc_datetime")
220262
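
To make the selection logic in this hunk easier to follow, here is a minimal standalone sketch of how a coordinate set is validated against the file's ILVIS2 version and resolved to source columns. It is not the library code: `ilvis2_dataset_for_year`, `resolve_coordinate_fields`, and `VERSION_ONLY_SETS` are hypothetical names introduced for illustration, while the column names, the `year < 2017` threshold, and the error conditions are copied from the diff above.

```python
# Column names copied from ILVIS2_COORDINATE_SET_MAPPING in the diff above.
COORDINATE_SET_MAPPING: dict[str, dict[str, str]] = {
    "low_mode": {"latitude": "GLAT", "longitude": "GLON", "elevation": "ZG"},
    "high_mode": {"latitude": "HLAT", "longitude": "HLON", "elevation": "ZH"},
    "centroid": {"latitude": "CLAT", "longitude": "CLON", "elevation": "ZC"},
    "highest_signal": {"latitude": "TLAT", "longitude": "TLON", "elevation": "ZT"},
}

# Hypothetical lookup: coordinate sets that exist in only one ILVIS2 version.
VERSION_ONLY_SETS = {"centroid": "ILVIS2v1", "highest_signal": "ILVIS2v2"}


def ilvis2_dataset_for_year(year: int) -> str:
    """Files before 2017 are treated as ILVIS2 v1, later files as v2."""
    return "ILVIS2v1" if year < 2017 else "ILVIS2v2"


def resolve_coordinate_fields(
    year: int, coordinate_set: str = "low_mode"
) -> dict[str, str]:
    """Return the source columns that feed latitude/longitude/elevation.

    Raises ValueError when the requested set is not present in the
    ILVIS2 version implied by `year`. An unknown set name raises KeyError.
    """
    dataset = ilvis2_dataset_for_year(year)
    required_dataset = VERSION_ONLY_SETS.get(coordinate_set)
    if required_dataset is not None and required_dataset != dataset:
        raise ValueError(
            f"ILVIS2 coordinate set {coordinate_set!r} is only available "
            f"in {required_dataset} data, not {dataset}."
        )
    return COORDINATE_SET_MAPPING[coordinate_set]


if __name__ == "__main__":
    print(resolve_coordinate_fields(2015, "centroid"))
    print(resolve_coordinate_fields(2017))
```

Keeping the version restrictions in one lookup table is a design choice of this sketch only; the commit itself uses an explicit check in each branch of the `if year < 2017` statement.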

src/nsidc/iceflow/data/models.py

Lines changed: 26 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -17,6 +17,10 @@ class CommonDataColumnsSchema(pa.DataFrameModel):
1717
latitude: Series[float] = pa.Field(coerce=True)
1818
longitude: Series[float] = pa.Field(coerce=True)
1919
elevation: Series[float] = pa.Field(coerce=True)
20+
# In practice, "dataset" is always available to provide provenance on where
21+
# each point comes from, but we mark it as optional here because it is not
22+
# strictly necessary for e.g., ITRF transformations.
23+
dataset: str | None
2024

2125

2226
class ATM1BSchema(CommonDataColumnsSchema):
@@ -38,22 +42,41 @@ class ATM1BSchema(CommonDataColumnsSchema):
3842
pulse_width: Series[float] = pa.Field(nullable=True, coerce=True)
3943

4044

41-
# Note/TODO: the ILVIS2 data contain multiple sets of lat/lon/elev. The common
45+
# Note: the ILVIS2 data contain multiple sets of lat/lon/elev. The common
4246
# schema assumes one set of lat/lon/elev which is used for the ITRF
4347
# transformation code.
4448
class ILVIS2Schema(CommonDataColumnsSchema):
49+
"""ILVIS2 Data Schema.
50+
51+
Note that ILVIS2 data contain multiple sets of lat/lon/elev.
52+
53+
* GLAT/GLON/ZG represent the center of the lowest detected mode within the waveform.
54+
* HLAT/HLON/ZH represent the center of the highest detected mode within the
55+
waveform. Both of these sets of lat/lon/elev are available across v1 and
56+
v2 ILVIS2 data.
57+
58+
ILVIS2 v1 data:
59+
* CLAT/CLON/ZC represent the centroid of the corresponding LVIS Level-1B waveform.
60+
61+
ILVIS2 v2 data:
62+
* TLAT/TLON/ZT represent the highest detected signal.
63+
"""
64+
4565
# Common columns
4666
LFID: Series[float] = pa.Field(nullable=True, coerce=True)
4767
SHOTNUMBER: Series[float] = pa.Field(nullable=True, coerce=True)
4868
TIME: Series[float] = pa.Field(nullable=True, coerce=True)
69+
# ZG/GLAT/GLON: the center of the lowest detected mode within the waveform
4970
ZG: Series[float] = pa.Field(nullable=True, coerce=True)
5071
GLAT: Series[float] = pa.Field(nullable=True, coerce=True)
5172
GLON: Series[float] = pa.Field(nullable=True, coerce=True)
73+
# HLAT/HLON/ZH: the center of the highest detected mode within the waveform
5274
HLAT: Series[float] = pa.Field(nullable=True, coerce=True)
5375
HLON: Series[float] = pa.Field(nullable=True, coerce=True)
5476
ZH: Series[float] = pa.Field(nullable=True, coerce=True)
5577

5678
# V104-specific
79+
# CLAT/CLON/ZC: Centroid of the corresponding LVIS Level-1B waveform
5780
CLAT: Series[float] = pa.Field(nullable=True, coerce=True)
5881
CLON: Series[float] = pa.Field(nullable=True, coerce=True)
5982
ZC: Series[float] = pa.Field(nullable=True, coerce=True)
@@ -66,6 +89,7 @@ class ILVIS2Schema(CommonDataColumnsSchema):
6689
COMPLEXITY: Series[float] = pa.Field(nullable=True, coerce=True)
6790
INCIDENT_ANGLE: Series[float] = pa.Field(nullable=True, coerce=True)
6891
RANGE: Series[float] = pa.Field(nullable=True, coerce=True)
92+
# RH%%%: Height (relative to ZG) at which % of the waveform energy occurs
6993
RH10: Series[float] = pa.Field(nullable=True, coerce=True)
7094
RH15: Series[float] = pa.Field(nullable=True, coerce=True)
7195
RH20: Series[float] = pa.Field(nullable=True, coerce=True)
@@ -89,6 +113,7 @@ class ILVIS2Schema(CommonDataColumnsSchema):
89113
RH98: Series[float] = pa.Field(nullable=True, coerce=True)
90114
RH99: Series[float] = pa.Field(nullable=True, coerce=True)
91115
RH100: Series[float] = pa.Field(nullable=True, coerce=True)
116+
# Highest detected signal
92117
TLAT: Series[float] = pa.Field(nullable=True, coerce=True)
93118
TLON: Series[float] = pa.Field(nullable=True, coerce=True)
94119
ZT: Series[float] = pa.Field(nullable=True, coerce=True)
