book/background/1_context_motivation.md
Lines changed: 3 additions & 3 deletions
@@ -13,7 +13,7 @@ Technological developments in recent decades have engendered fundamental shifts
## *Increasingly large, cloud-optimized data means new tools and approaches for data management*
-
The increasing volume of publicly available earth observation data has transformed scientific workflows across a range of fields, prompting analysts to gain new skills in order to work with larger volumes of data in new formats and locations, and to use distributed cloud-computational resources in their analysis ({cite:t}`abernathey_2021_cloud,Boulton02012018,gentemann_2021_science,mathieu_2017_esas,ramachandran_2021_open,Sudmanns_2020_big,wagemann_2021_user,wagemann_2022_FiveGuidingPrinciples`). {numref}`eo_data_trend` shows the recent trend and projected continued increases in the volume of NASA Earth Science data archives. New satellites like [NISAR](https://nisar.jpl.nasa.gov/) will add to the growth of the data archives.
+
The increasing volume of publicly available earth observation data has transformed scientific workflows across a range of fields, prompting analysts to gain new skills in order to work with larger volumes of data in new formats and locations and to use distributed cloud-computational resources in their analysis ({cite:t}`abernathey_2021_cloud,Boulton02012018,gentemann_2021_science,mathieu_2017_esas,ramachandran_2021_open,Sudmanns_2020_big,wagemann_2021_user,wagemann_2022_FiveGuidingPrinciples`). {numref}`eo_data_trend` shows the recent trend and projected continued increases in the volume of NASA Earth Science data archives. New satellites like [NISAR](https://nisar.jpl.nasa.gov/) will contribute to the growth of data archives.
```{figure} imgs/fy24-projection-chart.png
---
@@ -24,9 +24,9 @@ Volume of NASA Earth Science Data archives, including growth of existing-mission
## *Asking questions of complex datasets*
-
Scientific workflows involve asking complex questions of diverse types of data. Earth observation and related datasets often contain two types of information: measurements of a physical observable (e.g. temperature) and metadata that provides auxiliary information that required in order to interpret the physical observable (time and location of measurement, information about the sensor, etc.). With the increasingly complex and large volume of earth observation data that is currently available, storing, managing and organizing this information can very quickly become a complex and challenging task, especially for students and early-career analysts ({cite:t}`mathieu_2017_esas,palumbo_2017_building,Sudmanns_2020_big,wagemann_2021_user,stern_2022_PangeoForge`).
+
Scientific workflows involve asking complex questions of diverse types of data. Earth observation and related datasets often contain two types of information: measurements of a physical observable (e.g. temperature) and metadata that provides auxiliary information that is required in order to interpret the physical observable (time and location of measurement, information about the sensor, etc.). With the increasingly complex and large volume of earth observation data that is currently available, storing, managing, and organizing this information can very quickly become a complex and challenging task, especially for students and early-career analysts ({cite:t}`mathieu_2017_esas,palumbo_2017_building,Sudmanns_2020_big,wagemann_2021_user,stern_2022_PangeoForge`).
-
This book provides detailed examples of scientific workflow steps that ingest complex, multi-dimensional datasets, introduce users to the landscape of popular, actively-maintained open-source software packages for working with geospatial data in Python, and include strategies for working with larger-thanmemory data stored in publicly available, cloud-hosted repositories. These demonstrations are accompanied by detailed discussion of concepts involved in analyzing earth observation data such as dataset inspection, manipulation, and exploratory analysis and visualization. Overall, we emphasize the importance of understanding the structure of multi-dimensional earth observation datasets within the context of a given data model and demonstrate how such an understanding can enable more efficient and intuitive scientific workflows.
+
This book provides detailed examples of scientific workflow steps that ingest complex, multi-dimensional datasets, introduce users to the landscape of popular, actively-maintained open-source software packages for working with geospatial data in Python, and include strategies for working with larger-than-memory data stored in publicly available, cloud-hosted repositories. These demonstrations are accompanied by detailed discussions of concepts involved in analyzing earth observation data, such as dataset inspection, manipulation, and exploratory analysis and visualization. Overall, we emphasize the importance of understanding the structure of multi-dimensional earth observation datasets within the context of a given data model and demonstrate how such an understanding can enable more efficient and intuitive scientific workflows.
book/background/3_tutorials_overview.md
Lines changed: 2 additions & 2 deletions
@@ -4,7 +4,7 @@ This book contains two distinct tutorials, each of which focuses on a different
## *Part 1: ITS_LIVE ice velocity data tutorial*
-
This tutorial focuses on a dataset of ice velocity observations derived from satellite image pairs, using a number of different satellite sensors. This dataset is accessed as Zarr data cubes from AWS S3 cloud object storage. The notebooks in this tutorial focus on:
+
This tutorial focuses on a dataset of ice velocity observations derived from satellite image pairs using a number of different satellite sensors. This dataset is accessed as Zarr data cubes from AWS S3 cloud object storage. The notebooks in this tutorial focus on:
1) Querying a JSON catalog and reading data from cloud object storage,
2) Working with larger-than-memory data,
@@ -14,7 +14,7 @@ This tutorial focuses on a dataset of ice velocity observations derived from sat
## *Part 2: Sentinel-1 RTC imagery tutorial*
-
This tutorial focuses on data from Sentinel-1, a synthetic aperture radar (SAR) dataset containing imagery collected at C-band. Specifically, we are looking at Sentinel-1 Radiometric Terrain Corrected (RTC) imagery (for more detail on this, see [tutorial data](4_tutorial_data.md)). We demonstrate how to access and work with two Sentinel-1 RTC datasets as well as how to set up and perform an initial comparison between the two and time series analysis of Sentinel-1 backscatter variability. These notebooks cover:
+
This tutorial focuses on data from Sentinel-1, a synthetic aperture radar (SAR) dataset containing imagery collected at C-band. Specifically, we are looking at Sentinel-1 Radiometric Terrain Corrected (RTC) imagery (for more detail on this, see [tutorial data](4_tutorial_data.md)). We demonstrate how to access and work with two Sentinel-1 RTC datasets, as well as how to set up and perform an initial comparison between the two. These notebooks cover:
1) Reading and working with a very large dataset (stored locally) in memory. This includes steps such as:
- Reconstructing metadata lost during the read step,
book/background/4_tutorial_data.md
Lines changed: 10 additions & 10 deletions
@@ -2,9 +2,9 @@
We use many different datasets throughout these tutorials. While each tutorial is focused on a different raster time series (ITS_LIVE ice velocity data and Sentinel-1 imagery), we also use vector data to represent points of interest.
-
Most of the examples in this book use data accessed programmatically from cloud-object storage. We make subset of the data available in this books Github repository to remove the need for computationally-intensive operations in the tutorials. In one example, working with Sentinel-1 data processed by Alaska Satellite Facility, we start with data downloaded locally. Users who would like to complete this processing step on their own may do so (and access the data [here](https://zenodo.org/records/15036782)), but a smaller subset of this data is stored in the repository.
+
Most of the examples in this book use data accessed programmatically from cloud-object storage. We make subsets of the data available in this book's Github repository to remove the need for computationally-intensive operations in the tutorials. In one example, working with Sentinel-1 data processed by Alaska Satellite Facility, we start with data downloaded locally. Users who would like to complete this processing step on their own may do so (and access the data [here](https://zenodo.org/records/15036782)), but a smaller subset of this data is stored in the repository.
-
Here is a broad overview the data included in this tutorial, including how it is collected, it's potential scientific applications, and how and where it is stored and accessed in these tutorials.
+
Here is a broad overview of the data included in this tutorial, including how it is collected, its potential scientific applications, and how and where it is stored and accessed in these tutorials.
## *Inter-mission Time Series of Land Ice Velocity and Elevation (ITS_LIVE)*
@@ -13,25 +13,25 @@ Here is a broad overview the data included in this tutorial, including how it is
| ITS_LIVE |[ITS_LIVE project, NASA JPL](https://its-live.jpl.nasa.gov/)| Zarr | AWS S3|
-
ITS_LIVE is a dataset of ice velocity observations derived from applying a feature tracking algorithm to pairs of satellite imagery. Ice velocity refers to the down-slope movement of glaciers and ice sheets {cite}`Gardner_Scambos_2022`. Because glaciers and ice sheets are dynamic elements of our climate system, they lose or gain mass in response to changes in climate conditions such as warmer temperatures or increased snowfall, measuring variability in the speed of ice flow can help scientists better understand trends in glacier dynamics and interactions between glaciers and climate.
+
ITS_LIVE is a dataset of ice velocity observations derived from applying a feature tracking algorithm to pairs of satellite imagery. Ice velocity refers to the down-slope movement of glaciers and ice sheets {cite}`Gardner_Scambos_2022`. Because glaciers and ice sheets are dynamic elements of our climate system, they lose or gain mass in response to changes in climate conditions, such as warmer temperatures or increased snowfall; measuring variability in the speed of ice flow can help scientists better understand the relationship between glaciers and climate.
```{figure} imgs/lopez06-3341335.png
---
name: ITS_LIVE-time-series
---
-
Example of a ice velocity time series along a profile of Malaspina Glacier featuring velocity observations from a range of satellite sensors. Source: Reproduced with permission from {cite:t}`lopez_2023_itslive`.
+
Example of an ice velocity time series along a profile of Malaspina Glacier featuring velocity observations from a range of satellite sensors. Source: Reproduced with permission from {cite:t}`lopez_2023_itslive`.
```
-
{numref}`ITS_LIVE-time-series` shows an ITS_LIVE time series at various locations on the Malaspina glacier and the satellite sensors that contribute observations throughout the time series. Part of what is so exciting about ITS_LIVE is that it combines image pairs from a number of satellites, including imagery from optical (Landsat 4,5,7,8,9 & Sentinel-2) and synthetic aperture radar (Sentinel-1) sensors. For this reason, ITS_LIVE time series data can be quite large. Another exciting aspect of the ITS_LIVE dataset is that the image pair time series data is made available as Zarr data cubes stored in cloud object storage on Amazon Web Services (AWS), meaning that users don't need to download massive files to start working with the data!
+
{numref}`ITS_LIVE-time-series` shows an ITS_LIVE time series at various locations on the Malaspina glacier and the satellite sensors contributing observations throughout the time series. Part of what is so exciting about ITS_LIVE is that it combines image pairs from a number of satellites, including imagery from optical (Landsat 4,5,7,8,9 & Sentinel-2) and synthetic aperture radar (Sentinel-1) sensors. For this reason, ITS_LIVE time series data can be quite large. Another exciting aspect of the ITS_LIVE dataset is that the image pair time series data is made available as Zarr data cubes stored in cloud object storage on Amazon Web Services (AWS), meaning that users don't need to download massive files to start working with the data!
:::{admonition} A note about working with image pair time series
-
ITS_LIVE is an ice velocity time series where observations are derived from image pairs, meaning that an observation captures all movement that occurs between the two image acquisitions. In this tutorial, we focus on demonstrating the basics of dataset manipulation, examination and preliminary visualization; we index observations off of their mid-date and do not take the time between the images into account. For detailed time series analysis of ice velocity, this point should be considered when making decisions about which observations to include in analysis for different scientific objectives and how to perform aggregation and resampling operations.
+
ITS_LIVE is an ice velocity time series where observations are derived from image pairs, meaning that an observation captures all movement that occurs between the two image acquisitions. In this tutorial, we focus on demonstrating the basics of dataset manipulation, examination, and preliminary visualization; we index observations off of their mid-date and do not take the time between the images into account. For detailed time series analysis of ice velocity, this point should be considered when making decisions about which observations to include in analysis for different scientific objectives and how to perform aggregation and resampling operations.
-
For a comprehensive approach to produce regularized ice velocity estimates from an ITS_LIVE time series, we direct the interested reader to {cite:t}`charrier_2025_TICOI`.
+
For a comprehensive approach to producing regularized ice velocity estimates from an ITS_LIVE time series, we direct the interested reader to {cite:t}`charrier_2025_TICOI`.
:::
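The note above can be illustrated with a minimal sketch, assuming each observation is described only by the acquisition times of its two images. The function names and the dates below are hypothetical and are not taken from the ITS_LIVE dataset; the tutorial notebooks may compute mid-dates differently.

```python
from datetime import datetime, timedelta


def mid_date(first_acquisition: datetime, second_acquisition: datetime) -> datetime:
    """Return the point in time halfway between the two images of a pair."""
    return first_acquisition + (second_acquisition - first_acquisition) / 2


def separation_days(first_acquisition: datetime, second_acquisition: datetime) -> float:
    """Return the time between the two images of a pair, in days."""
    return (second_acquisition - first_acquisition) / timedelta(days=1)


# Hypothetical image pairs (made-up dates, for illustration only).
pairs = [
    (datetime(2020, 1, 1), datetime(2020, 1, 13)),  # short separation
    (datetime(2019, 7, 1), datetime(2020, 7, 1)),   # long separation
]

for first, second in pairs:
    print(
        f"mid-date: {mid_date(first, second).date()}, "
        f"separation: {separation_days(first, second):.0f} days"
    )
```

The two hypothetical pairs have mid-dates about a week apart, yet one integrates motion over days and the other over a full year. Indexing by mid-date alone hides that difference, which is why the separation should be considered before aggregating or resampling.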
-
ITS_LIVE produces a number of data products in addition to the image pair time series that we use in this tutorial, and provides different options to access the data. Check them out [here](https://its-live.jpl.nasa.gov/#access).
+
ITS_LIVE produces a number of data products in addition to the image pair time series that we use in this tutorial and provides different options to access the data. Check them out [here](https://its-live.jpl.nasa.gov/#access).
**Documentation & References**:
Be sure to also check out the ITS_LIVE image pair velocities [documentation](http://its-live-data.jpl.nasa.gov.s3.amazonaws.com/documentation/ITS_LIVE-Landsat-Scene-Pair-Velocities-v01.pdf) and papers on the ITS_LIVE processing methodology:
@@ -44,15 +44,15 @@ Be sure to also check out the ITS_LIVE image pair velocities [documentation](htt
-
Part 2 focuses on Sentinel-1 Radiometric Terrain Corrected imagery. Sentinel-1 is a dataset of synthetic aperture radar (SAR) imagery collected from sensors located on satellites operated by the Sentinel satellites operated by the European Space Agency (ESA). SAR data is exciting because doesn't require solar illumination like passive optical systems and, at the wavelength where Sentinel-1 imagery is collected, it is minimally impacted by atmospheric water vapor, meaning that Sentinel-1 can acquire clear images of Earth's surface even during cloudy and nighttime conditions. SAR imagery has a wide range of scientific applications including monitoring land surface deformation related to seismic activities, tracking flooding extents following extreme weather events, and mapping deforestation and characterizing biomass.
+
Part 2 focuses on Sentinel-1 Radiometric Terrain Corrected imagery. Sentinel-1 is a dataset of synthetic aperture radar (SAR) imagery collected by sensors on the Sentinel-1 satellites operated by the European Space Agency (ESA). SAR data is exciting because it doesn't require solar illumination like passive optical systems and, at the wavelength where Sentinel-1 imagery is collected, it is minimally impacted by atmospheric water vapor, meaning that Sentinel-1 can acquire clear images of Earth's surface even during cloudy and nighttime conditions. SAR imagery has a wide range of scientific applications, including monitoring land surface deformation related to seismic activities, tracking flooding extents following extreme weather events, mapping deforestation, and characterizing biomass.
:::{tip}
For an in-depth example of how SAR backscatter data can be used to map flooding extent, check out this [notebook](https://projectpythia.org/eo-datascience-cookbook/notebooks/tutorials/floodmapping.html) in the [Project Pythia Earth Observation Data Science Cookbook](https://projectpythia.org/eo-datascience-cookbook/README.html).
:::
Because SAR imagery is collected from a side-looking sensor, it can contain distortions related to the viewing geometry of the sensor and the surface topography of the area being imaged. This tutorial focuses on RTC imagery, which is SAR data that has undergone processing to remove the above-mentioned distortions.
-
Multiple algorithms perform radiometric terrain correction, and it is important to understand the components of whichever dataset you use and their relative benefits and tradeoffs. This book will demonstrate working with two different (but similar) datasets of Sentinel-1 RTC imagery: one produced by Alaska Satellite Facility and one produced by Microsoft Planetary Computer, shown below. Processing of SAR imagery can be very computationally intensive, both of these options leverage cloud-hosted computational resources to make processed SAR imagery available to users, reducing the need for individual users to perform complicated, resource and time-intensive processing.
+
Multiple algorithms perform radiometric terrain correction, and it is important to understand the components of whichever dataset you use and their relative benefits and tradeoffs. This book will demonstrate how to work with two different (but similar) datasets of Sentinel-1 RTC imagery: one produced by Alaska Satellite Facility and one produced by Microsoft Planetary Computer, shown below. Processing of SAR imagery can be very computationally intensive; both options leverage cloud-hosted computational resources to make processed SAR imagery available to users, reducing the need for individual users to perform complicated, resource and time-intensive processing.