Commit a9ab143

Merge pull request #225 from tylere/simplify_proj_params
Simplify proj params
2 parents b1c6f13 + c215fac commit a9ab143

7 files changed, +818 -412 lines changed

.gitignore

Lines changed: 1 addition & 1 deletion

@@ -130,4 +130,4 @@ cython_debug/
 .DS_Store

 # pixi environments
-.pixi
+.pixi

README.md

Lines changed: 164 additions & 41 deletions

@@ -4,6 +4,8 @@

 _An Xarray extension for Google Earth Engine._

+Xee bridges the gap between Google Earth Engine's massive data catalog and the scientific Python ecosystem. It provides a custom Xarray backend that allows you to open any `ee.ImageCollection` as if it were a local `xarray.Dataset`. Data is loaded lazily and in parallel, enabling you to work with petabyte-scale archives of satellite and climate data using the power and flexibility of Xarray and its integrations with libraries like Dask.
+
 [![image](https://img.shields.io/pypi/v/xee.svg)](https://pypi.python.org/pypi/xee)
 [![image](https://static.pepy.tech/badge/xee)](https://pepy.tech/project/xee)
 [![Conda
@@ -32,85 +34,206 @@ Then, authenticate Earth Engine:
 earthengine authenticate --quiet
 ```

-Now, in your Python environment, make the following imports:
+Now, in your Python environment, make the following imports and initialize the Earth Engine client with your project ID. Using the high-volume API endpoint is recommended.

 ```python
 import ee
-import xarray
+import xarray as xr
+from xee import helpers
+import shapely
+
+ee.Initialize(
+    project='PROJECT-ID', # Replace with your project ID
+    opt_url='https://earthengine-highvolume.googleapis.com'
+)
 ```

-Next, specify your EE-registered cloud project ID and initialize the EE client
-with the high volume API:
+### Specifying the Output Grid
+
+To open a dataset, you must specify the desired output pixel grid. The `xee.helpers` module simplifies this process by providing several convenient workflows, summarized below.
+
+| Goal | Method | When to Use |
+| :--- | :--- | :--- |
+| **Match Source Grid** | Use `helpers.extract_grid_params()` to get the parameters from an EE object. | When you want the data in its original, default projection and scale. |
+| **Fit Area to a Shape** | Use `helpers.fit_geometry()` with the `geometry` and `grid_shape` arguments. | When you need a consistent output array size (e.g., for ML models) and the exact pixel size is less important. |
+| **Fit Area to a Scale** | Use `helpers.fit_geometry()` with the `geometry` and `grid_scale` arguments. | When the specific resolution (e.g., 30 meters, 0.01 degrees) is critical for your analysis. |
+| **Manual Override** | Pass `crs`, `crs_transform`, and `shape_2d` directly to `xr.open_dataset`. | For advanced cases where you already have an exact grid definition. |
+
+> **Important Note on Units:** All grid parameter values must be in the units of the specified Coordinate Reference System (`crs`).
+> * For a geographic CRS like `'EPSG:4326'`, the units are in **degrees**.
+> * For a projected CRS like `'EPSG:32610'` (UTM), the units are in **meters**.
+> This applies to the translation values in `crs_transform` and the pixel sizes in `grid_scale`.
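
The units note can be made concrete with a minimal sketch of how the six `crs_transform` values turn pixel indices into coordinates. It assumes the affine ordering `(x_scale, x_shear, x_translate, y_shear, y_scale, y_translate)`, which is consistent with the README treating elements 0 and 4 as the x- and y-scales. The helper `pixel_to_crs` and the meter-based transform are hypothetical and are not part of Xee.

```python
def pixel_to_crs(transform, col, row):
    """Map the corner of pixel (col, row) to coordinates in CRS units.

    Assumes the ordering
    (x_scale, x_shear, x_translate, y_shear, y_scale, y_translate).
    """
    x_scale, x_shear, x_translate, y_shear, y_scale, y_translate = transform
    x = x_scale * col + x_shear * row + x_translate
    y = y_shear * col + y_scale * row + y_translate
    return x, y


# Geographic CRS such as EPSG:4326: every value is in degrees.
degrees_transform = (0.1, 0, -180.05, 0, -0.1, 90.05)

# Projected CRS such as a UTM zone: every value is in meters.
# The origin below is made up purely for illustration.
meters_transform = (30, 0, 500000, 0, -30, 4200000)

origin_deg = pixel_to_crs(degrees_transform, 0, 0)  # grid origin, in degrees
offset_m = pixel_to_crs(meters_transform, 10, 10)   # 10 pixels in, in meters
```

The same pixel offset therefore means 0.1 degrees per step in the first grid and 30 meters per step in the second, which is why scale and translation values must always be read in the units of the chosen `crs`.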
+
+### Usage Examples
+
+Here are common workflows for opening datasets with `xee`, corresponding to the methods in the table above.
+
+#### Match Source Grid
+
+This is the simplest case, using `helpers.extract_grid_params` to match the dataset's default grid.

 ```python
-ee.Initialize(
-    project='my-project-id'
-    opt_url='https://earthengine-highvolume.googleapis.com')
+ic = ee.ImageCollection('ECMWF/ERA5_LAND/MONTHLY_AGGR')
+grid_params = helpers.extract_grid_params(ic)
+ds = xr.open_dataset(ic, engine='ee', **grid_params)
 ```

-Open any Earth Engine ImageCollection by specifying the Xarray engine as `'ee'`:
+#### Fit Area to a Shape
+
+Define a grid over an area of interest by specifying the number of pixels. `helpers.fit_geometry` will calculate the correct `crs_transform`.

 ```python
-ds = xarray.open_dataset('ee://ECMWF/ERA5_LAND/HOURLY', engine='ee')
+aoi = shapely.geometry.box(113.33, -43.63, 153.56, -10.66) # Australia
+grid_params = helpers.fit_geometry(
+    geometry=aoi,
+    grid_crs='EPSG:4326',
+    grid_shape=(256, 256)
+)
+
+ds = xr.open_dataset('ee://ECMWF/ERA5_LAND/MONTHLY_AGGR', engine='ee', **grid_params)
 ```
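
To show roughly what "calculating the `crs_transform`" from a pixel count involves, here is a sketch that stretches a fixed number of pixels over a bounding box. It is not Xee's `fit_geometry` implementation: `transform_from_bounds` is a hypothetical helper, the shape is assumed to be ordered `(width, height)`, and the negative y-scale follows the README's statement that `grid_shape` assumes a standard negative y-scale.

```python
def transform_from_bounds(bounds, shape):
    """Build an affine 6-tuple that spreads `shape` pixels across `bounds`.

    `bounds` is (min_x, min_y, max_x, max_y) in CRS units and `shape` is
    assumed to be (width, height) in pixels.
    """
    min_x, min_y, max_x, max_y = bounds
    width, height = shape
    x_scale = (max_x - min_x) / width
    # Negative y-scale: row 0 sits on the top edge and rows run southward.
    y_scale = -(max_y - min_y) / height
    return (x_scale, 0.0, min_x, 0.0, y_scale, max_y)


# Same bounding box as the Australia example, in degrees.
australia_bounds = (113.33, -43.63, 153.56, -10.66)
transform = transform_from_bounds(australia_bounds, (256, 256))
```

Because the pixel count is fixed, the pixel size is whatever the box dimensions divided by that count turn out to be, so x and y pixel sizes generally differ for a non-square area.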

-Open all bands in a specific projection (not the Xee default):
+#### Fit Area to a Scale (Resolution)
+
+> **A Note on `grid_scale` and Y-Scale Orientation**
+> When using `fit_geometry` with `grid_scale`, you are defining both the pixel size and the grid's orientation via the sign of the y-scale.
+> * A **negative `y_scale`** (e.g., `(10000, -10000)`) is the standard for "north-up" satellite and aerial imagery, creating a grid with a **top-left** origin.
+> * A **positive `y_scale`** (e.g., `(10000, 10000)`) is used by some datasets and creates a grid with a **bottom-left** origin.
+> You may need to inspect your source dataset's projection information to determine the correct sign to use. If you use `grid_shape`, a standard negative y-scale is assumed.
+
+The following example defines a grid over an area by specifying the pixel size in meters. `fit_geometry` will reproject the geometry and calculate the correct `shape_2d`.

 ```python
-ds = xarray.open_dataset('ee://ECMWF/ERA5_LAND/HOURLY', engine='ee',
-    crs='EPSG:4326', scale=0.25)
+aoi = shapely.geometry.box(113.33, -43.63, 153.56, -10.66) # Australia
+grid_params = helpers.fit_geometry(
+    geometry=aoi,
+    geometry_crs='EPSG:4326', # CRS of the input geometry
+    grid_crs='EPSG:32662', # Target CRS in meters (Plate Carrée)
+    grid_scale=(10000, -10000) # Define a 10km pixel size
+)
+
+ds = xr.open_dataset('ee://ECMWF/ERA5_LAND/MONTHLY_AGGR', engine='ee', **grid_params)
 ```
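
The effect of the y-scale sign on the grid origin, and the way a fixed pixel size determines the pixel count, can be sketched as follows. This is an illustration under stated assumptions and not the logic inside `fit_geometry`: the helper `grid_from_scale` is hypothetical, the bounds are made-up values already expressed in meters (the reprojection step is skipped), and rounding the pixel count up is an assumption of this sketch.

```python
import math


def grid_from_scale(bounds, scale):
    """Return (transform, (width, height)) covering `bounds` at `scale`.

    `bounds` is (min_x, min_y, max_x, max_y) and `scale` is
    (x_scale, y_scale), all in the units of the grid CRS.
    """
    min_x, min_y, max_x, max_y = bounds
    x_scale, y_scale = scale
    # Round up so the grid fully covers the bounds.
    width = math.ceil((max_x - min_x) / abs(x_scale))
    height = math.ceil((max_y - min_y) / abs(y_scale))
    # Negative y-scale: rows count down from the top edge (top-left origin).
    # Positive y-scale: rows count up from the bottom edge (bottom-left origin).
    y_origin = max_y if y_scale < 0 else min_y
    transform = (x_scale, 0.0, min_x, 0.0, y_scale, y_origin)
    return transform, (width, height)


# Made-up projected bounds in meters: 95 km wide and 42 km tall.
bounds_m = (0.0, 0.0, 95000.0, 42000.0)
north_up, shape = grid_from_scale(bounds_m, (10000, -10000))
south_up, _ = grid_from_scale(bounds_m, (10000, 10000))
```

Both calls give the same pixel count; only the y-translation and the direction in which rows advance change, which is the difference the note above describes.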

-Open an ImageCollection (maybe, with EE-side filtering or processing):
+#### Open a Custom Region at Source Resolution
+
+This workflow is ideal for analyzing a specific area while maintaining the dataset's original resolution.

 ```python
-ic = ee.ImageCollection('ECMWF/ERA5_LAND/HOURLY').filterDate(
-    '1992-10-05', '1993-03-31')
-ds = xarray.open_dataset(ic, engine='ee', crs='EPSG:4326', scale=0.25)
+# 1. Get the original grid parameters from the target ImageCollection
+ic = ee.ImageCollection('ECMWF/ERA5_LAND/MONTHLY_AGGR')
+source_params = helpers.extract_grid_params(ic)
+
+# 2. Extract the source CRS and scale
+source_crs = source_params['crs']
+source_transform = source_params['crs_transform']
+source_scale = (source_transform[0], source_transform[4]) # (x_scale, y_scale)
+
+# 3. Use the source parameters to fit the grid to a specific geometry
+aoi = shapely.geometry.box(113.33, -43.63, 153.56, -10.66) # Australia
+final_grid_params = helpers.fit_geometry(
+    geometry=aoi,
+    geometry_crs='EPSG:4326',
+    grid_crs=source_crs,
+    grid_scale=source_scale
+)
+
+# 4. Open the dataset with the final, combined parameters
+ds = xr.open_dataset(ic, engine='ee', **final_grid_params)
 ```

-Open an ImageCollection with a specific EE projection or geometry:
+#### Manual Override
+
+For use cases where you know the exact grid parameters, you can provide them directly.

 ```python
-ic = ee.ImageCollection('ECMWF/ERA5_LAND/HOURLY').filterDate(
-    '1992-10-05', '1993-03-31')
-leg1 = ee.Geometry.Rectangle(113.33, -43.63, 153.56, -10.66)
-ds = xarray.open_dataset(
-    ic,
+# Manually define a 512x512 pixel grid with 0.1-degree pixels in EPSG:4326
+manual_crs = 'EPSG:4326'
+manual_transform = (0.1, 0, -180.05, 0, -0.1, 90.05) # Values are in degrees
+manual_shape = (512, 512)
+
+ds = xr.open_dataset(
+    'ee://ECMWF/ERA5_LAND/MONTHLY_AGGR',
     engine='ee',
-    projection=ic.first().select(0).projection(),
-    geometry=leg1
+    crs=manual_crs,
+    crs_transform=manual_transform,
+    shape_2d=manual_shape,
 )
 ```

-Open multiple ImageCollections into one `xarray.Dataset`, all with the same
-projection:
+#### Open a Pre-Processed ImageCollection
+
+A key feature of Xee is its ability to open a computed `ee.ImageCollection`. This allows you to leverage Earth Engine's powerful server-side processing for tasks like filtering, band selection, and calculations before loading the data into Xarray.

 ```python
-ds = xarray.open_mfdataset(
-    ['ee://ECMWF/ERA5_LAND/HOURLY', 'ee://NASA/GDDP-CMIP6'],
-    engine='ee', crs='EPSG:4326', scale=0.25)
+# Define an AOI as a shapely object for the helper function
+sf_aoi_shapely = shapely.geometry.Point(-122.4, 37.7).buffer(0.2)
+# Create an ee.Geometry from the shapely object for server-side filtering
+coords = list(sf_aoi_shapely.exterior.coords)
+sf_aoi_ee = ee.Geometry.Polygon(coords)
+
+# Define a function to calculate NDVI and add it as a band
+def add_ndvi(image):
+    # Landsat 9 SR bands: NIR = B5, Red = B4
+    ndvi = image.normalizedDifference(['SR_B5', 'SR_B4']).rename('NDVI')
+    return image.addBands(ndvi)
+
+# Build the pre-processed collection
+processed_collection = (ee.ImageCollection('LANDSAT/LC09/C02/T1_L2')
+    .filterDate('2024-06-01', '2024-09-01')
+    .filterBounds(sf_aoi_ee)
+    .map(add_ndvi)
+    .select(['NDVI']))
+
+# Define the output grid using a helper
+grid_params = helpers.fit_geometry(
+    geometry=sf_aoi_shapely,
+    grid_crs='EPSG:32610', # Target CRS in meters (UTM Zone 10N)
+    grid_scale=(30, -30) # Use Landsat's 30m resolution
+)
+
+# Open the fully processed collection
+ds = xr.open_dataset(processed_collection, engine='ee', **grid_params)
 ```
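
For readers unfamiliar with the NDVI step, the per-pixel arithmetic behind `normalizedDifference(['SR_B5', 'SR_B4'])` is `(NIR - Red) / (NIR + Red)`. The plain-Python sketch below shows that formula on a single pair of values; the function name and the sample reflectances are hypothetical, and returning `None` for a zero denominator is a choice made for this sketch, not a description of Earth Engine's masking behavior.

```python
def normalized_difference(nir, red):
    """Compute (nir - red) / (nir + red) for one pixel."""
    total = nir + red
    if total == 0:
        # Avoid dividing by zero; this sketch returns None in that case.
        return None
    return (nir - red) / total


# Made-up reflectance values for a vegetated pixel.
ndvi = normalized_difference(0.75, 0.25)
empty = normalized_difference(0.0, 0.0)
```

Values near 1 indicate a strong near-infrared response relative to red, which is typical of dense green vegetation; values near 0 or below indicate bare ground, water, or cloud.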

-Open a single Image by passing it to an ImageCollection:
+#### Open a single Image
+
+The `helpers` work the same way for a single `ee.Image`.

 ```python
-i = ee.ImageCollection(ee.Image('LANDSAT/LC08/C02/T1_TOA/LC08_044034_20140318'))
-ds = xarray.open_dataset(i, engine='ee')
+img = ee.Image('ECMWF/ERA5_LAND/MONTHLY_AGGR/202501')
+grid_params = helpers.extract_grid_params(img)
+ds = xr.open_dataset(img, engine='ee', **grid_params)
+```
+
+#### Visualize a Single Time Slice
+
+Once you have your `xarray.Dataset`, you can visualize a single time slice of a variable to verify the results. This requires the `matplotlib` library, which is an optional dependency.
+
+If you don't have it installed, you can add it with pip:
+
+```shell
+pip install matplotlib
 ```

-Open any Earth Engine ImageCollection to match an existing transform:
+Xarray's plotting functions expect dimensions in `(y, x)` order for 2D plots. Since the data is in `(x, y)` order, we use `.transpose()` to swap the axes for correct visualization.

 ```python
-raster = rioxarray.open_rasterio(...) # assume crs + transform is set
-ds = xr.open_dataset(
-    'ee://ECMWF/ERA5_LAND/HOURLY',
-    engine='ee',
-    geometry=tuple(raster.rio.bounds()), # must be in EPSG:4326
-    projection=ee.Projection(
-        crs=str(raster.rio.crs), transform=raster.rio.transform()[:6]
-    ),
+
+# First, open a dataset using one of the methods above
+aoi = shapely.geometry.box(113.33, -43.63, 153.56, -10.66) # Australia
+grid_params = helpers.fit_geometry(
+    geometry=aoi,
+    grid_crs='EPSG:4326',
+    grid_shape=(256, 256)
 )
+ds = xr.open_dataset('ECMWF/ERA5_LAND/MONTHLY_AGGR', engine='ee', **grid_params)
+
+# Select the 2m air temperature for the first time step
+temp_slice = ds['temperature_2m'].isel(time=0)
+
+# Transpose from (x, y) to (y, x) for correct plotting orientation and plot
+temp_slice.transpose('y', 'x').plot()
 ```

 See [examples](https://github.com/google/Xee/tree/main/examples) or

pyproject.toml

Lines changed: 5 additions & 1 deletion

@@ -3,7 +3,7 @@ name = "xee"
 dynamic = ["version"]
 description = "A Google Earth Engine extension for Xarray."
 readme = "README.md"
-requires-python = ">=3.8"
+requires-python = ">=3.8,<3.13"
 license = {text = "Apache-2.0"}
 authors = [
   {name = "Google LLC", email = "noreply@google.com"},
@@ -28,6 +28,7 @@ dependencies = [
   "earthengine-api>=0.1.374",
   "pyproj",
   "affine",
+  "shapely",
 ]

 [project.entry-points."xarray.backends"]
@@ -65,5 +66,8 @@ preview = true
 pyink-indentation = 2
 pyink-use-majority-quotes = true

+[tool.setuptools]
+packages = ["xee"]
+
 [tool.setuptools_scm]
 fallback_version = "9999"
