Commit ca7e907

Merge branch 'main' into aws-check
2 parents aaf884d + 8b00ea3 commit ca7e907

23 files changed: +1492 -1301 lines changed

.github/workflows/integration-test.yml

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ jobs:
     steps:
       - uses: actions/checkout@v4
       - name: Set up Python
-        uses: actions/setup-python@v4
+        uses: actions/setup-python@v5
         with:
           python-version: ${{ matrix.python-version }}
       - name: Get full python version

.github/workflows/static-analysis.yml

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ jobs:
       - uses: actions/checkout@v4

       - name: Install Python
-        uses: actions/setup-python@v4
+        uses: actions/setup-python@v5
         with:
           python-version: 3.x

.github/workflows/test-mindeps.yml

Lines changed: 2 additions & 2 deletions
@@ -26,7 +26,7 @@ jobs:
       - name: Checkout source
         uses: actions/[email protected]
       - name: Setup Conda Environment
-        uses: conda-incubator/setup-miniconda@v2.2.0
+        uses: conda-incubator/setup-miniconda@v3.0.1
         with:
           miniforge-variant: Mambaforge
           miniforge-version: latest
@@ -46,4 +46,4 @@ jobs:
         run: bash scripts/test.sh

       - name: Upload coverage
-        uses: codecov/codecov-action@v1
+        uses: codecov/codecov-action@v3

.github/workflows/test.yml

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ jobs:
     steps:
       - uses: actions/checkout@v4
       - name: Set up Python
-        uses: actions/setup-python@v4
+        uses: actions/setup-python@v5
         with:
           python-version: ${{ matrix.python-version }}
       - name: Get full python version

CHANGELOG.md

Lines changed: 9 additions & 1 deletion
@@ -1,5 +1,13 @@
 # Changelog

+## [Unreleased]
+
+* Bug fixes:
+    * fixed #439 by implementing more trusted domains in the SessionWithRedirection
+    * fixed #438 by using an authenticated session for hits()
+* Enhancements:
+    * addressing #427 by adding parameters to collection query
+
 ## [v0.8.2] 2023-12-06
 * Bug fixes:
     * Enable AWS check with IMDSv2
@@ -167,7 +175,7 @@
 - Add basic classes to interact with NASA CMR, EDL and cloud access.
 - Basic object formatting.

-[Unreleased]: https://github.com/nsidc/earthaccess/compare/v0.5.2...HEAD
+[Unreleased]: https://github.com/nsidc/earthaccess/compare/v0.8.2...HEAD
 [v0.5.2]: https://github.com/nsidc/earthaccess/releases/tag/v0.5.2
 [v0.5.1]: https://github.com/nsidc/earthaccess/releases/tag/v0.5.1
 [v0.5.0]: https://github.com/nsidc/earthaccess/releases/tag/v0.4.0

README.md

Lines changed: 1 addition & 3 deletions
@@ -65,8 +65,6 @@ With *earthaccess* we can login, search and download data with a few lines of co

 The only requirement to use this library is to open a free account with NASA [EDL](https://urs.earthdata.nasa.gov).

-<a href="https://urs.earthdata.nasa.gov"><img src="https://auth.ops.maap-project.org/cas/images/urs-logo.png" /></a>
-

 ### **Authentication**

@@ -99,7 +97,7 @@ Once you are authenticated with NASA EDL you can:
 ### **Searching for data**

 Once we have selected our dataset we can search for the data granules using *doi*, *short_name* or *concept_id*.
-If we are not sure or we don't know how to search for a particular dataset, we can start with the ["Introducing NASA earthaccess"](https://nsidc.github.io/earthaccess/tutorials/demo/#querying-for-datasets) tutorial or through the [NASA Earthdata Search portal](https://search.earthdata.nasa.gov/). For a complete list of search parameters we can use visit the extended [API documentation](https://nsidc.github.io/earthaccess/user-reference/api/api/).
+If we are not sure or we don't know how to search for a particular dataset, we can start with the ["Introducing NASA earthaccess"](https://nsidc.github.io/earthaccess/tutorials/demo/#querying-for-datasets) tutorial or through the [NASA Earthdata Search portal](https://search.earthdata.nasa.gov/). For a complete list of search parameters we can use visit the extended [API documentation](https://earthaccess.readthedocs.io/en/latest/user-reference/api/api/).

 ```python

binder/environment-dev.yml

Lines changed: 11 additions & 1 deletion
@@ -4,12 +4,22 @@ channels:
 dependencies:
   # This environment bootstraps poetry, the actual dev environment
   # is installed and managed with poetry
-  - python=3.9
+  - python=3.10
   - jupyterlab=3
   - xarray>=0.19
   - ipyleaflet>=0.13
   - h5netcdf>=0.11
   - cartopy
+
+  - mkdocs>=1.2
+  - mkdocs-material>=7.1,<9.0
+  - markdown-include>=0.6
+  - mkdocstrings>=0.19.0
+  - mkdocstrings-python
+  - mkdocs-jupyter>=0.19.0
+  - pymdown-extensions>=9.2
+
   - pip
   - pip:
       - poetry
+      - markdown-callouts>=0.2.0

earthaccess/api.py

Lines changed: 68 additions & 77 deletions
@@ -13,8 +13,8 @@
 from .utils import _validation as validate


-def _normalize_location(location: Union[str, None]) -> Union[str, None]:
-    """Handle user-provided `daac` and `provider` values
+def _normalize_location(location: Optional[str]) -> Optional[str]:
+    """Handle user-provided `daac` and `provider` values.

     These values must have a capital letter as the first character
     followed by capital letters, numbers, or an underscore. Here we
@@ -31,32 +31,29 @@ def _normalize_location(location: Union[str, None]) -> Union[str, None]:
 def search_datasets(
     count: int = -1, **kwargs: Any
 ) -> List[earthaccess.results.DataCollection]:
-    """Search datasets using NASA's CMR
+    """Search datasets using NASA's CMR.

     [https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html](https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html)

     Parameters:
+        count: Number of records to get, -1 = all
+        kwargs (Dict):
+            arguments to CMR:

-        count (Integer): Number of records to get, -1 = all
-        kwargs (Dict): arguments to CMR:
-
-            * **keyword**: case insensitive and support wild cards ? and *,
-
+            * **keyword**: case-insensitive and supports wildcards ? and *
             * **short_name**: e.g. ATL08
-
             * **doi**: DOI for a dataset
-
             * **daac**: e.g. NSIDC or PODAAC
-
             * **provider**: particular to each DAAC, e.g. POCLOUD, LPDAAC etc.
+            * **temporal**: a tuple representing temporal bounds in the form
+                `("yyyy-mm-dd", "yyyy-mm-dd")`
+            * **bounding_box**: a tuple representing spatial bounds in the form
+                `(lower_left_lon, lower_left_lat, upper_right_lon, upper_right_lat)`

-            * **temporal**: ("yyyy-mm-dd", "yyyy-mm-dd")
-
-            * **bounding_box**: (lower_left_lon, lower_left_lat ,
-                upper_right_lon, upper_right_lat)
     Returns:
-        an list of DataCollection results that can be used to get
-        information such as concept_id, doi, etc. about a dataset.
+        A list of DataCollection results that can be used to get information about a
+        dataset, e.g. concept_id, doi, etc.
+
     Examples:
         ```python
         datasets = earthaccess.search_datasets(
@@ -89,27 +86,24 @@ def search_data(
     [https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html](https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html)

     Parameters:
+        count: Number of records to get, -1 = all
+        kwargs (Dict):
+            arguments to CMR:

-        count (Integer): Number of records to get, -1 = all
-        kwargs (Dict): arguments to CMR:
-
-            * **short_name**: dataset short name e.g. ATL08
-
+            * **short_name**: dataset short name, e.g. ATL08
             * **version**: dataset version
-
             * **doi**: DOI for a dataset
-
             * **daac**: e.g. NSIDC or PODAAC
-
             * **provider**: particular to each DAAC, e.g. POCLOUD, LPDAAC etc.
+            * **temporal**: a tuple representing temporal bounds in the form
+                `("yyyy-mm-dd", "yyyy-mm-dd")`
+            * **bounding_box**: a tuple representing spatial bounds in the form
+                `(lower_left_lon, lower_left_lat, upper_right_lon, upper_right_lat)`

-            * **temporal**: ("yyyy-mm-dd", "yyyy-mm-dd")
-
-            * **bounding_box**: (lower_left_lon, lower_left_lat ,
-                upper_right_lon, upper_right_lat)
     Returns:
-        Granules: a list of DataGranules that can be used to access
-        the granule files by using `download()` or `open()`.
+        a list of DataGranules that can be used to access the granule files by using
+        `download()` or `open()`.
+
     Examples:
         ```python
         datasets = earthaccess.search_data(
@@ -131,22 +125,20 @@ def search_data(


 def login(strategy: str = "all", persist: bool = False) -> Auth:
-    """Authenticate with Earthdata login (https://urs.earthdata.nasa.gov/)
+    """Authenticate with Earthdata login (https://urs.earthdata.nasa.gov/).

     Parameters:
+        strategy:
+            An authentication method.

-        strategy (String): authentication method.
-
-            "all": (default) try all methods until one works
+            * **"all"**: (default) try all methods until one works
+            * **"interactive"**: enter username and password.
+            * **"netrc"**: retrieve username and password from ~/.netrc.
+            * **"environment"**: retrieve username and password from `$EARTHDATA_USERNAME` and `$EARTHDATA_PASSWORD`.
+        persist: will persist credentials in a .netrc file

-            "interactive": enter username and password.
-
-            "netrc": retrieve username and password from ~/.netrc.
-
-            "environment": retrieve username and password from $EARTHDATA_USERNAME and $EARTHDATA_PASSWORD.
-        persist (Boolean): will persist credentials in a .netrc file
     Returns:
-        an instance of Auth.
+        An instance of Auth.
     """
     if strategy == "all":
         for strategy in ["environment", "netrc", "interactive"]:
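
The fallback order visible at the end of the hunk above (`environment`, then `netrc`, then `interactive`) can be illustrated with a small standalone sketch. This is not the library's implementation: `from_environment` and `first_working_strategy` are hypothetical helpers, and only the two environment variable names come from the docstring.

```python
import os
from typing import Callable, Dict, Mapping, Optional, Sequence, Tuple

Credentials = Tuple[str, str]


def from_environment(env: Mapping[str, str] = os.environ) -> Optional[Credentials]:
    """Return (username, password) from the environment, or None if either is unset."""
    username = env.get("EARTHDATA_USERNAME")
    password = env.get("EARTHDATA_PASSWORD")
    if username and password:
        return username, password
    return None


def first_working_strategy(
    strategies: Dict[str, Callable[[], Optional[Credentials]]],
    order: Sequence[str] = ("environment", "netrc", "interactive"),
) -> Optional[Tuple[str, Credentials]]:
    """Try each named strategy in order; return the first one that yields credentials."""
    for name in order:
        handler = strategies.get(name)
        if handler is None:
            # Hypothetical behaviour: strategies without a handler are skipped.
            continue
        credentials = handler()
        if credentials is not None:
            return name, credentials
    return None
```

Passing the handlers in as a dictionary keeps the ordering logic separate from how each method obtains credentials, so the sketch can be exercised without a real `~/.netrc` or an interactive prompt.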
@@ -168,19 +160,20 @@ def login(strategy: str = "all", persist: bool = False) -> Auth:

 def download(
     granules: Union[DataGranule, List[DataGranule], str, List[str]],
-    local_path: Union[str, None],
+    local_path: Optional[str],
     provider: Optional[str] = None,
     threads: int = 8,
 ) -> List[str]:
     """Retrieves data granules from a remote storage system.

-    * If we run this in the cloud, we will be using S3 to move data to `local_path`
-    * If we run it outside AWS (us-west-2 region) and the dataset is cloud hostes we'll use HTTP links
+    * If we run this in the cloud, we will be using S3 to move data to `local_path`.
+    * If we run it outside AWS (us-west-2 region) and the dataset is cloud hosted,
+      we'll use HTTP links.

     Parameters:
         granules: a granule, list of granules, a granule link (HTTP), or a list of granule links (HTTP)
         local_path: local directory to store the remote data granules
-        provider: if we download a list of URLs we need to specify the provider.
+        provider: if we download a list of URLs, we need to specify the provider.
         threads: parallel number of threads to use to download the files, adjust as necessary, default = 8

     Returns:
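
The `threads` parameter documented above sets how many downloads run in parallel. As a rough illustration of that idea only, and not of how `download()` is implemented, the sketch below fans a list of URLs out over a thread pool. `download_all` and the `fetch` callable are hypothetical names introduced here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def download_all(
    urls: Sequence[str],
    fetch: Callable[[str], str],
    threads: int = 8,
) -> List[str]:
    """Call `fetch` on every URL using a thread pool; results keep the input order."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Executor.map yields results in the order of the input sequence,
        # regardless of which download finishes first.
        return list(pool.map(fetch, urls))
```

Here `fetch` stands in for whatever retrieves one file and returns its local path, which keeps the sketch testable without network access.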
@@ -208,8 +201,10 @@ def open(
     hosted on S3 or HTTPS by third party libraries like xarray.

     Parameters:
-        granules: a list of granule instances **or** list of URLs, e.g. s3://some-granule,
-        if a list of URLs is passed we need to specify the data provider e.g. POCLOUD, NSIDC_CPRD etc.
+        granules: a list of granule instances **or** list of URLs, e.g. `s3://some-granule`.
+            If a list of URLs is passed, we need to specify the data provider.
+        provider: e.g. POCLOUD, NSIDC_CPRD, etc.
+
     Returns:
         a list of s3fs "file pointers" to s3 files.
     """
@@ -223,15 +218,16 @@ def get_s3_credentials(
     provider: Optional[str] = None,
     results: Optional[List[earthaccess.results.DataGranule]] = None,
 ) -> Dict[str, Any]:
-    """Returns temporary (1 hour) credentials for direct access to NASA S3 buckets, we can
-    use the daac name, the provider or a list of results from earthaccess.search_data()
-    if we use results earthaccess will use the metadata on the response to get the credentials,
-    this is useful for missions that do not use the same endpoint as their DAACs e.g. SWOT
+    """Returns temporary (1 hour) credentials for direct access to NASA S3 buckets. We can
+    use the daac name, the provider, or a list of results from earthaccess.search_data().
+    If we use results, earthaccess will use the metadata on the response to get the credentials,
+    which is useful for missions that do not use the same endpoint as their DAACs, e.g. SWOT.

     Parameters:
-        daac (String): a DAAC short_name like NSIDC or PODAAC etc
-        provider (String: if we know the provider for the DAAC e.g. POCLOUD, LPCLOUD etc.
-        results (list[earthaccess.results.DataGranule]): List of results from search_data()
+        daac: a DAAC short_name like NSIDC or PODAAC, etc.
+        provider: if we know the provider for the DAAC, e.g. POCLOUD, LPCLOUD etc.
+        results: List of results from search_data()
+
     Returns:
         a dictionary with S3 credentials for the DAAC or provider
     """
@@ -244,12 +240,10 @@ def get_s3_credentials(


 def collection_query() -> Type[CollectionQuery]:
-    """Returns a query builder instance for NASA collections (datasets)
+    """Returns a query builder instance for NASA collections (datasets).

-    Parameters:
-        cloud_hosted (Boolean): initializes the query builder for cloud hosted collections.
     Returns:
-        class earthaccess.DataCollections: a query builder instance for data collections.
+        a query builder instance for data collections.
     """
     if earthaccess.__auth__.authenticated:
         query_builder = DataCollections(earthaccess.__auth__)
@@ -261,11 +255,8 @@ class earthaccess.DataCollections: a query builder instance for data collections
 def granule_query() -> Type[GranuleQuery]:
     """Returns a query builder instance for data granules

-    Parameters:
-        cloud_hosted (Boolean): initializes the query builder for a particular DOI
-        if we have it.
     Returns:
-        class earthaccess.DataGranules: a query builder instance for data granules.
+        a query builder instance for data granules.
     """
     if earthaccess.__auth__.authenticated:
         query_builder = DataGranules(earthaccess.__auth__)
@@ -275,10 +266,10 @@ class earthaccess.DataGranules: a query builder instance for data granules.


 def get_fsspec_https_session() -> AbstractFileSystem:
-    """Returns a fsspec session that can be used to access datafiles across many different DAACs
+    """Returns a fsspec session that can be used to access datafiles across many different DAACs.

     Returns:
-        class AbstractFileSystem: an fsspec instance able to access data across DAACs
+        An fsspec instance able to access data across DAACs.

     Examples:
         ```python
@@ -289,19 +280,18 @@ class AbstractFileSystem: an fsspec instance able to access data across DAACs
         with fs.open(DAAC_GRANULE) as f:
             f.read(10)
         ```
-
     """
     session = earthaccess.__store__.get_fsspec_session()
     return session


 def get_requests_https_session() -> requests.Session:
-    """Returns a requests Session instance with an authorized bearer token
-    this is useful to make requests to restricted URLs like data granules or services that
+    """Returns a requests Session instance with an authorized bearer token.
+    This is useful for making requests to restricted URLs, such as data granules or services that
     require authentication with NASA EDL.

     Returns:
-        class requests.Session: an authenticated requests Session instance.
+        An authenticated requests Session instance.

     Examples:
         ```python
@@ -323,15 +313,17 @@ def get_s3fs_session(
     provider: Optional[str] = None,
     results: Optional[earthaccess.results.DataGranule] = None,
 ) -> s3fs.S3FileSystem:
-    """Returns a fsspec s3fs file session for direct access when we are in us-west-2
+    """Returns a fsspec s3fs file session for direct access when we are in us-west-2.

     Parameters:
-        daac (String): Any DAAC short name e.g. NSIDC, GES_DISC
-        provider (String): Each DAAC can have a cloud provider, if the DAAC is specified, there is no need to use provider
-        results (list[class earthaccess.results.DataGranule]): A list of results from search_data(), earthaccess will use the metadata form CMR to obtain the S3 Endpoint
+        daac: Any DAAC short name e.g. NSIDC, GES_DISC
+        provider: Each DAAC can have a cloud provider.
+            If the DAAC is specified, there is no need to use provider.
+        results: A list of results from search_data().
+            `earthaccess` will use the metadata from CMR to obtain the S3 Endpoint.

     Returns:
-        class s3fs.S3FileSystem: an authenticated s3fs session valid for 1 hour
+        An authenticated s3fs session valid for 1 hour.
     """
     daac = _normalize_location(daac)
     provider = _normalize_location(provider)
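
The `_normalize_location` helper called above is described in its docstring as handling `daac` and `provider` values that must start with a capital letter followed by capital letters, numbers, or underscores. The diff cuts the docstring off before it says what the function does with other input, so the sketch below is an assumption: it upper-cases the value and rejects anything that still does not match. `normalize_location` (no leading underscore) is a hypothetical stand-in, not the library function.

```python
import re
from typing import Optional

# Pattern derived from the docstring: a capital letter followed by
# capital letters, digits, or underscores.
_LOCATION_PATTERN = re.compile(r"[A-Z][A-Z0-9_]*")


def normalize_location(location: Optional[str]) -> Optional[str]:
    """Upper-case a DAAC or provider name and check it against the pattern.

    Assumed behaviour for illustration: None passes through unchanged and
    values that cannot be normalized raise ValueError.
    """
    if location is None:
        return None
    normalized = location.strip().upper()
    if not _LOCATION_PATTERN.fullmatch(normalized):
        raise ValueError(f"invalid daac/provider value: {location!r}")
    return normalized
```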
@@ -345,11 +337,10 @@ class s3fs.S3FileSystem: an authenticated s3fs session valid for 1 hour


 def get_edl_token() -> str:
-    """Returns the current token used for EDL
+    """Returns the current token used for EDL.

     Returns:
-        str: EDL token
-
+        EDL token
     """
     token = earthaccess.__auth__.token
     return token
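
The revised `search_datasets` and `search_data` docstrings in this file spell out the tuple shapes for `temporal` and `bounding_box`. A caller who wants to catch malformed tuples before sending a query could use something like the sketch below. Both functions are hypothetical helpers written for this note; nothing in the diff suggests earthaccess performs these checks itself.

```python
from datetime import date
from typing import Tuple


def check_temporal(temporal: Tuple[str, str]) -> Tuple[date, date]:
    """Parse ("yyyy-mm-dd", "yyyy-mm-dd") and require start <= end."""
    start, end = (date.fromisoformat(value) for value in temporal)
    if start > end:
        raise ValueError("temporal start must not be after end")
    return start, end


def check_bounding_box(
    bounding_box: Tuple[float, float, float, float],
) -> Tuple[float, float, float, float]:
    """Check (lower_left_lon, lower_left_lat, upper_right_lon, upper_right_lat)."""
    lower_left_lon, lower_left_lat, upper_right_lon, upper_right_lat = bounding_box
    for lon in (lower_left_lon, upper_right_lon):
        if not -180 <= lon <= 180:
            raise ValueError(f"longitude out of range: {lon}")
    for lat in (lower_left_lat, upper_right_lat):
        if not -90 <= lat <= 90:
            raise ValueError(f"latitude out of range: {lat}")
    # Longitudes are deliberately not ordered: a box that crosses the
    # antimeridian has a lower-left longitude greater than its upper-right one.
    if lower_left_lat > upper_right_lat:
        raise ValueError("lower-left latitude must not exceed upper-right latitude")
    return (lower_left_lon, lower_left_lat, upper_right_lon, upper_right_lat)
```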
