@hrodmn
Collaborator

@hrodmn hrodmn commented Sep 26, 2024

Related Issue(s):

Description:
Collection discovery is a challenge in the current environment. A user might know which catalog or API they want data from, but they may not know the actual collection_id that they need to perform an item-level search. The STAC API Collection Search Extension makes it possible for a user to apply filters to collection-level metadata. This is most useful when the STAC API Free Text Extension is enabled, because a user can search an API with a term like q=DEM to find all collections that have the term DEM in the title, description, or keywords.

Since most APIs do not currently have the collection search extension enabled, I added some client-side filtering logic: the CollectionSearch class requests the full list of collections from the /collections endpoint, then applies a limited set of filters (datetime, bbox, q) to the list.
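
The client-side fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names, the sample records, and the plain substring match for q are all assumptions (the Free Text Extension defines richer matching than this).

```python
from datetime import datetime


def _parse(ts):
    """Parse an RFC 3339 timestamp; None stays None (open-ended)."""
    if ts is None:
        return None
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def bbox_intersects(a, b):
    """True if two 2D [west, south, east, north] boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def interval_intersects(a, b):
    """True if two (start, end) intervals overlap; None means unbounded."""
    a_start, a_end = a
    b_start, b_end = b
    starts_in_time = a_start is None or b_end is None or a_start <= b_end
    ends_in_time = a_end is None or b_start is None or a_end >= b_start
    return starts_in_time and ends_in_time


def matches_text(collection, q):
    """Case-insensitive substring match on title, description, keywords."""
    parts = [collection.get("title") or "", collection.get("description") or ""]
    parts += list(collection.get("keywords") or [])
    return q.lower() in " ".join(parts).lower()


def filter_collections(collections, bbox=None, interval=None, q=None):
    """Hypothetical helper: keep collections that pass every given filter."""
    matched = []
    for c in collections:
        extent = c.get("extent", {})
        if bbox is not None:
            boxes = extent.get("spatial", {}).get("bbox", [])
            if not any(bbox_intersects(bbox, b) for b in boxes):
                continue
        if interval is not None:
            wanted = (_parse(interval[0]), _parse(interval[1]))
            spans = extent.get("temporal", {}).get("interval", [])
            if not any(
                interval_intersects(wanted, (_parse(s), _parse(e)))
                for s, e in spans
            ):
                continue
        if q is not None and not matches_text(c, q):
            continue
        matched.append(c)
    return matched


# Made-up sample records shaped like STAC collections (illustration only).
COLLECTIONS = [
    {
        "id": "example-dem",
        "title": "Example DEM",
        "description": "Global elevation grid",
        "keywords": ["elevation"],
        "extent": {
            "spatial": {"bbox": [[-180, -90, 180, 90]]},
            "temporal": {
                "interval": [["2021-04-22T00:00:00Z", "2021-04-22T00:00:00Z"]]
            },
        },
    },
    {
        "id": "example-imagery",
        "title": "Example optical imagery",
        "description": "Multispectral surface reflectance",
        "keywords": ["optical"],
        "extent": {
            "spatial": {"bbox": [[-10, 35, 30, 60]]},
            "temporal": {"interval": [["2015-06-27T00:00:00Z", None]]},
        },
    },
]
```

Calling filter_collections(COLLECTIONS, q="dem") keeps only the record whose title mentions DEM. The obvious cost of this approach is that the full collection list has to be downloaded before any filter is applied.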

  • Refactor ItemSearch class to inherit from a new BaseSearch class so methods can be shared between ItemSearch and CollectionSearch classes
  • Add CollectionSearch class
  • Add Client.collection_search method
  • Add collection search functionality to cli.py
  • Add new tests
    • CollectionSearch
    • Client.collection_search
    • test_cli.py
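
As a rough picture of the refactor in the first bullet, shared behavior can live in a base class while each subclass only names its endpoint. The constructor signature, the endpoint attribute, and get_parameters below are hypothetical and deliberately simplified; they are not the actual pystac-client class definitions.

```python
class BaseSearch:
    """Hypothetical base class holding logic both search types share."""

    endpoint = ""  # subclasses set the path they query

    def __init__(self, url, **params):
        self.url = url.rstrip("/") + self.endpoint
        # Drop unset filters so they are never sent to the server.
        self._parameters = {k: v for k, v in params.items() if v is not None}

    def get_parameters(self):
        """Return a copy of the cleaned search parameters."""
        return dict(self._parameters)


class ItemSearch(BaseSearch):
    endpoint = "/search"


class CollectionSearch(BaseSearch):
    endpoint = "/collections"
```

With this shape, parameter handling is written once and both ItemSearch and CollectionSearch pick it up through inheritance.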

PR Checklist:

  • Code is formatted
  • Tests pass
  • Changes are added to the CHANGELOG

@hrodmn
Collaborator Author

hrodmn commented Sep 26, 2024

I added the Client.collection_search method, but now I wonder if it would make more sense to add the optional filter args to Client.collections instead, since that would follow the pattern from the STAC API a bit more closely. When the collection search extension is enabled, you perform a collection search by adding query parameters like bbox and q to GET requests on the /collections endpoint.
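
To make the request pattern concrete, here is a small sketch that builds such a GET URL with the standard library. The helper name collection_search_url and the example.com base URL are assumptions for illustration; only the bbox, datetime, and q parameter names come from the discussion above.

```python
from urllib.parse import urlencode


def collection_search_url(base_url, bbox=None, datetime=None, q=None):
    """Hypothetical helper: build a GET /collections URL with filters."""
    params = {}
    if bbox is not None:
        # bbox is sent as a comma-separated list of numbers.
        params["bbox"] = ",".join(str(v) for v in bbox)
    if datetime is not None:
        params["datetime"] = datetime
    if q is not None:
        params["q"] = q
    url = base_url.rstrip("/") + "/collections"
    # safe="," keeps the bbox commas readable instead of percent-encoding them.
    return url + "?" + urlencode(params, safe=",") if params else url
```

For example, collection_search_url("https://example.com/stac", q="sentinel") produces a URL ending in /collections?q=sentinel.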

@codecov-commenter

codecov-commenter commented Sep 26, 2024

Codecov Report

Attention: Patch coverage is 91.21951% with 18 lines in your changes missing coverage. Please review.

Project coverage is 93.68%. Comparing base (21435b0) to head (07e5187).
Report is 81 commits behind head on main.

Files with missing lines              Patch %   Lines
pystac_client/collection_search.py    91.60%    11 Missing ⚠️
pystac_client/item_search.py          77.77%    4 Missing ⚠️
pystac_client/cli.py                  86.95%    3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #735      +/-   ##
==========================================
+ Coverage   93.43%   93.68%   +0.25%     
==========================================
  Files          13       15       +2     
  Lines         990     1188     +198     
==========================================
+ Hits          925     1113     +188     
- Misses         65       75      +10     


@gadomski gadomski self-requested a review September 26, 2024 21:12
Member

@gadomski gadomski left a comment

Thanks for this! I did an initial pass and left a few comments. I'll do a more thoughtful review later.

@hrodmn hrodmn force-pushed the collection-search branch from fa25492 to fee6c97 Compare October 8, 2024 18:36
@hrodmn
Collaborator Author

hrodmn commented Oct 8, 2024

I think the last big thing to add here is a cli method for collection search. @gadomski what do you think about adding the search args to the collections method in the CLI? I could also add a collection-search method, but adding filter parameters to the existing collections method would fit naturally with the STAC API experience (e.g. /collections?q=sentinel).

@gadomski
Member

gadomski commented Oct 8, 2024

what do you think about adding the search args to the collections method in the CLI?

Yup, makes sense to me!

@hrodmn hrodmn marked this pull request as ready for review October 9, 2024 15:50
@gadomski gadomski self-requested a review October 9, 2024 23:19
Member

@gadomski gadomski left a comment

Taking a look at the CI errors, looks like you'll need to use pytest.warns to catch-and-assert the client-side filtering warnings.
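
For reference, a test that catches and asserts a warning with pytest.warns might look like the sketch below. It assumes pytest is installed; search_with_fallback, the UserWarning category, and the message text are hypothetical stand-ins, not the warning this PR actually emits.

```python
import warnings

import pytest


def search_with_fallback(server_supports_collection_search):
    """Hypothetical stand-in for code that falls back to client-side filtering."""
    if not server_supports_collection_search:
        warnings.warn("Falling back to client-side filtering", UserWarning)
    return []


def test_fallback_warns():
    # The block fails if no matching warning is raised, and the expected
    # warning is captured instead of leaking into the test run.
    with pytest.warns(UserWarning, match="client-side filtering"):
        search_with_fallback(server_supports_collection_search=False)
```

Capturing the warning this way matters when the test run is configured to treat warnings as errors, since an uncaught warning would then fail the test.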

@hrodmn
Collaborator Author

hrodmn commented Oct 10, 2024

Taking a look at the CI errors, looks like you'll need to use pytest.warns to catch-and-assert the client-side filtering warnings.

Argh, yeah. I need to start running scripts/test instead of pytest.

Thanks for the review, I'll get those changes in today!

@gadomski
Member

I need to start running scripts/test instead of pytest.

or develop an allergic reaction to all warnings, like I have (don't recommend, leads to lots of yak shaving) :-)

@hrodmn hrodmn force-pushed the collection-search branch from ee88cef to faa8d8c Compare October 10, 2024 11:19
@hrodmn hrodmn force-pushed the collection-search branch from 1457fbb to fce4bd0 Compare October 10, 2024 14:27
@hrodmn
Collaborator Author

hrodmn commented Oct 10, 2024

@gadomski thanks for your reviews, sorry for not catching those little CI issues and for half-accepting your suggestion on matched!

Member

@gadomski gadomski left a comment

LGTM! Only thing missing is a CHANGELOG entry. Thanks for the iterations @hrodmn!

@gadomski gadomski enabled auto-merge (squash) October 15, 2024 13:21
@gadomski gadomski merged commit 3fe2670 into stac-utils:main Oct 15, 2024
14 of 15 checks passed