Commit fda1f29

Yuri Zmytrakov authored and committed
Merge branch 'main' into CAT-1382
2 parents 2a8dd32 + 0988448 commit fda1f29

15 files changed, +550 -82 lines changed

.pre-commit-config.yaml

Lines changed: 2 additions & 1 deletion
@@ -31,7 +31,8 @@ repos:
         ]
         additional_dependencies: [
           "types-attrs",
-          "types-requests"
+          "types-requests",
+          "types-redis"
         ]
   - repo: https://github.com/PyCQA/pydocstyle
     rev: 6.1.1

CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -9,8 +9,13 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.

 ### Added

+- GET `/collections` collection search free text extension ex. `/collections?q=sentinel`. [#470](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/470)
 - Added `USE_DATETIME` environment variable to configure datetime search behavior in SFEOS. [#452](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/452)
 - GET `/collections` collection search sort extension ex. `/collections?sortby=+id`. [#456](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/456)
+- GET `/collections` collection search fields extension ex. `/collections?fields=id,title`. [#465](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/465)
+- Improved error messages for sorting on unsortable fields in collection search, including guidance on how to make fields sortable. [#465](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/465)
+- Added field alias for `temporal` to enable easier sorting by temporal extent, alongside `extent.temporal.interval`. [#465](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/465)
+- Added `ENABLE_COLLECTIONS_SEARCH` environment variable to make collection search extensions optional (defaults to enabled). [#465](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/465)

 ### Changed

Makefile

Lines changed: 4 additions & 4 deletions
@@ -63,22 +63,22 @@ docker-shell-os:

 .PHONY: test-elasticsearch
 test-elasticsearch:
-	-$(run_es) /bin/bash -c 'pip install redis==6.4.0 export && ./scripts/wait-for-it-es.sh elasticsearch:9200 && cd stac_fastapi/tests/ && pytest'
+	-$(run_es) /bin/bash -c 'export && ./scripts/wait-for-it-es.sh elasticsearch:9200 && cd stac_fastapi/tests/ && pytest'
 	docker compose down

 .PHONY: test-opensearch
 test-opensearch:
-	-$(run_os) /bin/bash -c 'pip install redis==6.4.0 export && ./scripts/wait-for-it-es.sh opensearch:9202 && cd stac_fastapi/tests/ && pytest'
+	-$(run_os) /bin/bash -c 'export && ./scripts/wait-for-it-es.sh opensearch:9202 && cd stac_fastapi/tests/ && pytest'
 	docker compose down

 .PHONY: test-datetime-filtering-es
 test-datetime-filtering-es:
-	-$(run_es) /bin/bash -c 'pip install redis==6.4.0 && export ENABLE_DATETIME_INDEX_FILTERING=true && ./scripts/wait-for-it-es.sh elasticsearch:9200 && cd stac_fastapi/tests/ && pytest -s --cov=stac_fastapi --cov-report=term-missing -m datetime_filtering'
+	-$(run_es) /bin/bash -c 'export ENABLE_DATETIME_INDEX_FILTERING=true && ./scripts/wait-for-it-es.sh elasticsearch:9200 && cd stac_fastapi/tests/ && pytest -s --cov=stac_fastapi --cov-report=term-missing -m datetime_filtering'
 	docker compose down

 .PHONY: test-datetime-filtering-os
 test-datetime-filtering-os:
-	-$(run_os) /bin/bash -c 'pip install redis==6.4.0 && export ENABLE_DATETIME_INDEX_FILTERING=true && ./scripts/wait-for-it-es.sh opensearch:9202 && cd stac_fastapi/tests/ && pytest -s --cov=stac_fastapi --cov-report=term-missing -m datetime_filtering'
+	-$(run_os) /bin/bash -c 'export ENABLE_DATETIME_INDEX_FILTERING=true && ./scripts/wait-for-it-es.sh opensearch:9202 && cd stac_fastapi/tests/ && pytest -s --cov=stac_fastapi --cov-report=term-missing -m datetime_filtering'
 	docker compose down

 .PHONY: test

README.md

Lines changed: 38 additions & 2 deletions
@@ -36,11 +36,10 @@ SFEOS (stac-fastapi-elasticsearch-opensearch) is a high-performance, scalable AP
 - **Scale to millions of geospatial assets** with fast search performance through optimized spatial indexing and query capabilities
 - **Support OGC-compliant filtering** including spatial operations (intersects, contains, etc.) and temporal queries
 - **Perform geospatial aggregations** to analyze data distribution across space and time
+- **Enhanced collection search capabilities** with support for sorting and field selection

 This implementation builds on the STAC-FastAPI framework, providing a production-ready solution specifically optimized for Elasticsearch and OpenSearch databases. It's ideal for organizations managing large geospatial data catalogs who need efficient discovery and access capabilities through standardized APIs.

-
-
 ## Common Deployment Patterns

 stac-fastapi-elasticsearch-opensearch can be deployed in several ways depending on your needs:
@@ -72,6 +71,7 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
 - [Common Deployment Patterns](#common-deployment-patterns)
 - [Technologies](#technologies)
 - [Table of Contents](#table-of-contents)
+- [Collection Search Extensions](#collection-search-extensions)
 - [Documentation \& Resources](#documentation--resources)
 - [Package Structure](#package-structure)
 - [Examples](#examples)
@@ -113,6 +113,37 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
 - [Gitter Chat](https://app.gitter.im/#/room/#stac-fastapi-elasticsearch_community:gitter.im) - For real-time discussions
 - [GitHub Discussions](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/discussions) - For longer-form questions and answers

+## Collection Search Extensions
+
+SFEOS implements extended capabilities for the `/collections` endpoint, allowing for more powerful collection discovery:
+
+- **Sorting**: Sort collections by sortable fields using the `sortby` parameter
+  - Example: `/collections?sortby=+id` (ascending sort by ID)
+  - Example: `/collections?sortby=-id` (descending sort by ID)
+  - Example: `/collections?sortby=-temporal` (descending sort by temporal extent)
+
+- **Field Selection**: Request only specific fields to be returned using the `fields` parameter
+  - Example: `/collections?fields=id,title,description`
+  - This helps reduce payload size when only certain fields are needed
+
+- **Free Text Search**: Search across collection text fields using the `q` parameter
+  - Example: `/collections?q=landsat`
+  - Searches across multiple text fields including title, description, and keywords
+  - Supports partial word matching and relevance-based sorting
+
+These extensions make it easier to build user interfaces that display and navigate through collections efficiently.
+
+> **Configuration**: Collection search extensions can be disabled by setting the `ENABLE_COLLECTIONS_SEARCH` environment variable to `false`. By default, these extensions are enabled.
+
+> **Note**: Sorting is only available on fields that are indexed for sorting in Elasticsearch/OpenSearch. With the default mappings, you can sort on:
+> - `id` (keyword field)
+> - `extent.temporal.interval` (date field)
+> - `temporal` (alias to extent.temporal.interval)
+>
+> Text fields like `title` and `description` are not sortable by default as they use text analysis for better search capabilities. Attempting to sort on these fields will result in a user-friendly error message explaining which fields are sortable and how to make additional fields sortable by updating the mappings.
+>
+> **Important**: Adding keyword fields to make text fields sortable can significantly increase the index size, especially for large text fields. Consider the storage implications when deciding which fields to make sortable.
+
 ## Package Structure

 This project is organized into several packages, each with a specific purpose:
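
The three query parameters documented in the README hunk above (`sortby`, `fields`, `q`) can be combined in one request. The following is a minimal sketch of building such URLs from a client; `build_collections_url` and the `localhost` base URL are hypothetical and not part of SFEOS. One detail worth showing: a leading `+` in `sortby` should be percent-encoded, because a raw `+` in a query string is decoded as a space.

```python
from typing import Dict, List, Optional
from urllib.parse import urlencode


def build_collections_url(
    base_url: str,
    sortby: Optional[str] = None,
    fields: Optional[List[str]] = None,
    q: Optional[str] = None,
) -> str:
    """Build a GET /collections URL (hypothetical helper, for illustration only)."""
    params: Dict[str, str] = {}
    if sortby:
        params["sortby"] = sortby
    if fields:
        # The README examples pass fields as a comma-separated list.
        params["fields"] = ",".join(fields)
    if q:
        params["q"] = q

    url = f"{base_url.rstrip('/')}/collections"
    if not params:
        return url
    # safe="," keeps commas readable; "+" is still encoded as %2B.
    return f"{url}?{urlencode(params, safe=',')}"


if __name__ == "__main__":
    print(build_collections_url("http://localhost:8080", sortby="+id"))
    print(
        build_collections_url(
            "http://localhost:8080",
            sortby="-temporal",
            fields=["id", "title"],
            q="landsat",
        )
    )
```

Encoding `+` as `%2B` means the server receives `+id` rather than ` id`, regardless of how it treats a leading space.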
@@ -243,6 +274,7 @@ You can customize additional settings in your `.env` file:
 | `ENABLE_DIRECT_RESPONSE` | Enable direct response for maximum performance (disables all FastAPI dependencies, including authentication, custom status codes, and validation) | `false` | Optional |
 | `RAISE_ON_BULK_ERROR` | Controls whether bulk insert operations raise exceptions on errors. If set to `true`, the operation will stop and raise an exception when an error occurs. If set to `false`, errors will be logged, and the operation will continue. **Note:** STAC Item and ItemCollection validation errors will always raise, regardless of this flag. | `false` | Optional |
 | `DATABASE_REFRESH` | Controls whether database operations refresh the index immediately after changes. If set to `true`, changes will be immediately searchable. If set to `false`, changes may not be immediately visible but can improve performance for bulk operations. If set to `wait_for`, changes will wait for the next refresh cycle to become visible. | `false` | Optional |
+| `ENABLE_COLLECTIONS_SEARCH` | Enable collection search extensions (sort, fields). | `true` | Optional |
 | `ENABLE_TRANSACTIONS_EXTENSIONS` | Enables or disables the Transactions and Bulk Transactions API extensions. If set to `false`, the POST `/collections` route and related transaction endpoints (including bulk transaction operations) will be unavailable in the API. This is useful for deployments where mutating the catalog via the API should be prevented. | `true` | Optional |
 | `STAC_ITEM_LIMIT` | Sets the environment variable for result limiting to SFEOS for the number of returned items and STAC collections. | `10` | Optional |
 | `STAC_INDEX_ASSETS` | Controls if Assets are indexed when added to Elasticsearch/Opensearch. This allows asset fields to be included in search queries. | `false` | Optional |
@@ -389,6 +421,10 @@ The system uses a precise naming convention:
 - **Root Path Configuration**: The application root path is the base URL by default.
   - For AWS Lambda with Gateway API: Set `STAC_FASTAPI_ROOT_PATH` to match the Gateway API stage name (e.g., `/v1`)

+- **Feature Configuration**: Control which features are enabled:
+  - `ENABLE_COLLECTIONS_SEARCH`: Set to `true` (default) to enable collection search extensions (sort, fields). Set to `false` to disable.
+  - `ENABLE_TRANSACTIONS_EXTENSIONS`: Set to `true` (default) to enable transaction extensions. Set to `false` to disable.
+

 ## Collection Pagination

docker-compose.redis.yml

Lines changed: 0 additions & 27 deletions
This file was deleted.

dockerfiles/Dockerfile.dev.es

Lines changed: 1 addition & 0 deletions
@@ -18,3 +18,4 @@ COPY . /app
 RUN pip install --no-cache-dir -e ./stac_fastapi/core
 RUN pip install --no-cache-dir -e ./stac_fastapi/sfeos_helpers
 RUN pip install --no-cache-dir -e ./stac_fastapi/elasticsearch[dev,server]
+RUN pip install --no-cache-dir redis types-redis

mypy.ini

Lines changed: 0 additions & 3 deletions
This file was deleted.

stac_fastapi/core/stac_fastapi/core/core.py

Lines changed: 33 additions & 3 deletions
@@ -230,11 +230,18 @@ async def landing_page(self, **kwargs) -> stac_types.LandingPage:
         return landing_page

     async def all_collections(
-        self, sortby: Optional[str] = None, **kwargs
+        self,
+        fields: Optional[List[str]] = None,
+        sortby: Optional[str] = None,
+        q: Optional[Union[str, List[str]]] = None,
+        **kwargs,
     ) -> stac_types.Collections:
         """Read all collections from the database.

         Args:
+            fields (Optional[List[str]]): Fields to include or exclude from the results.
+            sortby (Optional[str]): Sorting options for the results.
+            q (Optional[List[str]]): Free text search terms.
             **kwargs: Keyword arguments from the request.

         Returns:
@@ -245,6 +252,15 @@ async def all_collections(
         limit = int(request.query_params.get("limit", os.getenv("STAC_ITEM_LIMIT", 10)))
         token = request.query_params.get("token")

+        # Process fields parameter for filtering collection properties
+        includes, excludes = set(), set()
+        if fields and self.extension_is_enabled("FieldsExtension"):
+            for field in fields:
+                if field[0] == "-":
+                    excludes.add(field[1:])
+                else:
+                    includes.add(field[1:] if field[0] in "+ " else field)
+
         sort = None
         if sortby:
             parsed_sort = []
@@ -267,10 +283,24 @@ async def all_collections(
         except Exception:
             redis = None

+        # Convert q to a list if it's a string
+        q_list = None
+        if q is not None:
+            q_list = [q] if isinstance(q, str) else q
+
         collections, next_token = await self.database.get_all_collections(
-            token=token, limit=limit, request=request, sort=sort
+            token=token, limit=limit, request=request, sort=sort, q=q_list
         )

+        # Apply field filtering if fields parameter was provided
+        if fields and self.extension_is_enabled("FieldsExtension"):
+            filtered_collections = [
+                filter_fields(collection, includes, excludes)
+                for collection in collections
+            ]
+        else:
+            filtered_collections = collections
+
         links = [
             {"rel": Relations.root.value, "type": MimeTypes.json, "href": base_url},
             {"rel": Relations.parent.value, "type": MimeTypes.json, "href": base_url},
@@ -301,7 +331,7 @@ async def all_collections(
             next_link = PagingLinks(next=next_token, request=request).link_next()
             links.append(next_link)

-        return stac_types.Collections(collections=collections, links=links)
+        return stac_types.Collections(collections=filtered_collections, links=links)

     async def get_collection(
         self, collection_id: str, **kwargs
stac_fastapi/elasticsearch/stac_fastapi/elasticsearch/app.py

Lines changed: 26 additions & 21 deletions
@@ -45,8 +45,7 @@
 )
 from stac_fastapi.extensions.core.fields import FieldsConformanceClasses
 from stac_fastapi.extensions.core.filter import FilterConformanceClasses
-
-# from stac_fastapi.extensions.core.free_text import FreeTextConformanceClasses
+from stac_fastapi.extensions.core.free_text import FreeTextConformanceClasses
 from stac_fastapi.extensions.core.query import QueryConformanceClasses
 from stac_fastapi.extensions.core.sort import SortConformanceClasses
 from stac_fastapi.extensions.third_party import BulkTransactionExtension
@@ -57,7 +56,9 @@
 logger = logging.getLogger(__name__)

 TRANSACTIONS_EXTENSIONS = get_bool_env("ENABLE_TRANSACTIONS_EXTENSIONS", default=True)
+ENABLE_COLLECTIONS_SEARCH = get_bool_env("ENABLE_COLLECTIONS_SEARCH", default=True)
 logger.info("TRANSACTIONS_EXTENSIONS is set to %s", TRANSACTIONS_EXTENSIONS)
+logger.info("ENABLE_COLLECTIONS_SEARCH is set to %s", ENABLE_COLLECTIONS_SEARCH)

 settings = ElasticsearchSettings()
 session = Session.create_from_settings(settings)
@@ -115,25 +116,26 @@

 extensions = [aggregation_extension] + search_extensions

-# Create collection search extensions
-# Only sort extension is enabled for now
-collection_search_extensions = [
-    # QueryExtension(conformance_classes=[QueryConformanceClasses.COLLECTIONS]),
-    SortExtension(conformance_classes=[SortConformanceClasses.COLLECTIONS]),
-    # FieldsExtension(conformance_classes=[FieldsConformanceClasses.COLLECTIONS]),
-    # CollectionSearchFilterExtension(
-    #     conformance_classes=[FilterConformanceClasses.COLLECTIONS]
-    # ),
-    # FreeTextExtension(conformance_classes=[FreeTextConformanceClasses.COLLECTIONS]),
-]
-
-# Initialize collection search with its extensions
-collection_search_ext = CollectionSearchExtension.from_extensions(
-    collection_search_extensions
-)
-collections_get_request_model = collection_search_ext.GET
+# Create collection search extensions if enabled
+if ENABLE_COLLECTIONS_SEARCH:
+    # Create collection search extensions
+    collection_search_extensions = [
+        # QueryExtension(conformance_classes=[QueryConformanceClasses.COLLECTIONS]),
+        SortExtension(conformance_classes=[SortConformanceClasses.COLLECTIONS]),
+        FieldsExtension(conformance_classes=[FieldsConformanceClasses.COLLECTIONS]),
+        # CollectionSearchFilterExtension(
+        #     conformance_classes=[FilterConformanceClasses.COLLECTIONS]
+        # ),
+        FreeTextExtension(conformance_classes=[FreeTextConformanceClasses.COLLECTIONS]),
+    ]
+
+    # Initialize collection search with its extensions
+    collection_search_ext = CollectionSearchExtension.from_extensions(
+        collection_search_extensions
+    )
+    collections_get_request_model = collection_search_ext.GET

-extensions.append(collection_search_ext)
+    extensions.append(collection_search_ext)

 database_logic.extensions = [type(ext).__name__ for ext in extensions]

@@ -170,10 +172,13 @@
     "search_get_request_model": create_get_request_model(search_extensions),
     "search_post_request_model": post_request_model,
     "items_get_request_model": items_get_request_model,
-    "collections_get_request_model": collections_get_request_model,
     "route_dependencies": get_route_dependencies(),
 }

+# Add collections_get_request_model if collection search is enabled
+if ENABLE_COLLECTIONS_SEARCH:
+    app_config["collections_get_request_model"] = collections_get_request_model
+
 api = StacApi(**app_config)

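
The feature-flag pattern in `app.py` (read a boolean environment variable, then add the optional config key only when it is set) can be sketched on its own. In this sketch `read_bool_env` is a hypothetical stand-in for `get_bool_env`, whose implementation is not part of this commit; the set of accepted truthy spellings is an assumption. `build_app_config` and the `"title"` key are likewise illustrative only.

```python
import os
from typing import Any, Dict


def read_bool_env(name: str, default: bool = False) -> bool:
    """Hypothetical stand-in for get_bool_env; accepted spellings are assumed."""
    raw = os.getenv(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"true", "1", "yes", "y"}


def build_app_config(collections_model: Any) -> Dict[str, Any]:
    """Add the collections request model only when the flag is enabled."""
    config: Dict[str, Any] = {"title": "demo"}  # illustrative placeholder key
    if read_bool_env("ENABLE_COLLECTIONS_SEARCH", default=True):
        config["collections_get_request_model"] = collections_model
    return config


if __name__ == "__main__":
    os.environ["ENABLE_COLLECTIONS_SEARCH"] = "false"
    print(build_app_config(collections_model=dict))
```

Omitting the key entirely, rather than passing `None`, leaves the downstream constructor free to apply its own default when the feature is off.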
