Commit e15f45b

Add field exclusion from queryables endpoint. (stac-utils#489)
**Description:** Added the `EXCLUDED_FROM_QUERYABLES` environment variable to allow excluding specific fields from the queryables endpoint and from filtering, even if those fields are indexed. It supports a comma-separated list of fully qualified field names (e.g., `properties.auth:schemes,properties.storage:schemes`).

**PR Checklist:**

- [x] Code is formatted and linted (run `pre-commit run --all-files`)
- [x] Tests pass (run `make test`)
- [x] Documentation has been updated to reflect changes, if applicable
- [x] Changes are added to the changelog

---------

Co-authored-by: Jonathan Healy <[email protected]>
1 parent dc20246 commit e15f45b

5 files changed: +131 -18 lines changed
CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -9,6 +9,10 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.

 ### Added

+- CloudFerro logo to sponsors and supporters list [#485](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/485)
+- Latest news section to README [#485](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/485)
+- Environment variable `EXCLUDED_FROM_QUERYABLES` to exclude specific fields from queryables endpoint and filtering. Supports comma-separated list of fully qualified field names (e.g., `properties.auth:schemes,properties.storage:schemes`) [#489](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/489)
+
 ### Changed

 ### Fixed

README.md

Lines changed: 26 additions & 0 deletions
@@ -102,6 +102,7 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
 - [Using Pre-built Docker Images](#using-pre-built-docker-images)
 - [Using Docker Compose](#using-docker-compose)
 - [Configuration Reference](#configuration-reference)
+- [Excluding Fields from Queryables](#excluding-fields-from-queryables)
 - [Datetime-Based Index Management](#datetime-based-index-management)
 - [Overview](#overview)
 - [When to Use](#when-to-use)
@@ -337,10 +338,35 @@ You can customize additional settings in your `.env` file:
 | `STAC_DEFAULT_ITEM_LIMIT` | Configures the default number of STAC items returned when no limit parameter is specified in the request. | `10` | Optional |
 | `STAC_INDEX_ASSETS` | Controls if Assets are indexed when added to Elasticsearch/Opensearch. This allows asset fields to be included in search queries. | `false` | Optional |
 | `USE_DATETIME` | Configures the datetime search behavior in SFEOS. When enabled, searches both datetime field and falls back to start_datetime/end_datetime range for items with null datetime. When disabled, searches only by start_datetime/end_datetime range. | `true` | Optional |
+| `EXCLUDED_FROM_QUERYABLES` | Comma-separated list of fully qualified field names to exclude from the queryables endpoint and filtering. Use full paths like `properties.auth:schemes,properties.storage:schemes`. Excluded fields and their nested children will not be exposed in queryables. | None | Optional |

 > [!NOTE]
 > The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, `ES_VERIFY_CERTS` and `ES_TIMEOUT` apply to both Elasticsearch and OpenSearch backends, so there is no need to rename the key names to `OS_` even if you're using OpenSearch.

+## Excluding Fields from Queryables
+
+You can exclude specific fields from being exposed in the queryables endpoint and from filtering by setting the `EXCLUDED_FROM_QUERYABLES` environment variable. This is useful for hiding sensitive or internal fields that should not be queryable by API users.
+
+**Environment Variable:**
+
+```bash
+EXCLUDED_FROM_QUERYABLES="properties.auth:schemes,properties.storage:schemes,properties.internal:metadata"
+```
+
+**Format:**
+
+- Comma-separated list of fully qualified field names
+- Use the full path including the `properties.` prefix for item properties
+- Example field names:
+- `properties.auth:schemes`
+- `properties.storage:schemes`
+
+**Behavior:**
+
+- Excluded fields will not appear in the queryables response
+- Excluded fields and their nested children will be skipped during field traversal
+- Both the field itself and any nested properties will be excluded
+
 ## Datetime-Based Index Management

 ### Overview
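
The format rules in the new README section can be illustrated with a minimal sketch. It assumes the variable is split on commas and each entry is trimmed, as the helper added to `client.py` in this commit does; `read_excluded_fields` is a hypothetical stand-in name, not part of the project. It also shows why the `properties.` prefix matters: entries are compared as exact fully qualified names.

```python
import os


def read_excluded_fields(env_var: str = "EXCLUDED_FROM_QUERYABLES") -> set[str]:
    """Parse a comma-separated environment variable into a set of field names."""
    raw = os.getenv(env_var, "")
    # Trim whitespace around each entry and drop empty entries (e.g. from ",,").
    return {name.strip() for name in raw.split(",") if name.strip()}


# Illustrative value with stray spaces and a trailing empty entry.
os.environ["EXCLUDED_FROM_QUERYABLES"] = (
    " properties.auth:schemes, properties.storage:schemes ,,"
)
excluded = read_excluded_fields()

# Membership is an exact string match on the fully qualified name.
prefixed_matches = "properties.auth:schemes" in excluded
unprefixed_matches = "auth:schemes" in excluded
print(sorted(excluded), prefixed_matches, unprefixed_matches)
```

In this sketch an entry written without the `properties.` prefix does not match an item property, which is consistent with the README's instruction to use full paths.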

stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/filter/README.md

Lines changed: 5 additions & 2 deletions
@@ -9,10 +9,12 @@ between the two implementations.
 The filter package is organized into three main modules:

 - **cql2.py**: Contains functions for converting CQL2 patterns to Elasticsearch/OpenSearch compatible formats
+
 - [cql2_like_to_es](cci:1://file:///home/computer/Code/stac-fastapi-elasticsearch-opensearch/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/filter.py:59:0-75:5): Converts CQL2 "LIKE" characters to Elasticsearch "wildcard" characters
-- [_replace_like_patterns](cci:1://file:///home/computer/Code/stac-fastapi-elasticsearch-opensearch/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/filter.py:51:0-56:71): Helper function for pattern replacement
+- [\_replace_like_patterns](cci:1://file:///home/computer/Code/stac-fastapi-elasticsearch-opensearch/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/filter.py:51:0-56:71): Helper function for pattern replacement

 - **transform.py**: Contains functions for transforming CQL2 queries to Elasticsearch/OpenSearch query DSL
+
 - [to_es_field](cci:1://file:///home/computer/Code/stac-fastapi-elasticsearch-opensearch/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/filter.py:83:0-93:47): Maps field names using queryables mapping
 - [to_es](cci:1://file:///home/computer/Code/stac-fastapi-elasticsearch-opensearch/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/filter.py:96:0-201:13): Transforms CQL2 query structures to Elasticsearch/OpenSearch query DSL

@@ -24,4 +26,5 @@ The filter package is organized into three main modules:
 Import the necessary components from the filter package:

 ```python
-from stac_fastapi.sfeos_helpers.filter import cql2_like_to_es, to_es, EsAsyncBaseFiltersClient
+from stac_fastapi.sfeos_helpers.filter import cql2_like_to_es, to_es, EsAsyncBaseFiltersClient
+```

stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/filter/client.py

Lines changed: 46 additions & 16 deletions
@@ -1,7 +1,8 @@
 """Filter client implementation for Elasticsearch/OpenSearch."""

+import os
 from collections import deque
-from typing import Any, Dict, Optional, Tuple
+from typing import Any, Optional

 import attr
 from fastapi import Request
@@ -18,9 +19,29 @@ class EsAsyncBaseFiltersClient(AsyncBaseFiltersClient):

     database: BaseDatabaseLogic = attr.ib()

+    @staticmethod
+    def _get_excluded_from_queryables() -> set[str]:
+        """Get fields to exclude from queryables endpoint and filtering.
+
+        Reads from EXCLUDED_FROM_QUERYABLES environment variable.
+        Supports comma-separated list of field names.
+
+        Example:
+            EXCLUDED_FROM_QUERYABLES="auth:schemes,storage:schemes"
+
+        Returns:
+            Set[str]: Set of field names to exclude from queryables
+        """
+        excluded = os.getenv("EXCLUDED_FROM_QUERYABLES", "")
+        if not excluded:
+            return set()
+        return {field.strip() for field in excluded.split(",") if field.strip()}
+
     async def get_queryables(
-        self, collection_id: Optional[str] = None, **kwargs
-    ) -> Dict[str, Any]:
+        self,
+        collection_id: Optional[str] = None,  # noqa: UP045
+        **kwargs: Any,
+    ) -> dict[str, Any]:
         """Get the queryables available for the given collection_id.

         If collection_id is None, returns the intersection of all
@@ -38,21 +59,23 @@ async def get_queryables(
         Returns:
             Dict[str, Any]: A dictionary containing the queryables for the given collection.
         """
-        request: Optional[Request] = kwargs.get("request")
-        url_str: str = str(request.url) if request else ""
-        queryables: Dict[str, Any] = {
+        request: Optional[Request] = kwargs.get("request")  # noqa: UP045
+        url_str = str(request.url) if request else ""
+
+        queryables: dict[str, Any] = {
             "$schema": "https://json-schema.org/draft-07/schema",
-            "$id": f"{url_str}",
+            "$id": url_str,
             "type": "object",
             "title": "Queryables for STAC API",
             "description": "Queryable names for the STAC API Item Search filter.",
             "properties": DEFAULT_QUERYABLES,
             "additionalProperties": True,
         }
+
         if not collection_id:
             return queryables

-        properties: Dict[str, Any] = queryables["properties"].copy()
+        properties = queryables["properties"].copy()
         queryables.update(
             {
                 "properties": properties,
@@ -62,8 +85,9 @@ async def get_queryables(

         mapping_data = await self.database.get_items_mapping(collection_id)
         mapping_properties = next(iter(mapping_data.values()))["mappings"]["properties"]
-        stack: deque[Tuple[str, Dict[str, Any]]] = deque(mapping_properties.items())
-        enum_fields: Dict[str, Dict[str, Any]] = {}
+        stack: deque[tuple[str, dict[str, Any]]] = deque(mapping_properties.items())
+        enum_fields: dict[str, dict[str, Any]] = {}
+        excluded_fields = self._get_excluded_from_queryables()

         while stack:
             field_fqn, field_def = stack.popleft()
@@ -75,11 +99,16 @@ async def get_queryables(
                     (f"{field_fqn}.{k}", v)
                     for k, v in field_properties.items()
                     if v.get("enabled", True)
+                    and f"{field_fqn}.{k}" not in excluded_fields
                 )

             # Skip non-indexed or disabled fields
             field_type = field_def.get("type")
-            if not field_type or not field_def.get("enabled", True):
+            if (
+                not field_type
+                or not field_def.get("enabled", True)
+                or field_fqn in excluded_fields
+            ):
                 continue

             # Fields in Item Properties should be exposed with their un-prefixed names,
@@ -88,7 +117,7 @@ async def get_queryables(
             field_name = field_fqn.removeprefix("properties.")

             # Generate field properties
-            field_result = ALL_QUERYABLES.get(field_name, {})
+            field_result = ALL_QUERYABLES.get(field_name, {}).copy()
             properties[field_name] = field_result

             field_name_human = field_name.replace("_", " ").title()
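
The `.copy()` added in this hunk appears to guard against mutating a shared lookup table when per-collection data such as `enum` values is attached to `field_result` later in the method. The sketch below illustrates the general pitfall. The `ALL_QUERYABLES` contents and both helper functions are hypothetical stand-ins, not the project's definitions.

```python
# Hypothetical stand-in for a module-level table of queryable definitions.
ALL_QUERYABLES = {"platform": {"type": "string", "title": "Platform"}}


def build_without_copy(field_name: str, values: list[str]) -> dict:
    # dict.get returns the very same dict object stored in the table.
    field_result = ALL_QUERYABLES.get(field_name, {})
    field_result["enum"] = values  # mutates the shared table entry
    return field_result


def build_with_copy(field_name: str, values: list[str]) -> dict:
    # A shallow copy gives each call its own top-level dict.
    field_result = ALL_QUERYABLES.get(field_name, {}).copy()
    field_result["enum"] = values  # only the copy changes
    return field_result


copied = build_with_copy("platform", ["landsat-8"])
shared_untouched = "enum" not in ALL_QUERYABLES["platform"]

build_without_copy("platform", ["sentinel-2"])
shared_polluted = ALL_QUERYABLES["platform"].get("enum") == ["sentinel-2"]
print(shared_untouched, shared_polluted)
```

Note that `dict.copy()` is shallow, so it isolates only top-level keys. Nested values would still be shared between the copy and the table.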
@@ -104,9 +133,10 @@ async def get_queryables(
                 enum_fields[field_fqn] = field_result

         if enum_fields:
-            for field_fqn, unique_values in (
-                await self.database.get_items_unique_values(collection_id, enum_fields)
-            ).items():
-                enum_fields[field_fqn]["enum"] = unique_values
+            unique_values = await self.database.get_items_unique_values(
+                collection_id, enum_fields
+            )
+            for field_fqn, values in unique_values.items():
+                enum_fields[field_fqn]["enum"] = values

         return queryables
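
The traversal changed above can be approximated outside the client as a minimal, self-contained sketch. It assumes, from the order of the hunks, that nested fields are queued before the skip check runs. `collect_queryable_names` and the sample `mapping` are hypothetical and stand in for the database mapping returned by `get_items_mapping`.

```python
from collections import deque
from typing import Any


def collect_queryable_names(
    mapping_properties: dict[str, dict[str, Any]],
    excluded_fields: set[str],
) -> list[str]:
    """Breadth-first walk over an index mapping, skipping excluded fields."""
    stack: deque[tuple[str, dict[str, Any]]] = deque(mapping_properties.items())
    names: list[str] = []

    while stack:
        field_fqn, field_def = stack.popleft()

        # Queue nested fields, but never queue an excluded child. Its own
        # children then become unreachable and are dropped as well.
        nested = field_def.get("properties")
        if nested:
            stack.extend(
                (f"{field_fqn}.{k}", v)
                for k, v in nested.items()
                if v.get("enabled", True)
                and f"{field_fqn}.{k}" not in excluded_fields
            )

        # Skip containers without a type, disabled fields, and excluded fields.
        if (
            not field_def.get("type")
            or not field_def.get("enabled", True)
            or field_fqn in excluded_fields
        ):
            continue

        names.append(field_fqn.removeprefix("properties."))

    return names


# Hypothetical mapping shaped like an Elasticsearch/OpenSearch "properties" block.
mapping = {
    "id": {"type": "keyword"},
    "properties": {
        "properties": {
            "platform": {"type": "keyword"},
            "gsd": {"type": "float"},
            "auth:schemes": {"properties": {"token": {"type": "keyword"}}},
        }
    },
}

result = collect_queryable_names(
    mapping, {"properties.platform", "properties.auth:schemes"}
)
print(result)
```

In this sketch, entries seeded directly from the top level of the mapping are not filtered before their children are queued, so the pruning of nested children applies to fields reached through a parent. Whether the real client behaves the same way depends on context lines not shown in the diff.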

stac_fastapi/tests/extensions/test_filter.py

Lines changed: 50 additions & 0 deletions
@@ -674,3 +674,53 @@ async def test_queryables_enum_platform(
     # Clean up
     r = await app_client.delete(f"/collections/{collection_id}")
     r.raise_for_status()
+
+
+@pytest.mark.asyncio
+async def test_queryables_excluded_fields(
+    app_client: AsyncClient,
+    load_test_data: Callable[[str], Dict],
+    monkeypatch: pytest.MonkeyPatch,
+):
+    """Test that fields can be excluded from queryables using EXCLUDED_FROM_QUERYABLES."""
+    # Arrange
+    monkeypatch.setenv("DATABASE_REFRESH", "true")
+    monkeypatch.setenv(
+        "EXCLUDED_FROM_QUERYABLES", "properties.platform,properties.instrument"
+    )
+
+    # Create collection
+    collection_data = load_test_data("test_collection.json")
+    collection_id = collection_data["id"] = f"exclude-test-collection-{uuid.uuid4()}"
+    r = await app_client.post("/collections", json=collection_data)
+    r.raise_for_status()
+
+    # Create an item
+    item_data = load_test_data("test_item.json")
+    item_data["id"] = "exclude-test-item"
+    item_data["collection"] = collection_id
+    item_data["properties"]["platform"] = "landsat-8"
+    item_data["properties"]["instrument"] = "OLI_TIRS"
+    r = await app_client.post(f"/collections/{collection_id}/items", json=item_data)
+    r.raise_for_status()
+
+    # Act
+    queryables = (
+        (await app_client.get(f"/collections/{collection_id}/queryables"))
+        .raise_for_status()
+        .json()
+    )
+
+    # Assert
+    # Excluded fields should NOT be in queryables
+    properties = queryables["properties"]
+    assert "platform" not in properties
+    assert "instrument" not in properties
+
+    # Other fields should still be present
+    assert "datetime" in properties
+    assert "gsd" in properties
+
+    # Clean up
+    r = await app_client.delete(f"/collections/{collection_id}")
+    r.raise_for_status()
