Merged
Changes from 2 commits
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -11,6 +11,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.

- CloudFerro logo to sponsors and supporters list [#485](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/485)
- Latest news section to README [#485](https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch/pull/485)
- Environment variable `EXCLUDED_FROM_QUERYABLES` to exclude specific fields from queryables endpoint and filtering. Supports comma-separated list of fully qualified field names (e.g., `properties.auth:schemes,properties.storage:schemes`)

Review comment (Collaborator): Could you add the link with the PR number here?

Review comment (Collaborator): I fixed it when I resolved the conflict in the changelog.


### Changed

1 change: 1 addition & 0 deletions README.md
@@ -313,6 +313,7 @@ You can customize additional settings in your `.env` file:
| `STAC_INDEX_ASSETS` | Controls if Assets are indexed when added to Elasticsearch/Opensearch. This allows asset fields to be included in search queries. | `false` | Optional |
| `ENV_MAX_LIMIT` | Configures the environment variable in SFEOS to override the default `MAX_LIMIT`, which controls the limit parameter for returned items and STAC collections. | `10,000` | Optional |
| `USE_DATETIME` | Configures the datetime search behavior in SFEOS. When enabled, searches both datetime field and falls back to start_datetime/end_datetime range for items with null datetime. When disabled, searches only by start_datetime/end_datetime range. | `true` | Optional |
| `EXCLUDED_FROM_QUERYABLES` | Comma-separated list of fully qualified field names to exclude from the queryables endpoint and filtering. Use full paths like `properties.auth:schemes,properties.storage:schemes`. Excluded fields and their nested children will not be exposed in queryables. | None | Optional |

> [!NOTE]
> The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, `ES_VERIFY_CERTS` and `ES_TIMEOUT` apply to both Elasticsearch and OpenSearch backends, so there is no need to rename the key names to `OS_` even if you're using OpenSearch.
@@ -9,19 +9,48 @@ between the two implementations.
The filter package is organized into three main modules:

- **cql2.py**: Contains functions for converting CQL2 patterns to Elasticsearch/OpenSearch compatible formats

- [cql2_like_to_es](cci:1://file:///home/computer/Code/stac-fastapi-elasticsearch-opensearch/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/filter.py:59:0-75:5): Converts CQL2 "LIKE" characters to Elasticsearch "wildcard" characters
- [\_replace_like_patterns](cci:1://file:///home/computer/Code/stac-fastapi-elasticsearch-opensearch/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/filter.py:51:0-56:71): Helper function for pattern replacement

- **transform.py**: Contains functions for transforming CQL2 queries to Elasticsearch/OpenSearch query DSL

- [to_es_field](cci:1://file:///home/computer/Code/stac-fastapi-elasticsearch-opensearch/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/filter.py:83:0-93:47): Maps field names using queryables mapping
- [to_es](cci:1://file:///home/computer/Code/stac-fastapi-elasticsearch-opensearch/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/filter.py:96:0-201:13): Transforms CQL2 query structures to Elasticsearch/OpenSearch query DSL

- **client.py**: Contains the base filter client implementation
- [EsAsyncBaseFiltersClient](cci:2://file:///home/computer/Code/stac-fastapi-elasticsearch-opensearch/stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/filter.py:209:0-293:25): Base class for implementing the STAC filter extension

## Configuration

### Excluding Fields from Queryables

You can exclude specific fields from being exposed in the queryables endpoint and from filtering by setting the `EXCLUDED_FROM_QUERYABLES` environment variable. This is useful for hiding sensitive or internal fields that should not be queryable by API users.

**Environment Variable:**

```bash
EXCLUDED_FROM_QUERYABLES="properties.auth:schemes,properties.storage:schemes,properties.internal:metadata"
```

**Format:**

- Comma-separated list of fully qualified field names
- Use the full path including the `properties.` prefix for item properties
- Example field names:
  - `properties.auth:schemes`
  - `properties.storage:schemes`

**Behavior:**

- Excluded fields do not appear in the queryables response
- Excluded fields are skipped during field traversal, so their nested children are excluded as well
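
As a rough illustration of the format rules above, the value could be parsed along these lines. This is a minimal sketch, not necessarily the project's exact code: `parse_excluded_fields` is a hypothetical helper name used only here, and the environment variable is set inline purely for demonstration.

```python
import os


def parse_excluded_fields(raw: str) -> set[str]:
    """Split a comma-separated value into a set of trimmed, non-empty names.

    Hypothetical helper for illustration; whitespace around entries and
    empty entries (for example from a trailing comma) are ignored.
    """
    return {field.strip() for field in raw.split(",") if field.strip()}


# Set the variable here only so the example is self-contained.
os.environ["EXCLUDED_FROM_QUERYABLES"] = (
    "properties.auth:schemes, properties.storage:schemes,"
)

excluded = parse_excluded_fields(os.getenv("EXCLUDED_FROM_QUERYABLES", ""))
print(sorted(excluded))
```

An unset or empty variable yields an empty set, so no fields are excluded by default.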

Review comment (Collaborator): Can you move this section to the main README and add a link to the section in the Table of Contents?

## Usage

Import the necessary components from the filter package:

```python
from stac_fastapi.sfeos_helpers.filter import cql2_like_to_es, to_es, EsAsyncBaseFiltersClient
```
@@ -1,7 +1,8 @@
"""Filter client implementation for Elasticsearch/OpenSearch."""

import os
from collections import deque
from typing import Any, Dict, Optional, Tuple
from typing import Any, Optional

import attr
from fastapi import Request
@@ -18,9 +19,29 @@ class EsAsyncBaseFiltersClient(AsyncBaseFiltersClient):

database: BaseDatabaseLogic = attr.ib()

@staticmethod
def _get_excluded_from_queryables() -> set[str]:
"""Get fields to exclude from queryables endpoint and filtering.

Reads from EXCLUDED_FROM_QUERYABLES environment variable.
Supports comma-separated list of fully qualified field names.

Example:
EXCLUDED_FROM_QUERYABLES="properties.auth:schemes,properties.storage:schemes"

Returns:
set[str]: Set of field names to exclude from queryables
"""
excluded = os.getenv("EXCLUDED_FROM_QUERYABLES", "")
if not excluded:
return set()
return {field.strip() for field in excluded.split(",") if field.strip()}

async def get_queryables(
self, collection_id: Optional[str] = None, **kwargs
) -> Dict[str, Any]:
self,
collection_id: str | None = None,
**kwargs: Any,
) -> dict[str, Any]:
"""Get the queryables available for the given collection_id.

If collection_id is None, returns the intersection of all
@@ -38,21 +59,23 @@ async def get_queryables(
Returns:
Dict[str, Any]: A dictionary containing the queryables for the given collection.
"""
request: Optional[Request] = kwargs.get("request")
url_str: str = str(request.url) if request else ""
queryables: Dict[str, Any] = {
request: Optional[Request] = kwargs.get("request") # noqa: UP045
url_str = str(request.url) if request else ""

queryables: dict[str, Any] = {
"$schema": "https://json-schema.org/draft-07/schema",
"$id": f"{url_str}",
"$id": url_str,
"type": "object",
"title": "Queryables for STAC API",
"description": "Queryable names for the STAC API Item Search filter.",
"properties": DEFAULT_QUERYABLES,
"additionalProperties": True,
}

if not collection_id:
return queryables

properties: Dict[str, Any] = queryables["properties"].copy()
properties = queryables["properties"].copy()
queryables.update(
{
"properties": properties,
@@ -62,8 +85,9 @@ async def get_queryables(

mapping_data = await self.database.get_items_mapping(collection_id)
mapping_properties = next(iter(mapping_data.values()))["mappings"]["properties"]
stack: deque[Tuple[str, Dict[str, Any]]] = deque(mapping_properties.items())
enum_fields: Dict[str, Dict[str, Any]] = {}
stack: deque[tuple[str, dict[str, Any]]] = deque(mapping_properties.items())
enum_fields: dict[str, dict[str, Any]] = {}
excluded_fields = self._get_excluded_from_queryables()

while stack:
field_fqn, field_def = stack.popleft()
@@ -75,11 +99,16 @@ async def get_queryables(
(f"{field_fqn}.{k}", v)
for k, v in field_properties.items()
if v.get("enabled", True)
and f"{field_fqn}.{k}" not in excluded_fields
)

# Skip non-indexed or disabled fields
field_type = field_def.get("type")
if not field_type or not field_def.get("enabled", True):
if (
not field_type
or not field_def.get("enabled", True)
or field_fqn in excluded_fields
):
continue

# Fields in Item Properties should be exposed with their un-prefixed names,
@@ -88,7 +117,7 @@ async def get_queryables(
field_name = field_fqn.removeprefix("properties.")

# Generate field properties
field_result = ALL_QUERYABLES.get(field_name, {})
field_result = ALL_QUERYABLES.get(field_name, {}).copy()
properties[field_name] = field_result

field_name_human = field_name.replace("_", " ").title()
@@ -104,9 +133,10 @@ async def get_queryables(
enum_fields[field_fqn] = field_result

if enum_fields:
for field_fqn, unique_values in (
await self.database.get_items_unique_values(collection_id, enum_fields)
).items():
enum_fields[field_fqn]["enum"] = unique_values
unique_values = await self.database.get_items_unique_values(
collection_id, enum_fields
)
for field_fqn, values in unique_values.items():
enum_fields[field_fqn]["enum"] = values

return queryables
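
The traversal change in this diff can be sketched in isolation to show how an excluded field and its nested children drop out. This is a simplified illustration under stated assumptions, not the actual client code: `visible_fields` and the sample `mapping` are hypothetical, and the sketch only collects field names rather than building the queryables schema.

```python
from collections import deque
from typing import Any


def visible_fields(
    mapping_properties: dict[str, Any], excluded: set[str]
) -> list[str]:
    """Return fully qualified names of typed fields not hidden by `excluded`.

    Hypothetical helper mirroring the breadth-first walk in the diff.
    """
    stack: deque[tuple[str, dict[str, Any]]] = deque(mapping_properties.items())
    result: list[str] = []
    while stack:
        field_fqn, field_def = stack.popleft()
        # Queue nested children, skipping disabled or excluded ones.
        # An excluded child is never queued, so its subtree is never visited.
        for name, child in field_def.get("properties", {}).items():
            child_fqn = f"{field_fqn}.{name}"
            if child.get("enabled", True) and child_fqn not in excluded:
                stack.append((child_fqn, child))
        # Skip non-indexed, disabled, or excluded fields.
        if (
            not field_def.get("type")
            or not field_def.get("enabled", True)
            or field_fqn in excluded
        ):
            continue
        result.append(field_fqn)
    return result


# Made-up mapping for demonstration only.
mapping = {
    "id": {"type": "keyword"},
    "properties": {
        "properties": {
            "eo:cloud_cover": {"type": "float"},
            "auth:schemes": {"properties": {"token": {"type": "keyword"}}},
        }
    },
}

print(visible_fields(mapping, {"properties.auth:schemes"}))
print(visible_fields(mapping, set()))
```

With `properties.auth:schemes` excluded, its nested `token` field is never reached; with an empty exclusion set, the nested field is listed as `properties.auth:schemes.token`.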