SuperSTAC - Problem Statement #1

@jeafreezy

🚀 Overview

SuperSTAC is a Python package (and CLI) that transparently searches across multiple STAC catalogs to maximize imagery & metadata availability, offering users:

  • A unified interface for querying diverse STAC endpoints.
  • Automatic catalog selection & fallback logic.
  • Band combinations, resolution, and date‐specific filtering (all filtering supported by STAC).
  • Extensible registry of catalogs with metadata about strengths/use-cases.

🛠️ Problem Statement

  • Single‐catalog limitations: Depending on one STAC server (e.g., Sentinel via Element 84) can result in:

    • Data gaps: Desired date/location isn’t covered by Sentinel/Landsat.

    • Downtime: Server outages render retrieval impossible.

  • Fragmented discovery: Analysts manually hunt through dozens of public & commercial STAC catalogs to find optimal imagery.

Goals

  • Maximize data availability: Seamlessly retrieve imagery from the best‐available catalog(s) at a given time/location.

  • Intelligent catalog registry: Maintain a registry of known STAC endpoints, annotated with:

    • Supported sensors/platforms (e.g., PlanetScope, Maxar, Sentinel-2)
    • Typical resolutions & band sets
    • Geographic & temporal coverage hints
  • Flexible query interface: Allow users to specify:

    • Spatial footprint (GeoJSON, bounding box)
    • Date or date range
    • Desired spatial resolution
    • Band combinations or spectral indices (e.g., RGB, NDVI)
    • Maximum cloud cover, cost constraints, etc.
  • Resilient retrieval: Implement fallback logic so that if Catalog A fails or returns no results, the query automatically continues with Catalog B, C, …
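
One way the "band combinations or spectral indices" input could be handled is to expand named presets into concrete band ids before querying. The sketch below is illustrative only: `BAND_PRESETS`, `resolve_bands`, and `ndvi` are hypothetical names, and the band ids assume Sentinel-2 naming (B04 red, B03 green, B02 blue, B08 near-infrared). Other catalogs would need their own mapping.

```python
from typing import Dict, List

# Assumed Sentinel-2 band names; other catalogs would need their own mapping.
BAND_PRESETS: Dict[str, List[str]] = {
    "RGB": ["B04", "B03", "B02"],  # red, green, blue
    "NDVI": ["B08", "B04"],        # near-infrared, red
}


def resolve_bands(requested: List[str]) -> List[str]:
    """Expand preset names into band ids, keeping order and dropping repeats."""
    resolved: List[str] = []
    for name in requested:
        # Names that are not presets are treated as literal band ids.
        for band in BAND_PRESETS.get(name, [name]):
            if band not in resolved:
                resolved.append(band)
    return resolved


def ndvi(nir: float, red: float) -> float:
    """NDVI = (NIR - Red) / (NIR + Red); returns 0.0 when the sum is zero."""
    total = nir + red
    return (nir - red) / total if total else 0.0


print(resolve_bands(["RGB", "NDVI"]))  # ['B04', 'B03', 'B02', 'B08']
```

Resolving presets up front means the query engine only ever deals with plain band lists, which keeps per-catalog compatibility checks simple.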

🔍 Proposed Design

  1. Catalog Registry
  • YAML/JSON file listing:
- name: Sentinel-2
  url: https://earth-search.aws.element84.com/v0
  sensors: ["S2"]
  resolution: 10      # in meters
  bands: ["B04", "B03", "B02", …]
- name: PlanetScope
  url: https://api.planet.com/stac/v1
  sensors: ["PSScene4Band"]
  resolution: 3
  bands: ["B1", "B2", "B3", "B4"]
  • Metadata fields for coverage hints (global/regional, daily revisit, etc.).
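
As a rough illustration of how such a registry might be consumed, the sketch below parses the JSON form of the listing into dataclasses and filters catalogs by resolution and bands. `CatalogEntry`, `load_registry`, and `compatible` are hypothetical names, not an existing API, and the Sentinel-2 band list is shortened to the three bands the listing spells out.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CatalogEntry:
    """One registry entry; field names mirror the registry listing."""
    name: str
    url: str
    sensors: List[str] = field(default_factory=list)
    resolution: float = 0.0  # meters
    bands: List[str] = field(default_factory=list)


def load_registry(text: str) -> List[CatalogEntry]:
    """Parse a JSON registry document into CatalogEntry objects."""
    entries = [CatalogEntry(**raw) for raw in json.loads(text)]
    names = [e.name for e in entries]
    if len(names) != len(set(names)):
        raise ValueError("catalog names must be unique")
    return entries


def compatible(
    entries: List[CatalogEntry],
    max_resolution: Optional[float] = None,
    bands: Optional[List[str]] = None,
) -> List[CatalogEntry]:
    """Keep catalogs at least as fine as max_resolution that offer all bands."""
    wanted = set(bands or [])
    return [
        e for e in entries
        if (max_resolution is None or e.resolution <= max_resolution)
        and wanted.issubset(e.bands)
    ]


REGISTRY_JSON = """
[
  {"name": "Sentinel-2", "url": "https://earth-search.aws.element84.com/v0",
   "sensors": ["S2"], "resolution": 10, "bands": ["B04", "B03", "B02"]},
  {"name": "PlanetScope", "url": "https://api.planet.com/stac/v1",
   "sensors": ["PSScene4Band"], "resolution": 3,
   "bands": ["B1", "B2", "B3", "B4"]}
]
"""

registry = load_registry(REGISTRY_JSON)
print([e.name for e in compatible(registry, max_resolution=5)])  # ['PlanetScope']
```

JSON is used here only to keep the sketch dependency-free; a YAML registry would parse into the same structure with a YAML loader.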
  2. Query Engine
  • Unified function:
search(
    geometry: dict,
    datetime: str,
    resolution: Optional[float] = None,
    bands: Optional[List[str]] = None,
    max_cloud_cover: Optional[float] = None,
    catalogs: Optional[List[str]] = None,
) -> List[ItemCollection]
  • Internally loops through the prioritized catalogs, performing for each:
    1. Catalog health check (ping, or a request to /collections).
    2. Parameter compatibility check (e.g., does this catalog support resolution filtering?).
    3. Fallback to the next catalog on no‐results or errors.
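
The loop described above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `search_with_fallback`, `SearchFn`, and the `is_healthy` hook are hypothetical, and each catalog is represented by a plain callable so the orchestration can be shown without any network access. The parameter compatibility step is left out for brevity.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical type: a backend takes query parameters and returns matching items.
SearchFn = Callable[[dict], List[dict]]


def search_with_fallback(
    backends: List[Tuple[str, SearchFn]],
    params: dict,
    is_healthy: Optional[Callable[[str], bool]] = None,
) -> Tuple[Optional[str], List[dict], Dict[str, str]]:
    """Try catalogs in priority order; return the first non-empty result.

    Returns (catalog_name, items, skipped), where skipped maps each catalog
    that was passed over to the reason it was skipped.
    """
    skipped: Dict[str, str] = {}
    for name, run_search in backends:
        if is_healthy is not None and not is_healthy(name):
            skipped[name] = "unhealthy"
            continue
        try:
            items = run_search(params)
        except Exception as exc:  # any backend error triggers fallback
            skipped[name] = f"error: {exc}"
            continue
        if items:
            return name, items, skipped
        skipped[name] = "no results"
    return None, [], skipped


# Stand-in backends for demonstration only.
def broken(params: dict) -> List[dict]:
    raise ConnectionError("timeout")


def empty(params: dict) -> List[dict]:
    return []


def has_data(params: dict) -> List[dict]:
    return [{"id": "item-1"}]


name, items, skipped = search_with_fallback(
    [("A", broken), ("B", empty), ("C", has_data)], {"datetime": "2024-06-05"}
)
print(name, items, skipped)
# C [{'id': 'item-1'}] {'A': 'error: timeout', 'B': 'no results'}
```

Returning the skip reasons alongside the result lets callers see why a higher-priority catalog was bypassed, which matters when results silently come from a different provider.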

  3. Retrieval & Download

  • Standardized download API:
download(item_id: str, bands: List[str], out_dir: str)
  • Optional on‐the‐fly mosaicking, reprojection, band‐stacking.
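
A first step toward the download API might be to work out which asset goes where before fetching anything. The sketch below assumes the STAC item has already been retrieved as a dict (rather than looked up by `item_id`) and that its asset keys match the requested band names, which is not true of every catalog. `plan_downloads` is a hypothetical helper.

```python
from pathlib import Path
from typing import Dict, List


def plan_downloads(item: dict, bands: List[str], out_dir: str) -> Dict[str, Path]:
    """Map each requested band's asset href to a local target path.

    Raises KeyError if the item has no asset for one of the bands.
    """
    assets = item.get("assets", {})
    missing = [b for b in bands if b not in assets]
    if missing:
        raise KeyError(f"item {item.get('id')} lacks assets for {missing}")
    plan: Dict[str, Path] = {}
    for band in bands:
        href = assets[band]["href"]
        # Reuse the remote file extension; default to .tif if there is none.
        suffix = Path(href.split("?")[0]).suffix or ".tif"
        plan[href] = Path(out_dir) / f"{item['id']}_{band}{suffix}"
    return plan


# Made-up item for illustration; the href is a placeholder.
item = {
    "id": "S2A_1",
    "assets": {"B04": {"href": "https://example.com/S2A_1/B04.tif"}},
}
print(plan_downloads(item, ["B04"], "out"))
```

Separating planning from transfer makes it easy to fail early on missing bands and to plug in retries or authentication later.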
  4. CLI & Config
  • superstac register <catalog-config.yaml>
  • superstac search --geo=footprint.geojson --date=2024-06-05 --resolution=10
  • Configurable defaults (~/.superstac/config.toml) for credentials, preferred catalogs, etc.
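
The two commands above could be wired up with the standard library's `argparse`, as in the sketch below. Only the command names and the `--geo`, `--date`, and `--resolution` flags come from this proposal; `build_parser` and the help strings are illustrative.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with `register` and `search` subcommands."""
    parser = argparse.ArgumentParser(prog="superstac")
    sub = parser.add_subparsers(dest="command", required=True)

    register = sub.add_parser("register", help="add catalogs from a config file")
    register.add_argument("config", help="path to a catalog config (YAML/JSON)")

    search = sub.add_parser("search", help="search across registered catalogs")
    search.add_argument("--geo", required=True, help="path to a GeoJSON footprint")
    search.add_argument("--date", required=True, help="date or date range")
    search.add_argument("--resolution", type=float, default=None, help="meters")
    return parser


args = build_parser().parse_args(
    ["search", "--geo=footprint.geojson", "--date=2024-06-05", "--resolution=10"]
)
print(args.command, args.geo, args.date, args.resolution)
# search footprint.geojson 2024-06-05 10.0
```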

✅ Acceptance Criteria

  • Ability to register new STAC catalogs via config file.
  • Search function returns results from the highest‐priority catalog that yields valid results.
  • Automatic fallback to secondary catalogs on empty/error responses.
  • Support for resolution & band filtering across heterogeneous endpoints.
  • Example scripts/notebooks demonstrating use cases (illegal mining, disaster response).

📋 Use Cases

  • Illegal Mining Detection

“I need medium-resolution imagery over Region X on June 5, 2024. Sentinel/Landsat have no coverage—query PlanetScope or other high-res catalogs instead.”

  • Rapid Disaster Response

“After an earthquake, retrieve any available sub-meter or 1–3 m imagery within 24 hours from multiple providers.”

  • Time Series Analysis

“Build a stack of monthly NDVI composites for 2023 using whichever catalog offers the cleanest data each month.”

  • Operational Resilience

“If Element 84’s STAC endpoint is down, automatically fall back to alternative public catalogs (USGS, AWS, etc.) without user intervention.”

  • LLM-Assisted Query Generation

“Provide a natural-language prompt like ‘high-res imagery of deforestation in the Amazon last month’—use an LLM to translate this into spatial, temporal, and catalog-specific query parameters, then execute across the registry.”

📂 Next Steps

  • Define minimal viable catalog registry format.
  • Prototype health‐check & search orchestration logic.
  • Implement core Python API & CLI skeleton.
  • Validate against 3–5 STAC endpoints, public and commercial (AWS, Element 84, USGS, Planet).
  • Write documentation and example workflows.
