The pystac docs have this interesting S3/boto3-based StacIO implementation:
Lines 321 to 367 in 4dc0e0f
For example, the following code examples will allow
for reading from AWS's S3 cloud object storage using `boto3
<https://boto3.amazonaws.com/v1/documentation/api/latest/index.html>`__
or Azure Blob Storage using the `Azure SDK for Python
<https://learn.microsoft.com/en-us/python/api/overview/azure/storage-blob-readme?view=azure-python>`__:

.. tab-set::
    .. tab-item:: AWS S3

        .. code-block:: python

            from urllib.parse import urlparse
            import boto3
            from pystac import Link
            from pystac.stac_io import DefaultStacIO, StacIO
            from typing import Union, Any

            class CustomStacIO(DefaultStacIO):
                def __init__(self):
                    self.s3 = boto3.resource("s3")
                    super().__init__()

                def read_text(
                    self, source: Union[str, Link], *args: Any, **kwargs: Any
                ) -> str:
                    parsed = urlparse(source)
                    if parsed.scheme == "s3":
                        bucket = parsed.netloc
                        key = parsed.path[1:]

                        obj = self.s3.Object(bucket, key)
                        return obj.get()["Body"].read().decode("utf-8")
                    else:
                        return super().read_text(source, *args, **kwargs)

                def write_text(
                    self, dest: Union[str, Link], txt: str, *args: Any, **kwargs: Any
                ) -> None:
                    parsed = urlparse(dest)
                    if parsed.scheme == "s3":
                        bucket = parsed.netloc
                        key = parsed.path[1:]
                        self.s3.Object(bucket, key).put(Body=txt, ContentEncoding="utf-8")
                    else:
                        super().write_text(dest, txt, *args, **kwargs)

            StacIO.set_default(CustomStacIO)
I fully understand that this is not part of the core pystac project, to minimize third-party dependencies, but I wonder if there is any interest in, or plans for, extracting this from the docs and packaging it in some kind of separate/extra package?
In the openEO GeoPySpark driver, where we need this, we have for now just copy-pasted that snippet in an ad-hoc way, but it would be better for various reasons to properly decouple it.
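
As a rough sketch of what a decoupled package could start from, the URL handling that `read_text` and `write_text` both repeat in the snippet could be factored into one small helper. The name `split_s3_href` is hypothetical (it is not a pystac or boto3 API), and the helper uses only the standard library, so it can be tested without AWS credentials:

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_href(href: str) -> Tuple[str, str]:
    """Split an ``s3://bucket/key`` href into ``(bucket, key)``.

    Hypothetical helper for illustration; not part of pystac or boto3.
    """
    parsed = urlparse(href)
    if parsed.scheme != "s3":
        raise ValueError(f"not an s3:// href: {href!r}")
    # The bucket is the netloc; drop the single leading "/" from the path,
    # mirroring ``parsed.path[1:]`` in the docs snippet.
    return parsed.netloc, parsed.path[1:]


bucket, key = split_s3_href("s3://my-bucket/catalog/item.json")
print(bucket, key)
```

A custom `StacIO` subclass could then call this helper in both methods and fall back to `super()` when it raises `ValueError`, which keeps the boto3-specific part limited to the actual `get`/`put` calls.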