1+ <!-- omit from toc -->
12# STAC Task (stac-task)
23
34[![Build Status](https://github.com/stac-utils/stac-task/workflows/CI/badge.svg?branch=main)](https://github.com/stac-utils/stac-task/actions/workflows/continuous-integration.yml)
67[![codecov](https://codecov.io/gh/stac-utils/stac-task/branch/main/graph/badge.svg)](https://codecov.io/gh/stac-utils/stac-task)
78[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
89
10+ - [Quickstart for Creating New Tasks](#quickstart-for-creating-new-tasks)
11+ - [Task Input](#task-input)
12+ - [ProcessDefinition Object](#processdefinition-object)
13+ - [UploadOptions Object](#uploadoptions-object)
14+ - [path_template](#path_template)
15+ - [collections](#collections)
16+ - [tasks](#tasks)
17+ - [TaskConfig Object](#taskconfig-object)
18+ - [Full Process Definition Example](#full-process-definition-example)
19+ - [Migration](#migration)
20+ - [0.4.x -> 0.5.x](#04x---05x)
21+ - [Development](#development)
22+ - [Contributing](#contributing)
23+
924This Python library consists of the Task class, which is used to create custom tasks based
1025on a "STAC In, STAC Out" approach. The Task class acts as a wrapper around custom code and provides
1126several convenience methods for modifying STAC Items, creating derived Items, and providing a CLI.
@@ -17,7 +32,7 @@ This library is based on a [branch of cirrus-lib](https://github.com/cirrus-geo/
1732```python
1833from typing import Any
1934
20- from stactask import Task
35+ from stactask import Task, DownloadConfig
2136
2237class MyTask(Task):
2338    name = "my-task"
@@ -30,7 +45,10 @@ class MyTask(Task):
3045        item = self.items[0]
3146
3247        # download a datafile
33-         item = self.download_item_assets(item, assets=['data'])
48+         item = self.download_item_assets(
49+             item,
50+             config=DownloadConfig(include=['data'])
51+         )
3452
3553        # operate on the local file to create a new asset
3654        item = self.upload_item_assets_to_s3(item)
@@ -41,32 +59,32 @@ class MyTask(Task):
4159
4260## Task Input
4361
44- | Field Name | Type | Description |
45- | ------------- | ---- | ----------- |
46- | type | string | Must be FeatureCollection |
47- | features | [Item] | A list of STAC `Item` |
48- | process | ProcessDefinition | A Process Definition |
62+ | Field Name | Type | Description |
63+ | ---------- | ----------------- | ------------------------- |
64+ | type | string | Must be FeatureCollection |
65+ | features | [Item] | A list of STAC `Item` |
66+ | process | ProcessDefinition | A Process Definition |
4967
5068### ProcessDefinition Object
5169
5270A STAC task can be provided additional configuration via the 'process' field in the input
5371ItemCollection.
5472
55- | Field Name | Type | Description |
56- | ------------- | ---- | ----------- |
57- | description | string | Optional description of the process configuration |
58- | upload_options | UploadOptions | Options used when uploading assets to a remote server |
59- | tasks | Map<str, Map> | Dictionary of task configurations. A List of [task configurations](#taskconfig-object) is supported for backwards compatibility reasons, but a dictionary should be preferred. |
73+ | Field Name | Type | Description |
74+ | -------------- | ------------- | ----------- |
75+ | description | string | Optional description of the process configuration |
76+ | upload_options | UploadOptions | Options used when uploading assets to a remote server |
77+ | tasks | Map<str, Map> | Dictionary of task configurations. A list of [task configurations](#taskconfig-object) is supported for backwards compatibility reasons, but a dictionary should be preferred. |
6078
6179#### UploadOptions Object
6280
63- | Field Name | Type | Description |
64- | ------------- | ---- | ----------- |
65- | path_template | string | **REQUIRED** A string template for specifying the location of uploaded assets |
66- | public_assets | [str] | A list of asset keys that should be marked as public when uploaded |
67- | headers | Map<str, str> | A set of key, value headers to send when uploading data to s3 |
68- | collections | Map<str, str> | A mapping of output collection name to a JSONPath pattern (for matching Items) |
69- | s3_urls | bool | Controls if the final published URLs should be an s3 (s3://*bucket*/*key*) or https URL |
81+ | Field Name | Type | Description |
82+ | ------------- | ------------- | ----------- |
83+ | path_template | string | **REQUIRED** A string template for specifying the location of uploaded assets |
84+ | public_assets | [str] | A list of asset keys that should be marked as public when uploaded |
85+ | headers | Map<str, str> | A set of key, value headers to send when uploading data to s3 |
86+ | collections | Map<str, str> | A mapping of output collection name to a JSONPath pattern (for matching Items) |
87+ | s3_urls | bool | Controls if the final published URLs should be an s3 (s3://*bucket*/*key*) or https URL |
7088
7189##### path_template
7290
@@ -121,10 +139,10 @@ would have `param2=value2` passed. If there were a `task-b` to be run it would n
121139
122140A Task Configuration contains information for running a specific task.
123141
124- | Field Name | Type | Description |
125- | ------------- | ---- | ----------- |
126- | name | str | **REQUIRED** Name of the task |
127- | parameters | Map<str, str> | Dictionary of keyword parameters that will be passed to the Task's `process` function |
142+ | Field Name | Type | Description |
143+ | ---------- | ------------- | ----------- |
144+ | name | str | **REQUIRED** Name of the task |
145+ | parameters | Map<str, str> | Dictionary of keyword parameters that will be passed to the Task's `process` function |
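
As a rough illustration of how the legacy list of TaskConfig objects relates to the preferred dictionary form of `tasks`, the sketch below converts one into the other. The helper name `tasks_list_to_dict` and the sample task names and parameters are made up for this example; it assumes the dictionary form is keyed by task name with the parameters as values, and it is not part of stac-task itself.

```python
from typing import Any


def tasks_list_to_dict(tasks: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Convert a list of TaskConfig-style dicts into a name -> parameters mapping.

    Hypothetical helper: assumes each entry has a "name" key and an optional
    "parameters" key, as described in the TaskConfig table.
    """
    return {task["name"]: dict(task.get("parameters", {})) for task in tasks}


# Illustrative legacy payload fragment (names and values are invented).
legacy = [
    {"name": "task-a", "parameters": {"param1": "value1"}},
    {"name": "task-c", "parameters": {"param2": "value2"}},
]

preferred = tasks_list_to_dict(legacy)
print(preferred)
```

A task without a `parameters` entry maps to an empty dictionary in this sketch.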
128146
129147## Full Process Definition Example
130148
@@ -147,6 +165,83 @@ Process definitions are sometimes called "Payloads":
147165}
148166```
149167
168+ ## Migration
169+
170+ ### 0.4.x -> 0.5.x
171+
172+ In 0.5.0, the previous use of fsspec to download Item Assets has been replaced with
173+ the stac-asset library. This has necessitated a change in the parameters
174+ that the download methods accept.
175+
176+ The primary change is that the Task methods `download_item_assets` and
177+ `download_items_assets` (items plural) now accept fewer explicit and implicit
178+ (kwargs) parameters.
179+
180+ Previously, the methods looked like:
181+
182+ ```python
183+ def download_item_assets(
184+     self,
185+     item: Item,
186+     path_template: str = "${collection}/${id}",
187+     keep_original_filenames: bool = False,
188+     **kwargs: Any,
189+ ) -> Item:
190+ ```
191+
192+ but now look like:
193+
194+ ```python
195+ def download_item_assets(
196+     self,
197+     item: Item,
198+     path_template: str = "${collection}/${id}",
199+     config: Optional[DownloadConfig] = None,
200+ ) -> Item:
201+ ```
202+
203+ Similarly, the `asset_io` package methods were previously:
204+
205+ ```python
206+ async def download_item_assets(
207+     item: Item,
208+     assets: Optional[list[str]] = None,
209+     save_item: bool = True,
210+     overwrite: bool = False,
211+     path_template: str = "${collection}/${id}",
212+     absolute_path: bool = False,
213+     keep_original_filenames: bool = False,
214+     **kwargs: Any,
215+ ) -> Item:
216+ ```
217+
218+ and are now:
219+
220+ ```python
221+ async def download_item_assets(
222+     item: Item,
223+     path_template: str = "${collection}/${id}",
224+     config: Optional[DownloadConfig] = None,
225+ ) -> Item:
226+ ```
227+
228+ Additionally, `kwargs` keys were previously used to pass configuration through to fsspec. The most common
229+ parameter was `requester_pays`, to set the Requester Pays flag in AWS S3 requests.
230+
231+ Many of these parameters can be directly translated into configuration passed in a
232+ `DownloadConfig` object, which is just a wrapper over the `stac_asset.Config` object.
233+
234+ These parameters migrate to `DownloadConfig` as follows:
235+
236+ - `assets`: set `include`
237+ - `requester_pays`: set `s3_requester_pays` to True
238+ - `keep_original_filenames`: set `file_name_strategy` to
239+   `FileNameStrategy.FILE_NAME` if True or `FileNameStrategy.KEY` if False
240+ - `overwrite`: set `overwrite`
241+ - `save_item`: no equivalent; the Item is always saved
242+ - `absolute_path`: no equivalent. To create or retrieve the Asset hrefs as absolute paths, use either
243+   `Item#make_all_asset_hrefs_absolute()` or `Asset#get_absolute_href()`
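
The mapping above can be sketched as a small translation helper. This is only an illustration: `legacy_to_download_config_kwargs` is a hypothetical name, not part of stac-task or stac-asset, and it returns plain keyword arguments (with the strategy given by member name) so that it runs without either library installed.

```python
from typing import Any, Optional


def legacy_to_download_config_kwargs(
    assets: Optional[list[str]] = None,
    requester_pays: bool = False,
    keep_original_filenames: bool = False,
    overwrite: bool = False,
) -> dict[str, Any]:
    """Translate pre-0.5 download arguments into DownloadConfig field names.

    Hypothetical helper for illustration; `save_item` and `absolute_path`
    have no counterpart and are therefore not accepted here.
    """
    kwargs: dict[str, Any] = {
        "s3_requester_pays": requester_pays,
        "overwrite": overwrite,
        # Name of the FileNameStrategy member, not the enum member itself.
        "file_name_strategy": "FILE_NAME" if keep_original_filenames else "KEY",
    }
    if assets is not None:
        kwargs["include"] = list(assets)
    return kwargs


kwargs = legacy_to_download_config_kwargs(assets=["data"], requester_pays=True)
print(kwargs)

# With stac-task 0.5.x installed, the result could then be applied roughly as
# follows (assumed usage, shown as comments so the sketch stays self-contained):
#   from stac_asset import FileNameStrategy
#   from stactask import DownloadConfig
#   kwargs["file_name_strategy"] = FileNameStrategy[kwargs["file_name_strategy"]]
#   item = self.download_item_assets(item, config=DownloadConfig(**kwargs))
```

In practice it is usually simpler to construct the `DownloadConfig` directly; the helper only makes the old-to-new correspondence explicit.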
244+
150245## Development
151246
152247Clone, install in editable mode with development requirements, and install the ** pre-commit** hooks: