Commit 538f8e6

Add processing:version and processing:datetime (#32)
* Add processing:version and processing:datetime
* Various clarifications, added new rel type
1 parent 4057e76 commit 538f8e6

5 files changed: +97 -40 lines changed

CHANGELOG.md

Lines changed: 11 additions & 0 deletions
@@ -8,7 +8,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Added

+- `processing:version` field to describe the primary software version or workflow version that produced the data
+- `processing:datetime` field to describe when the processing happened
 - `processing-execution` relation type to link to the processing execution that produced the data.
+- `processing-software` relation type to link to a resource that identifies the software and versions used for processing the data.
+
+### Changed
+
+### Deprecated
+
+### Removed
+
+### Fixed

 ## [v1.1.0] - 2022-01-07

README.md

Lines changed: 47 additions & 8 deletions
@@ -22,38 +22,76 @@ and therefore are shared across all items, it is recommended adding the fields t
 - [JSON Schema](json-schema/schema.json)
 - [Changelog](./CHANGELOG.md)

-## Item Properties and Collection Provider Fields
+## Fields

 | Field Name | Type | Description |
 | ----------------------- | ------------------- | ----------- |
 | processing:expression | [Expression Object](#expression-object) | An expression or processing chain that describes how the data has been processed. Alternatively, you can also link to a processing chain with the relation type `processing-expression` (see below). |
 | processing:lineage | string | Lineage Information provided as free text information about the how observations were processed or models that were used to create the resource being described [NASA ISO](https://wiki.earthdata.nasa.gov/display/NASAISO/Lineage+Information). For example, `GRD Post Processing` for "GRD" product of Sentinel-1 satellites. [CommonMark 0.29](https://commonmark.org/) syntax MAY be used for rich text representation. |
 | processing:level | string | The name commonly used to refer to the processing level to make it easier to search for product level across collections or items. The short name must be used (only `L`, not `Level`). See the [list of suggested processing levels](#suggested-processing-levels). |
 | processing:facility | string | The name of the facility that produced the data. For example, `Copernicus S1 Core Ground Segment - DPA` for product of Sentinel-1 satellites. |
-| processing:software | Map<string, string> | A dictionary with name/version for key/value describing one or more softwares that produced the data. For example, `"Sentinel-1 IPF":"002.71"` for the software that produces Sentinel-1 satellites data. |
+| processing:datetime | string | Processing date and time of the corresponding data formatted according to [RFC 3339, section 5.6](https://tools.ietf.org/html/rfc3339#section-5.6), in UTC. |
+| processing:version | string | The version of the primary processing software or processing chain that produced the data. For example, this could be the processing baseline for the Sentinel missions. |
+| processing:software | Map<string, string> | A dictionary with name/version for key/value describing one or more applications or libraries that were involved during the production of the data for provenance purposes. |

-These fields can be used in a variety of places:
+The fields in the table above can be used in these parts of STAC documents:
+- [ ] Catalogs
+- [ ] Collections
+- [x] [Collection Provider](https://github.com/radiantearth/stac-spec/blob/master/collection-spec/collection-spec.md#provider-object)
+- [x] Item Properties (incl. Summaries in Collections)
+- [x] Assets (for both Collections and Items, incl. Item Asset Definitions in Collections)
+- [ ] Links
+
+In more detail, the following restrictions apply:

 1. Items:
-   - The fields are placed in the properties. At least one field is required to be present.
+   - The fields are usually placed in the properties. At least one field is required to be present.
    - Additionally, STAC allows all fields to be used in the Asset Object.

 2. Collections:
    - The fields are usually placed in the [Provider Objects](https://github.com/radiantearth/stac-spec/blob/master/collection-spec/collection-spec.md#provider-object)
      for the `providers` that have the role `producer` or `processor` assigned.
      They don't need to be provided for all providers of the respective role.
    - The fields can also be used in `summaries`, Collection `assets` or Item asset definitions (`item_assets`).
+     Please note that the JSON Schema is not able to validate the values of Collection summaries.

-If the extension is given in the `stac_extensions` list, at least one of the fields must be specified in any of the given places listed above.
-Please note that the JSON Schema is not be able to validate the values of Collection summaries.
+If the extension is given in the `stac_extensions` list, at least one of the fields must be specified in any of the given places listed above.

 ### Processing Date Time

-The time of the processing is directly specified via the `created` properties of the target asset as specified in the [STAC Common metadata](https://github.com/radiantearth/stac-spec/blob/master/item-spec/common-metadata.md#date-and-time)
+The time of the processing can be specified as a global field in `processing:datetime`,
+but it can also be specified directly and individually via the `created` properties of the target asset
+as specified in the [STAC Common metadata](https://github.com/radiantearth/stac-spec/blob/master/item-spec/common-metadata.md#date-and-time).
+
+`created` in Item properties describes the STAC metadata creation and in Assets it describes the creation of the data files.
+Thus the timestamps provided in Item Properties for `created` and `processing:datetime` may differ.
+As Item properties are easier to be indexed and used for filtering purposes, `processing:datetime` exists.
+`created` and `processing:datetime` should usually be the same value in Assets and as such `processing:datetime`
+can usually be omitted.
+
+### Version Numbers
+
+Three fields exist for version numbers:
+- `processing:software`
+- `processing:version`
+- `version` (in the [Version extension](https://github.com/stac-extensions/version))
+
+The different fields exist to give data providers more flexibility depending on their needs.
+
+In Item Properties:
+- `processing:version` is useful if a single version number is available for the metadata or data that users should be able to filter on.
+  A popular example for this is the processing baseline in Sentinel missions.
+- `processing:software` is used if the software libraries/tools are important to know, but it's not important to filter on them.
+  They are mostly informative and important to be complete for reproducibility purposes.
+  Thus, the values in the object can not just be version numbers, but also be e.g. tag names, commit hashes or similar.
+  For example, you could expose a simplified version of the `Pipfile.lock` (Python) or `package-lock.json` (NodeJS).
+  If you need more information, you could also link to such files via the relation type `processing-software`.
+- `version` is usually not used in the context of processing and describes the version of the metadata.

 ### Linking the Items

-In Items that declare this `processing` extension, it is recommended to add one or more [Links](https://github.com/radiantearth/stac-spec/blob/master/item-spec/item-spec.md#relation-types) with `derived_from` or `via` relationships to the eventual source metadata & data used in the processing. They could be used to trace back the processing history of the dataset.
+In Items that declare this `processing` extension, it is recommended to add one or more [Links](https://github.com/radiantearth/stac-spec/blob/master/item-spec/item-spec.md#relation-types) with `derived_from` or `via` relationships to the eventual source metadata & data used in the processing.
+They could be used to trace back the processing history of the dataset.

 ### Suggested Processing Levels

@@ -99,6 +137,7 @@ The following types should be used as applicable `rel` types in the
 | derived_from | URL to a STAC Item that was used as input data in the creation of this Item. |
 | processing-expression | A processing chain (or script) that describes how the data has been processed. |
 | processing-execution | URL to any resource representing the processing execution (e.g. OGC Process API). |
+| processing-software | URL to any resource that identifies the software and versions used for processing the data, e.g. a `Pipfile.lock` (Python) or `package-lock.json` (NodeJS). |

 ## Contributing

examples/collection.json

Lines changed: 4 additions & 6 deletions
@@ -17,11 +17,9 @@
       ],
       "url": "https://sentinel.esa.int/web/sentinel/user-guides/sentinel-2-msi",
       "processing:lineage": "Generation of Level-1C User Product",
-      "processing:level": "L1C",
+      "processing:level": "L1",
       "processing:facility": "Copernicus S2 Processing and Archiving Facility",
-      "processing:software": {
-        "IPF-S2L1C": "02.06"
-      }
+      "processing:version": "02.06"
     },
     {
       "name": "Processing Corp.",
@@ -82,8 +80,8 @@
       60
     ],
     "processing:level": [
-      "L1C",
-      "L2A"
+      "L1",
+      "L2"
     ]
   },
   "links": [

examples/item.json

Lines changed: 3 additions & 2 deletions
@@ -29,11 +29,12 @@
     ],
     "sar:product_type": "GRD",
     "processing:lineage": "GRD Post Processing",
-    "processing:level": "L1C",
+    "processing:level": "L1",
     "processing:facility": "Copernicus S1 Core Ground Segment - DPA",
     "processing:software": {
       "Sentinel-1 IPF": "002.71"
-    }
+    },
+    "processing:datetime": "2016-08-23T00:30:33Z"
   },
   "links": [
     {
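
The README text added in this commit suggests exposing a simplified `Pipfile.lock` as `processing:software`, alongside entries like the `"Sentinel-1 IPF": "002.71"` value in this example Item, and linking to the full file with the new `processing-software` relation type. A rough Python sketch of that idea, assuming the usual `Pipfile.lock` layout in which the `default` section maps package names to objects with a `version` string such as `==1.2.3`. The function name `software_from_pipfile_lock`, the sample packages, and the link URL are placeholders for illustration and are not part of the extension.

```python
def software_from_pipfile_lock(lock: dict) -> dict:
    """Reduce a parsed Pipfile.lock to a name -> version map (hypothetical helper)."""
    software = {}
    for name, meta in lock.get("default", {}).items():
        version = meta.get("version")
        if version:  # entries without a pinned version are skipped in this sketch
            software[name] = version.lstrip("=")
    return software


# Illustrative lock content; in practice this would come from json.load() on the file.
lock = {
    "default": {
        "numpy": {"version": "==1.26.4", "hashes": []},
        "rasterio": {"version": "==1.3.9", "hashes": []},
    }
}

# Item properties fragment combining the filterable version with the software map.
properties = {
    "processing:version": "02.06",
    "processing:software": software_from_pipfile_lock(lock),
}

# Link to the full lock file via the new relation type (placeholder URL).
link = {
    "rel": "processing-software",
    "href": "https://example.com/processing/Pipfile.lock",
    "type": "application/json",
}
```

Keeping only name and version in `processing:software` matches the `Map<string, string>` type from the field table, while the link carries the full detail for anyone who needs it.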

json-schema/schema.json

Lines changed: 32 additions & 24 deletions
@@ -3,6 +3,18 @@
   "$id": "https://stac-extensions.github.io/processing/v1.1.0/schema.json#",
   "title": "Processing Extension",
   "description": "STAC Processing Extension for STAC Items and STAC Collections.",
+  "type": "object",
+  "required": [
+    "stac_extensions"
+  ],
+  "properties": {
+    "stac_extensions": {
+      "type": "array",
+      "contains": {
+        "const": "https://stac-extensions.github.io/processing/v1.1.0/schema.json"
+      }
+    }
+  },
   "anyOf": [
     {
       "$comment": "This is the schema for STAC Items.",
@@ -33,12 +45,7 @@
             "$ref": "#/definitions/fields"
           }
         }
-      },
-      "allOf": [
-        {
-          "$ref": "#/definitions/stac_extensions"
-        }
-      ]
+      }
     },
     {
       "$comment": "This is the schema for STAC Collections.",
@@ -72,11 +79,6 @@
           }
         }
       },
-      "allOf": [
-        {
-          "$ref": "#/definitions/stac_extensions"
-        }
-      ],
       "anyOf": [
         {
           "$comment": "Requires at least one provider to contain processing fields.",
@@ -170,18 +172,6 @@
   ],
   "definitions": {
     "stac_extensions": {
-      "type": "object",
-      "required": [
-        "stac_extensions"
-      ],
-      "properties": {
-        "stac_extensions": {
-          "type": "array",
-          "contains": {
-            "const": "https://stac-extensions.github.io/processing/v1.1.0/schema.json"
-          }
-        }
-      }
     },
     "require_provider_role": {
       "type": "object",
@@ -206,7 +196,9 @@
         {"type": "object", "required": ["processing:lineage"]},
         {"type": "object", "required": ["processing:level"]},
         {"type": "object", "required": ["processing:facility"]},
-        {"type": "object", "required": ["processing:software"]}
+        {"type": "object", "required": ["processing:software"]},
+        {"type": "object", "required": ["processing:version"]},
+        {"type": "object", "required": ["processing:datetime"]}
       ]
     },
     "fields": {
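
The list of `required` alternatives in the hunk above amounts to an "at least one processing field" rule. A minimal Python sketch of that rule, not the extension's actual validator: the helper names `has_processing_field` and `item_has_processing_field` are hypothetical, the field list is copied from the README table in this commit, and checking Item `properties` plus `assets` is an assumption based on the README's list of allowed places.

```python
# Field names as listed in the README table of this commit.
PROCESSING_FIELDS = (
    "processing:expression",
    "processing:lineage",
    "processing:level",
    "processing:facility",
    "processing:datetime",
    "processing:version",
    "processing:software",
)


def has_processing_field(obj: dict) -> bool:
    """Return True if the object carries at least one processing field."""
    return any(name in obj for name in PROCESSING_FIELDS)


def item_has_processing_field(item: dict) -> bool:
    """Check Item properties first, then every asset (assumed lookup order)."""
    if has_processing_field(item.get("properties", {})):
        return True
    return any(
        has_processing_field(asset) for asset in item.get("assets", {}).values()
    )
```

For real validation, running the published JSON Schema through a schema validator is the safer route; this sketch only illustrates the presence check.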
@@ -257,6 +249,22 @@
             "Copernicus S1 Core Ground Segment - DPA"
           ]
         },
+        "processing:version": {
+          "title": "Processing Version",
+          "type": "string",
+          "examples": [
+            "0.2.0"
+          ]
+        },
+        "processing:datetime": {
+          "title": "Processing Datetime",
+          "type": "string",
+          "format": "date-time",
+          "pattern": "(\\+00:00|Z)$",
+          "examples": [
+            "2020-01-05T12:34:55Z"
+          ]
+        },
         "processing:software": {
           "title": "Processing Software Name / version",
           "type": "object",
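
The `processing:datetime` definition added above combines `"format": "date-time"` with a pattern that forces a `Z` or `+00:00` suffix. A small Python sketch of an equivalent check, under stated assumptions: the helper names are hypothetical, and `datetime.fromisoformat` only approximates RFC 3339 (for example, older Python versions accept fewer fractional-second forms), so this is an illustration rather than a conformance test.

```python
import re
from datetime import datetime, timedelta, timezone

# Same suffix pattern as the schema: the value must end in "Z" or "+00:00".
UTC_SUFFIX = re.compile(r"(\+00:00|Z)$")


def is_valid_processing_datetime(value: str) -> bool:
    """Approximate check for an RFC 3339 timestamp in UTC (hypothetical helper)."""
    if not UTC_SUFFIX.search(value):
        return False
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so rewrite it to the equivalent "+00:00" offset first.
    normalized = value[:-1] + "+00:00" if value.endswith("Z") else value
    try:
        parsed = datetime.fromisoformat(normalized)
    except ValueError:
        return False
    return parsed.utcoffset() == timedelta(0)


def format_processing_datetime(moment: datetime) -> str:
    """Render a timezone-aware datetime as a UTC string ending in "Z"."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

Passing a timezone-aware `datetime` to `format_processing_datetime` matters: `astimezone` treats a naive value as local time, which may not be what the producer intended.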
