feat: add stac:collections to spec
#89
Draft
gadomski wants to merge 1 commit into main from issues/88-multiple-collections
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -31,11 +31,11 @@ most of the fields should be the same in STAC and in GeoParquet. | |
| | _property columns_ | _varies_ | - | Each property should use the relevant Parquet type, and be pulled out of the properties object to be a top-level Parquet field | | ||
|
|
||
| - Must be valid GeoParquet, with proper metadata. Ideally the geometry types are defined and as narrow as possible. | ||
| - Strongly recommend to only have one GeoParquet per STAC 'Collection'. Not doing this will lead to an expanded GeoParquet schema (the union of all the schemas of the collection) with lots of empty data | ||
| - Recommend to only have one GeoParquet per STAC 'Collection'. Not doing this will lead to an expanded GeoParquet schema (the union of all the schemas of the collection) with lots of empty data | ||
| - Any field in 'properties' of the STAC item should be moved up to be a top-level field in the GeoParquet. | ||
| - STAC GeoParquet does not support properties that are named such that they collide with a top-level key. | ||
| - datetime columns should be stored as a [native timestamp][timestamp], not as a string | ||
| - The Collection JSON should be included in the Parquet metadata. See [Collection JSON](#including-a-stac-collection-json-in-a-stac-geoparquet-collection) below. | ||
| - The Collection(s) JSON should be included in the Parquet metadata. See [Collection JSON](#including-one-or-more-stac-collection-json-in-a-stac-geoparquet-collection) below. | ||
| - Any other properties that would be stored as GeoJSON in a STAC JSON Item (e.g. `proj:geometry`) should be stored as a binary column with WKB encoding. This simplifies the handling of collections with multiple geometry types. | ||
|
|
||
| ### Link Struct | ||
|
|
@@ -69,12 +69,30 @@ To take advantage of Parquet's columnar nature and compression, the assets shoul | |
|
|
||
| See [Asset Object][asset] for more. | ||
|
|
||
| ## Including a STAC Collection JSON in a STAC Geoparquet Collection | ||
| ## Including one or more STAC Collection JSON in a STAC Geoparquet Collection | ||
|
|
||
| To make a stac-geoparquet file a fully self-contained representation, you can | ||
| include the Collection JSON in the Parquet metadata. If present in the [Parquet | ||
| file metadata][parquet-metadata], the key must be `stac:collection` and the | ||
| value must be a JSON string with the Collection JSON. | ||
| include one or more Collection JSON in the Parquet metadata. If present in the [Parquet | ||
| file metadata][parquet-metadata], the key must be `stac:collections` and the | ||
| value must be a JSON string with the Collections JSON as an object, keyed by the collection id. | ||
|
|
||
| Abbreviated example in JSON form: | ||
|
|
||
| ```json | ||
| { | ||
| "stac:collections": "{\"collection-a\":{\"id\":\"collection-a\",...}}" | ||
|
I've confused myself, but do we need the escaping here? I guess the text does say that the value of
||
| } | ||
| ``` | ||
|
|
||
| ### Deprecations in v1.1 | ||
|
|
||
| Prior to stac-geoparquet v1.1, this specification recommended storing a single STAC collection in a `stac:collection` metadata field. | ||
| If possible, clients should continue to support this field (with a warning) for backwards compatibility. | ||
| **If both `stac:collection` and `stac:collections` are present in the stac-geoparquet metadata, it is an error.** | ||
|
|
||
| `stac:collection` will be removed from this specification in the next breaking release. | ||
|
|
||
| See [this RFC](https://github.com/stac-utils/stac-geoparquet/issues/88) for details. | ||
|
|
||
| ## Referencing a STAC Geoparquet Collections in a STAC Collection JSON | ||
|
|
||
|
|
||
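As a rough illustration of the `stac:collections` wording in the diff above, the sketch below builds the metadata entry from a list of Collection dicts using only the Python standard library. The helper name `encode_collections` and the sample collection are hypothetical and not part of the spec or of any stac-geoparquet API. It also shows where the backslashes in the abbreviated example come from: the value is itself a JSON string, so rendering the surrounding metadata as JSON escapes the inner quotes.

```python
import json


def encode_collections(collections):
    """Build a `stac:collections` metadata entry (hypothetical helper).

    `collections` is a list of STAC Collection dicts, each with an "id".
    """
    by_id = {c["id"]: c for c in collections}
    if len(by_id) != len(collections):
        raise ValueError("duplicate collection ids")
    # The metadata value is a JSON *string* holding an object keyed by id,
    # not a nested object.
    return {"stac:collections": json.dumps(by_id, separators=(",", ":"))}


# Sample input is made up for illustration only.
metadata = encode_collections([{"id": "collection-a", "type": "Collection"}])

# Rendering the metadata mapping itself as JSON escapes the inner quotes,
# which matches the shape of the abbreviated example in the diff.
rendered = json.dumps(metadata)
print(rendered)
```

A writer would then attach this mapping to the Parquet file's key-value metadata using whatever Parquet library is in use; that step is left out here because it depends on the library.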
Maybe rephrase this as strongly recommending that the records be somewhat uniform? That gets to the core of the issue (avoiding a bloated schema, leaning on the strengths of Parquet), and whether this comes from one or many collections is secondary. So maybe something like
And does it make sense to mention the fields extension here?
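For reference, the "Deprecations in v1.1" rules proposed in the diff (prefer `stac:collections`, accept the older `stac:collection` with a warning, and treat the presence of both as an error) could be handled by a reader roughly as follows. This is a minimal sketch under stated assumptions: the function name `resolve_collections` is hypothetical, the metadata is assumed to be already decoded into a plain `dict` of strings, and returning an empty dict when neither key is present is a choice made here, not something the spec text states.

```python
import json
import warnings


def resolve_collections(metadata):
    """Return collections keyed by id from decoded Parquet metadata.

    Hypothetical helper; `metadata` is assumed to be a dict of str -> str.
    """
    has_new = "stac:collections" in metadata
    has_old = "stac:collection" in metadata
    if has_new and has_old:
        # The proposed spec text says having both keys is an error.
        raise ValueError(
            "both stac:collection and stac:collections are present"
        )
    if has_new:
        return json.loads(metadata["stac:collections"])
    if has_old:
        # Deprecated single-collection field: still read it, but warn.
        warnings.warn(
            "stac:collection is deprecated; use stac:collections",
            DeprecationWarning,
            stacklevel=2,
        )
        collection = json.loads(metadata["stac:collection"])
        return {collection["id"]: collection}
    # Assumption: no collection metadata at all is not an error.
    return {}


# Sample metadata is made up for illustration only.
sample = {"stac:collections": json.dumps({"collection-a": {"id": "collection-a"}})}
print(resolve_collections(sample))
```

Normalizing the deprecated field into the same id-keyed shape means downstream code only has to deal with one representation.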