Replies: 6 comments 2 replies
Hey @wjbgis, I appreciate this well written issue.
While nested STAC catalogs/collections are often requested by STAC users (I myself have made such requests), they're rarely necessary. I think the solution to your needs will be to model your data within a flat set of models, i.e. don't try too hard to make the API match the UI; rather, keep it closer to the database.
Would it not be sufficient to place
It sounds like you know what you're talking about, so I apologize if my suggestion is obvious.
@alukach Thanks for your reply. I see your point about keeping the API close to the database. The main issue I'm facing is that we have hundreds of heterogeneous datasets from different research teams, each with its own unique hierarchy (e.g., some use Mapping all those into unique flat properties would be a maintenance nightmare—we’d have to write and manage custom parsing/matching rules for every new dataset just to extract the right fields. To keep it manageable, I’m thinking of a middle ground:
I'm assuming a B-tree index on that path property should hold up fine. Do you think this is a solid enough workaround for this kind of heterogeneous data? Would love to hear your take before I go down this road.
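To make the B-tree assumption concrete, here is a minimal sketch of why an ordered index on a path string can answer "what is directly under this folder?" without reading every row. A sorted Python list stands in for the index, and `bisect` jumps play the role of index seeks. The function name `list_children`, the `/` delimiter, and the sample paths are all hypothetical; this is not pgstac code or anything the database does automatically.

```python
from bisect import bisect_left, bisect_right


def list_children(sorted_paths, prefix, delimiter="/"):
    """Return the distinct next-level names under `prefix`.

    `sorted_paths` stands in for an ordered index on the path property.
    After finding a child folder, we seek past its whole subtree instead
    of visiting every entry inside it.
    """
    children = []
    i = bisect_left(sorted_paths, prefix)  # first key >= prefix
    while i < len(sorted_paths) and sorted_paths[i].startswith(prefix):
        key = sorted_paths[i]
        name, sep, _ = key[len(prefix):].partition(delimiter)
        if name or sep:
            children.append(name + sep)  # trailing delimiter marks a folder
        if sep:
            # Smallest string that sorts after every key in this subtree.
            upper = prefix + name + chr(ord(delimiter) + 1)
            i = bisect_left(sorted_paths, upper, i)
        else:
            # Leaf value: skip all rows that share exactly this path.
            i = bisect_right(sorted_paths, key, i)
    return children


# Hypothetical paths following Product Name / City / Data Type / Set.
paths = sorted([
    "landcover/beijing/optical/train",
    "landcover/beijing/optical/train",
    "landcover/beijing/optical/test",
    "landcover/beijing/sar/train",
    "landcover/shanghai/optical/train",
    "roads/beijing/optical/train",
])
print(list_children(paths, "landcover/"))
```

The number of seeks grows with the number of children at that level rather than with the number of Items. Whether a real database plans a query this way is a separate question, so it would be worth checking the actual query plan before relying on it.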
Thanks for the suggestion! I'll definitely dive into the ltree extension and see how it can work for my use case.
To clarify our dataset:
For example, in a collection called
Please make sure to review #88 (comment) as well. If you proceed to work on this, please sketch out your plans for any changes to the storage model before spending too much time on them. Changes to the storage model are a big deal and cannot easily be added: they may require rewriting all of the data in the database, which could make an update prohibitive for anyone with a very large existing STAC database.
Background
I am building a data discovery and download platform using pgstac with over 3 million Items. Unlike standard satellite imagery which is primarily searched by Spatio-Temporal filters, my dataset follows a deep business logic hierarchy common in scientific research (e.g., Product Name->City->Data Type->Train/Test Sets).
The Problem
Current pgstac implementations and standard STAC API patterns make it difficult to render a "Directory Tree" UI efficiently for large-scale datasets:
Recursive Catalog Bottleneck: Implementing a hierarchical structure using nested Catalogs requires numerous recursive API calls, which is too slow for a seamless UI experience.
Aggregation Performance: Performing a SELECT DISTINCT on a path-like property (e.g., properties.path) within a 3M+ record JSONB column leads to full table scans, resulting in 5-15s latency—unacceptable for real-time navigation.
Delimiter-based Discovery Gap: There is currently no optimized way to "explore" sub-folders within a Collection without fetching all Items or hitting performance walls.
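For illustration, one way to sidestep all three bottlenecks is to keep a small folder-to-children lookup that is updated when Items are ingested, so rendering the tree never touches the Item rows. The sketch below assumes each Item carries a `/`-delimited path string; `build_folder_index` and the sample paths are hypothetical names, not part of pgstac or the STAC API.

```python
from collections import defaultdict


def build_folder_index(paths, delimiter="/"):
    """Map each folder path to the set of its immediate child names.

    The root folder is keyed by the empty string. Duplicate paths are
    harmless because children are stored in sets.
    """
    index = defaultdict(set)
    for path in paths:
        parent = ""
        for part in path.split(delimiter):
            index[parent].add(part)
            parent = part if not parent else parent + delimiter + part
    return index


# Hypothetical paths following Product Name / City / Data Type / Set.
item_paths = [
    "landcover/beijing/optical/train",
    "landcover/beijing/optical/train",
    "landcover/beijing/optical/test",
    "landcover/beijing/sar/train",
    "roads/beijing/optical/train",
]
folders = build_folder_index(item_paths)
print(sorted(folders["landcover/beijing"]))
```

The lookup has one entry per folder rather than one per Item, so it should stay small even with millions of Items. The trade-off is that it has to be kept in sync whenever Items are added or deleted.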
Request for Guidance
As my goal is to provide a seamless browsing experience for researchers to "drill down" through these logical folders, I would like to ask the community and maintainers:
Is there a recommended pattern within pgstac to handle such deep, non-spatial hierarchical navigation efficiently at scale?
Are there any plans to support features that would allow for "prefix-based" or "delimiter-based" discovery of unique property values (similar to a file system explorer) without fetching all Items?
I am very impressed with pgstac's performance on spatial queries, and I would appreciate your wisdom or any guidance on how to best bridge this gap for scientific data organization.
Thank you for your time and for this amazing project!