Feature/allegheny catalogs #32

Merged
sray014 merged 10 commits into Dewberry:dev from fema-ffrd:feature/allegheny_catalogs
Feb 26, 2026
@zherbz (Contributor) commented Feb 23, 2026

Ran stormhub for the Allegheny watershed and hit memory issues, even on an AWS EC2 instance with 100 GB of RAM.

The script that generates the storm catalog kept getting killed for running out of memory (OOM). The root cause was in item creation, where each item was saved to disk and then also appended to the collection in memory. With this change, item creation only saves the item to disk and returns None. The items are added to the catalog afterwards by reading them back from disk rather than holding them all in RAM.
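A minimal sketch of the pattern described above, written with only the standard library rather than the actual stormhub or pystac code. The function names `write_item` and `build_item_links`, the item dictionaries, and the directory layout are all hypothetical and only illustrate the idea of writing each item out immediately and linking it later from disk.

```python
import json
import tempfile
from pathlib import Path


def write_item(item: dict, collection_dir: Path) -> None:
    """Save one item to disk and return None so the caller keeps no reference to it."""
    item_path = collection_dir / item["id"] / f"{item['id']}.json"
    item_path.parent.mkdir(parents=True, exist_ok=True)
    item_path.write_text(json.dumps(item))


def build_item_links(collection_dir: Path) -> list[dict]:
    """Build lightweight links by reading one item JSON at a time from disk."""
    links = []
    for path in sorted(collection_dir.glob("*/*.json")):
        item = json.loads(path.read_text())  # only this item is held in memory
        links.append(
            {
                "rel": "item",
                "href": path.relative_to(collection_dir).as_posix(),
                "id": item["id"],
            }
        )
    return links


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for i in range(3):
            # Hypothetical item payloads; real items would be much larger.
            write_item({"id": f"storm-{i}"}, root)
        print(build_item_links(root))
```

Under these assumptions, peak memory depends on the size of a single item plus the list of links, not on the total number of items created.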

The only changes in this PR that need to be merged are in the storm_catalog.py file.

Comment on lines +597 to +616
    if item.bbox:
        min_x = item.bbox[0] if min_x is None else min(min_x, item.bbox[0])
        min_y = item.bbox[1] if min_y is None else min(min_y, item.bbox[1])
        max_x = item.bbox[2] if max_x is None else max(max_x, item.bbox[2])
        max_y = item.bbox[3] if max_y is None else max(max_y, item.bbox[3])

    if item.datetime is not None:
        min_time = item.datetime if min_time is None else min(min_time, item.datetime)
        max_time = item.datetime if max_time is None else max(max_time, item.datetime)

if collection is None:
    raise ValueError(f"No item JSON files found in: {collection_dir}")

if min_x is not None and min_time is not None:
    collection.extent = pystac.Extent(
        spatial=pystac.SpatialExtent(bboxes=[[min_x, min_y, max_x, max_y]]),
        temporal=pystac.TemporalExtent(intervals=[[min_time, max_time]]),
    )
else:
    logging.warning("Unable to compute extent for collection '%s'.", collection_id)

Collaborator

Using the 'update_extent_from_items()' method is probably a better, cleaner way to set the extents.

Contributor Author

Now implemented.

@sray014 (Collaborator) commented Feb 26, 2026

Is there a need for the .gitkeep and params-config.json files?

@zherbz (Contributor, Author) commented Feb 26, 2026

> Is there a need for the .gitkeep and params-config.json files?

The .gitkeep is for the example catalog: the catalogs directory will hold other catalog datasets, but per the .gitignore only the example is meant to be tracked. I added params-config.json as an example config, since there is currently none for people to reference.

zherbz requested a review from sray014 February 26, 2026 17:48
sray014 merged commit 0e1fa4b into Dewberry:dev Feb 26, 2026
3 checks passed
sray014 deleted the feature/allegheny_catalogs branch February 26, 2026 18:46