fix two performance bottlenecks in the gribjump source#877
Merged
sandorkertesz merged 1 commit intodevelopfrom Jan 23, 2026
Merged
fix two performance bottlenecks in the gribjump source#877sandorkertesz merged 1 commit intodevelopfrom
sandorkertesz merged 1 commit intodevelopfrom
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## develop #877 +/- ##
========================================
Coverage 85.35% 85.35%
========================================
Files 184 184
Lines 14562 14562
Branches 732 732
========================================
Hits 12429 12429
Misses 1922 1922
Partials 211 211 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
443fbc6 to
e3c4c35
Compare
Collaborator
|
@andreas-grafberger thank you for this improvement. Please synch this PR with develop and it can be merged. |
Avoid recomputing per-field data by caching/precomputing: - Cache reference lat/lon when fetch_coords_from_fdb=True to avoid re-reading the reference field's geography per retrieved field. - Pre-convert index lists to ranges once to avoid repeated calls to ExtractionRequest.from_indices.
eea9c61 to
b44dce5
Compare
Contributor
Author
@sandorkertesz Thank you! I rebased onto develop and the gribjump tests run locally on my machine with the latest commit. From my side, either merging or waiting for #878 would be fine, which enables the gribjump tests in the CI. There are still a few hurdles I ran into, which will delay this by a couple of days. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes two small performance bottlenecks that made retrieval for large time-series unnecessarily slow:
fetch_coords_from_fdb=True, the reference latitudes and longitudes were read from the reference field again for each individually retrieved field. Originally, I somehow assumed they would be cached together with theGribMetadataobject and only noticed this issue recently during some benchmarks. I propose to simply fix this for now by explicitly caching latitudes/longitudes.indicesinstead ofrangeswere used, they would get converted into a list of ranges (by this function) again for each single field. As a simple fix, I propose that we just convert indices to ranges once directly in this source. I will follow up on this in a future PR to find a cleaner fix.I performed a few ad-hoc benchmarks for these two fixes:
ExtractionRequest.from_indices). This fix would remove this overhead.Contributor Declaration
By opening this pull request, I affirm the following: