Performance improvement advice for batch API call #395

LinhChay00 · 2024-12-28T20:16:20Z


Hi everyone,

I am trying to collect weather data (temperature, wind, and summary) for every day over roughly 10 years at around 95 locations. I wrote a script to fetch the data in batches (because Pirate Weather only allows one API call at a time). However, the script runs very slowly, even when I test it with just 2 days at 2 locations.

Does anyone have advice on writing a script that fetches the data in batches more efficiently? Or do you know of a sample batch-fetching script I could refer to? Any pointers would be welcome.

I tried writing a simple loop, and I also asked ChatGPT for a version using asyncio and aiohttp (the script is attached), but neither seems to improve performance.
script_text.txt

Any suggestions or advice would be highly appreciated. Thank you all in advance for your help.

Answered by alexander0042

Feb 26, 2025


cloneofghosts (Collaborator) · 2025-01-15T21:16:34Z

Thanks for opening this, and sorry about the slow reply. In general, the Time Machine endpoint is pretty slow (see this comment for more info), and a recent change to that endpoint limits the number of requests depending on your plan.

I'll ping @alexander0042 or you can try emailing mail@pirateweather.net to see if there's anything you can do to speed up response times on your end.
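
As a rough illustration of the client side, here is a minimal sketch that caps the number of in-flight requests with an `asyncio.Semaphore`. It is not taken from the attached script. The URL pattern, the `build_urls` and `fetch_all` helpers, the coordinates, and the `fake_fetch` stand-in (where a real aiohttp call would go) are all assumptions for illustration, so check the Pirate Weather docs for the actual Time Machine request format and your plan's limits.

```python
import asyncio
from datetime import date, timedelta

# Assumed Time Machine URL pattern; verify against the Pirate Weather docs.
BASE_URL = "https://timemachine.pirateweather.net/forecast"


def build_urls(api_key, places, start, end):
    """Return one URL per (place, day), each requested at midnight."""
    urls = []
    for lat, lon in places:
        day = start
        while day <= end:
            urls.append(f"{BASE_URL}/{api_key}/{lat},{lon},{day.isoformat()}T00:00:00")
            day += timedelta(days=1)
    return urls


async def fetch_all(urls, fetch, max_concurrent=4):
    """Run fetch(url) for every URL with at most max_concurrent calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(url):
        async with semaphore:
            return await fetch(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(url) for url in urls))


async def fake_fetch(url):
    """Hypothetical stand-in for a real HTTP request (for example via aiohttp)."""
    await asyncio.sleep(0.01)
    return {"url": url}


if __name__ == "__main__":
    places = [(59.91, 10.75), (60.39, 5.32)]  # hypothetical coordinates
    urls = build_urls("YOUR_API_KEY", places, date(2024, 1, 1), date(2024, 1, 2))
    results = asyncio.run(fetch_all(urls, fake_fetch))
    print(len(results), "responses")
```

Capping concurrency helps a script stay within a request limit, but it does not make any individual Time Machine request faster.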


alexander0042 (Maintainer) · 2025-02-26T20:43:24Z

Hi, thanks for opening this discussion, and I'm so sorry for my slow reply. There's no satisfactory answer at the moment, since this is a limitation of how the API is structured. Because it has to check a bunch of files to complete each Time Machine request, it's poorly suited to time-series data retrieval.

Your best bet here (by far) is to go directly to the source at ECMWF. They have a great data retrieval tool for ERA5, and it's very easy to use: https://cds.climate.copernicus.eu/datasets/reanalysis-era5-single-levels?tab=overview
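
For anyone following this route, here is a minimal sketch of what a month-by-month ERA5 request might look like with the `cdsapi` Python client. The helper names `build_era5_request` and `download_month` are hypothetical, and the request keys and variable names are assumptions based on the dataset's download form, so compare them with the API request that form generates. Running the download needs the `cdsapi` package and a CDS account with an API key configured.

```python
import calendar


def build_era5_request(year, month, area):
    """Build a request dict for one month of hourly ERA5 single-level data.

    area is [north, west, south, east] in degrees.
    Keys and variable names are assumptions; check the CDS download form.
    """
    days_in_month = calendar.monthrange(year, month)[1]
    return {
        "product_type": ["reanalysis"],
        "variable": [
            "2m_temperature",
            "10m_u_component_of_wind",
            "10m_v_component_of_wind",
        ],
        "year": [str(year)],
        "month": [f"{month:02d}"],
        "day": [f"{d:02d}" for d in range(1, days_in_month + 1)],
        "time": [f"{h:02d}:00" for h in range(24)],
        "area": area,
        "data_format": "netcdf",
        "download_format": "unarchived",
    }


def download_month(year, month, area, target):
    """Download one month to the file path target."""
    import cdsapi  # imported here so the request builder works without it

    client = cdsapi.Client()
    request = build_era5_request(year, month, area)
    client.retrieve("reanalysis-era5-single-levels", request, target)


if __name__ == "__main__":
    # Hypothetical bounding box; one request covers every grid point inside it.
    request = build_era5_request(2020, 2, [72, 4, 57, 32])
    print(len(request["day"]), "days requested")
```

One request like this covers a whole region and month at once, instead of one call per place and day. ERA5 is gridded reanalysis data, so a text summary like the one in the API response would have to be derived separately.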

1 reply

LinhChay00 (Author) · Feb 27, 2025

Hi Alexander,
Many thanks for your reply and suggestion. I will test it as you suggest and share the results if I get something working nicely.
