Performance improvement advice for batch API call #395
-
|
Hi everyone, I am trying to scrape weather data (temperature, wind, summary) for every day over roughly 10 years at around 95 places. I wrote a script to fetch the data in batches, because Pirate Weather only allows one API call at a time. However, the script runs really slowly, even when I test it on just 2 days in 2 places. Does anyone have advice on writing a more efficient batch script, or know of a sample script I can refer to? I tried a plain loop, and I also asked ChatGPT for a version using asyncio and aiohttp (script attached), but neither improved the performance. Any suggestions would be highly appreciated. Thank you all in advance for your help. |
Replies: 2 comments 1 reply
-
|
Thanks for opening this, and sorry about the slow reply. In general the Time Machine endpoint is pretty slow (see this comment for more info), and a recent change to the endpoint limits the number of requests depending on your plan. I'll ping @alexander0042, or you can try emailing mail@pirateweather.net to see if there's anything you can do to speed up response times on your end. |
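Since each request is slow and the request count is capped, a client script can only help by keeping a small, fixed number of requests in flight and retrying failures. Here is a minimal sketch of that pattern using only the standard library. The names `build_jobs` and `run_jobs`, and the defaults for `max_in_flight`, `retries`, and `backoff`, are made up for illustration and are not part of Pirate Weather; `fetch` is whatever coroutine you already have for a single Time Machine call (for example one built on aiohttp).

```python
import asyncio
from datetime import date, timedelta


def build_jobs(places, start, end):
    """Return one (name, lat, lon, day) tuple per place per day, end inclusive."""
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    return [(name, lat, lon, day) for name, lat, lon in places for day in days]


async def run_jobs(jobs, fetch, max_in_flight=4, retries=3, backoff=1.0):
    """Call fetch(job) for every job, with at most max_in_flight running at once.

    Returns a list of (job, result) pairs in the same order as jobs.
    A job that fails on every attempt gets a result of None.
    """
    gate = asyncio.Semaphore(max_in_flight)

    async def one(job):
        async with gate:
            for attempt in range(retries):
                try:
                    return job, await fetch(job)
                except Exception:
                    if attempt == retries - 1:
                        return job, None
                    # Wait longer after each failure; the slot stays held,
                    # which also slows the whole batch down when errors pile up.
                    await asyncio.sleep(backoff * 2 ** attempt)

    return await asyncio.gather(*(one(job) for job in jobs))
```

You would call it as `asyncio.run(run_jobs(build_jobs(places, start, end), fetch))`. Keep in mind that 95 places over about 10 years is roughly 95 × 3,650 ≈ 350,000 requests, so even at a hypothetical one request per second the full run would take around four days. Concurrency on the client side does not remove that limit.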
-
|
Hi, thanks for opening this discussion, and I'm sorry for my slow reply. There's no satisfactory answer here at the moment, since this is a limitation of how the API is structured. Because it has to check a bunch of files to complete each time machine request, it's poorly suited to time-series data retrieval. Your best bet here (by far) is going directly to the source at ECMWF. They have a great data retrieval tool for ERA5, and it's very easy to use: https://cds.climate.copernicus.eu/datasets/reanalysis-era5-single-levels?tab=overview |
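For anyone going the ERA5 route, here is a rough sketch of one monthly request through the `cdsapi` Python client. The helper names `era5_request` and `download_month` are made up for illustration, and the request keys and variable names are my assumption of what the dataset's download form generates, so check them against the "Show API request" output on the page linked above before relying on them. As far as I know, ERA5 has temperature and wind fields but no text summary like the one Pirate Weather returns.

```python
import calendar


def era5_request(year, month, north, west, south, east):
    """Build one month's request for hourly temperature and wind over a box."""
    days_in_month = calendar.monthrange(year, month)[1]
    return {
        "product_type": ["reanalysis"],
        "variable": [
            "2m_temperature",
            "10m_u_component_of_wind",
            "10m_v_component_of_wind",
        ],
        "year": [str(year)],
        "month": [f"{month:02d}"],
        "day": [f"{d:02d}" for d in range(1, days_in_month + 1)],
        "time": [f"{h:02d}:00" for h in range(24)],
        # Bounding box in degrees, ordered north, west, south, east.
        "area": [north, west, south, east],
        "data_format": "netcdf",
    }


def download_month(year, month, box, target):
    """Download one month to the file path target (needs a CDS account)."""
    # Imported here so the request builder works without cdsapi installed.
    import cdsapi

    client = cdsapi.Client()  # reads credentials from ~/.cdsapirc
    client.retrieve(
        "reanalysis-era5-single-levels",
        era5_request(year, month, *box),
        target,
    )
```

One request like this covers every grid point in the box for the whole month, so 10 years comes to about 120 requests per region instead of one call per place per day.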