This repository was archived by the owner on Apr 8, 2025. It is now read-only.
snotel API does not implement pagination and thus makes joining a bit harder #77
Open
snotel does not implement pagination. For locations and parameters this is fine, since you can retrieve the entire collection in one request and it stays under the data transfer limit.
However, timeseries data cannot be fetched all at once because there is too much of it.
curl 'https://wcc.sc.egov.usda.gov/awdbRestApi/services/v1/data?elements=*&stationTriplets=*:*:SNTL'
{"timestamp":1742847449771,"status":400,"error":"Bad Request","message":"The request exceeds the maximum number of 1000 station elements.","path":"/awdbRestApi/services/v1/data"}
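One way to stay under that limit is to split the station list into smaller batches and issue one request per batch. The following is a minimal sketch, not a tested client: `batched`, `batch_urls`, and the default `batch_size` of 10 are hypothetical, and it assumes `stationTriplets` accepts a comma-separated list. Because the limit counts station elements rather than stations, a safe batch size has to be found by experiment.

```python
from typing import Iterator, List, Sequence
from urllib.parse import urlencode

BASE_URL = "https://wcc.sc.egov.usda.gov/awdbRestApi/services/v1/data"


def batched(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each with at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def batch_urls(triplets: Sequence[str], batch_size: int = 10,
               elements: str = "*") -> List[str]:
    """Build one data URL per batch of station triplets.

    Assumption: the API accepts several triplets joined by commas.
    """
    urls = []
    for batch in batched(triplets, batch_size):
        # Keep '*', ':' and ',' unescaped so the URL matches the curl examples.
        query = urlencode(
            {"elements": elements, "stationTriplets": ",".join(batch)},
            safe="*:,",
        )
        urls.append(f"{BASE_URL}?{query}")
    return urls
```

The list of triplets itself can come from the locations endpoint, which fits in a single fetch.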
If you curl a smaller subset, you can see there is no metadata about pages or period of record.
As a result, you need to do joins between API responses in a way that is a bit messy.
- There is no metadata about how many responses are in a result, nor a link to the next page, so you need to fetch the data first with a dummy range, then read the beginDate/endDate from the response and use them in another fetch.
curl 'https://wcc.sc.egov.usda.gov/awdbRestApi/services/v1/data?elements=*&stationTriplets=908:WA:SNTL&beginDate=1900-01-01%2001:01' | jq . | head -n 50
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 3911k 0 3911k 0 0 330k 0 --:--:-- 0:00:11 --:--:-- 1084k
[
  {
    "stationTriplet": "908:WA:SNTL",
    "data": [
      {
        "stationElement": {
          "elementCode": "PRCP",
          "ordinal": 1,
          "durationName": "DAILY",
          "dataPrecision": 1,
          "storedUnitCode": "in",
          "originalUnitCode": "in",
          "beginDate": "1994-09-16 00:00",
          "endDate": "2100-01-01 00:00",
          "derivedData": true
        },
        "values": [
          {
            "date": "1994-10-01",
            "value": 0.0
          },
          {
            "date": "1994-10-02",
            "value": 0.0
          },
          {
            "date": "1994-10-03",
            "value": 0.0
          },
          {
            "date": "1994-10-04",
            "value": 0.0
          },
          {
            "date": "1994-10-05",
            "value": 0.0
          },
          {
            "date": "1994-10-06",
            "value": 0.0
          },
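The two-step join described above (fetch once with a dummy range, read `beginDate`/`endDate`, then fetch again) could be sketched roughly as follows. The helper names `period_of_record` and `data_url` are hypothetical, the field names are taken from the sample response, and it is an assumption that `elements` accepts a bare element code such as `PRCP`. Note that the sample `endDate` of 2100-01-01 looks like a placeholder for a still-active series, so it may need clamping to the current date.

```python
from typing import Dict, List, Tuple
from urllib.parse import quote, urlencode

BASE_URL = "https://wcc.sc.egov.usda.gov/awdbRestApi/services/v1/data"

SeriesKey = Tuple[str, str, str]  # (stationTriplet, elementCode, durationName)


def period_of_record(payload: List[dict]) -> Dict[SeriesKey, Tuple[str, str]]:
    """Collect (beginDate, endDate) for every series in a /data response."""
    periods: Dict[SeriesKey, Tuple[str, str]] = {}
    for station in payload:
        for series in station.get("data", []):
            meta = series["stationElement"]
            key = (station["stationTriplet"], meta["elementCode"],
                   meta["durationName"])
            periods[key] = (meta["beginDate"], meta["endDate"])
    return periods


def data_url(triplet: str, element: str, begin: str, end: str) -> str:
    """Build the follow-up request for one series over an explicit range."""
    query = urlencode(
        {
            "elements": element,
            "stationTriplets": triplet,
            "beginDate": begin,
            "endDate": end,
        },
        quote_via=quote,  # encode spaces as %20, as in the curl examples
        safe="*:,",
    )
    return f"{BASE_URL}?{query}"
```

Each key returned by `period_of_record` can then drive one call to `data_url`, which keeps the join logic in a single place instead of spreading it across ad hoc requests.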
This remains to be tested more thoroughly, but I believe you can fetch all data at once as long as you aren't requesting too many stations in the same call.
time curl 'https://wcc.sc.egov.usda.gov/awdbRestApi/services/v1/data?elements=*&stationTriplets=908:WA:SNTL&beginDate=1900-01-01%2001:01&endDate=2080-01-01%2001:01' | jq .
That takes around 15 seconds, so the response must be cached to get reasonable performance.
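A simple way to avoid paying that cost repeatedly is to cache each response on disk, keyed by its URL. This is only a sketch under assumptions: `cached_fetch` and `fetch_json` are hypothetical helpers, the cache never expires, and a real integration would need some invalidation policy for stations that are still reporting.

```python
import hashlib
import json
from pathlib import Path
from typing import Any, Callable, Union
from urllib.request import urlopen


def fetch_json(url: str) -> Any:
    """Download a URL and decode the body as JSON."""
    with urlopen(url, timeout=60) as response:
        return json.load(response)


def cached_fetch(url: str, cache_dir: Union[str, Path],
                 fetch: Callable[[str], Any] = fetch_json) -> Any:
    """Return the JSON for `url`, reusing a copy on disk when one exists."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so the query string becomes a safe file name.
    name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".json"
    path = cache_dir / name
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    payload = fetch(url)
    path.write_text(json.dumps(payload), encoding="utf-8")
    return payload
```

The `fetch` parameter is injectable so the caching logic can be exercised without calling the live service.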