Description
Before reading this article, read this email to FRED that proposes expanded definitions for real-time periods and vintage dates. The proposed definitions are used in this article since they are the only meaningful and consistent way to use the API and interpret its output.
In this article we arbitrarily select the series GNPCA, the observation period 2018-01-01, and a real-time period between 2019-05-01 and 2020-05-01. We ask the FRED API this question: Within the example period of interest (aka real-time period) between 2019-05-01 and 2020-05-01, what values did we know for the GNPCA 2018-01-01 observation period, and when were those values released?
These are some of the parameters we will use to construct queries. We will discover more parameters as we progress:
series_id=GNPCA
realtime_start=2019-05-01 (aka period of interest start)
realtime_end=2020-05-01 (aka period of interest end)
observation_start=2018-01-01
observation_end=2018-01-01
Note that we are asking two questions of the API: 1.) What values for the 2018-01-01 observation period were known within our period of interest, and 2.) when were those values released?
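The parameters listed above can be assembled into a query URL programmatically. The following is a minimal Python sketch, not an official client: the helper name fred_url is hypothetical, and the api_key value 123 is the same placeholder used in the example URLs in this article. Parameters discovered later in the article can be passed the same way.

```python
from urllib.parse import urlencode

# Base address of the FRED API, taken from the example URLs in this article.
FRED_BASE = "https://api.stlouisfed.org/fred"


def fred_url(endpoint, api_key, **params):
    """Build a FRED query URL (hypothetical helper, not part of any FRED library).

    Keyword arguments become query parameters in the order they are given;
    api_key is appended last.
    """
    query = urlencode({**params, "api_key": api_key})
    return f"{FRED_BASE}/{endpoint}?{query}"


# The parameters we have at hand so far.
url = fred_url(
    "series/observations",
    api_key="123",  # placeholder key, replace with a real one
    series_id="GNPCA",
    realtime_start="2019-05-01",  # period of interest start
    realtime_end="2020-05-01",    # period of interest end
    observation_start="2018-01-01",
    observation_end="2018-01-01",
)
print(url)
```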
Before walking through the API to get the desired information, download a spreadsheet for GNPCA from ALFRED. Select All vintages, output type 2 by Vintage date all vintages. We will use this spreadsheet to determine in advance the responses we expect to see from the API.
Look at the row for the 2018-01-01 observation period. Note that the initial value for that observation period was released on 2019-03-28 (a vintage date). That date is prior to the start of the example real-time period, but it is the only release up to that point, so it is the one in effect at the start of the real-time period. The only revision prior to the end of the example real-time period occurred on 2019-07-26 (a vintage date).
In summary, the correct response we expect to see from the API to the two questions posed above is:
Vintage 2019-03-28 - 18815.882 // initial release
Vintage 2019-07-26 - 18897.80 // revision, in effect until end of period of interest
Query the API using the parameters we have at hand
The query below is constructed using the observations API. The start and end dates of the period of interest are passed as the real-time start and end date parameters. Output type 1 is selected.
https://api.stlouisfed.org/fred/series/observations?series_id=GNPCA&realtime_start=2019-05-01&realtime_end=2020-05-01&observation_start=2018-01-01&observation_end=2018-01-01&output_type=1&api_key=123
<observations realtime_start="2019-05-01" realtime_end="2020-05-01" observation_start="2018-01-01" observation_end="2018-01-01" units="lin" output_type="1" file_type="xml" order_by="observation_date" sort_order="asc" count="2" offset="0" limit="100000">
<observation realtime_start="2019-05-01" realtime_end="2019-07-25" date="2018-01-01" value="18815.882"/>
<observation realtime_start="2019-07-26" realtime_end="2020-05-01" date="2018-01-01" value="18897.8"/>
</observations>
The above response is incorrect for several reasons. First, it returns the start of the example period of interest (2019-05-01) as a release date for the value 18815.882. The correct release date (vintage date) of that value is 2019-03-28. Second, the one vintage date that it returns correctly is mislabeled as a real-time start date. There is only one correct real-time start date and it is 2019-05-01.
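The first problem can be detected mechanically. The sketch below parses the output type 1 response shown above with Python's standard xml.etree.ElementTree module and flags any row whose realtime_start equals the real-time start we requested, since by the reasoning above such a date cannot be trusted as a release date. The function name parse_observations and the "suspect" label are illustrative choices, not FRED terminology.

```python
import xml.etree.ElementTree as ET

# The output_type=1 response shown above, reproduced verbatim.
RESPONSE = """<observations realtime_start="2019-05-01" realtime_end="2020-05-01" observation_start="2018-01-01" observation_end="2018-01-01" units="lin" output_type="1" file_type="xml" order_by="observation_date" sort_order="asc" count="2" offset="0" limit="100000">
<observation realtime_start="2019-05-01" realtime_end="2019-07-25" date="2018-01-01" value="18815.882"/>
<observation realtime_start="2019-07-26" realtime_end="2020-05-01" date="2018-01-01" value="18897.8"/>
</observations>"""


def parse_observations(xml_text):
    """Return one dict per <observation> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return [
        {
            "realtime_start": obs.get("realtime_start"),
            "realtime_end": obs.get("realtime_end"),
            "date": obs.get("date"),
            "value": float(obs.get("value")),
        }
        for obs in root.iter("observation")
    ]


requested_start = "2019-05-01"
rows = parse_observations(RESPONSE)

# A row that starts exactly on the requested real-time start may carry the
# query boundary rather than a true vintage date.
suspect = [r for r in rows if r["realtime_start"] == requested_start]

for row in rows:
    label = "suspect" if row in suspect else "vintage date"
    print(row["realtime_start"], row["value"], label)
```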
Same query with output_type 2:
https://api.stlouisfed.org/fred/series/observations?series_id=GNPCA&realtime_start=2019-05-01&realtime_end=2020-05-01&observation_start=2018-01-01&observation_end=2018-01-01&output_type=2&api_key=123
<observations realtime_start="2019-05-01" realtime_end="2020-05-01" observation_start="2018-01-01" observation_end="2018-01-01" units="lin" output_type="2" file_type="xml" order_by="observation_date" sort_order="asc" count="1" offset="0" limit="100000">
<observation date="2018-01-01" GNPCA_20190501="18815.882" GNPCA_20190726="18897.8" GNPCA_20200326="18897.8" GNPCA_20200501="18897.8"/>
</observations>
The above response is also incorrect. It too commingles real-time period boundaries with actual vintage dates. The response neglects to report the actual date the observation was initially released (2019-03-28) and instead reports the start of the real-time period as the vintage date. In addition, because the attribute names vary with the dates returned, most deserializers cannot parse this format into a statically defined object.
Correct way to use the API
As demonstrated above, the FRED API becomes confused when trying to differentiate between real-time dates and vintage dates. Fortunately, there is a way to use the API to obtain the desired information. It is lengthy, but it can be done.
Step 1: Get dates when information was released about observation period 2018-01-01
The question we need to ask FRED is "Within the real-time period between 2019-05-01 and 2020-05-01, what vintages existed or were created that impacted our knowledge of GNPCA for the observation period of 2018-01-01?". To ask this question we construct a vintage date query using the example real-time start and end dates:
https://api.stlouisfed.org/fred/series/vintagedates?series_id=GNPCA&realtime_start=2019-05-01&realtime_end=2020-05-01&offset=0&api_key=123
<vintage_dates realtime_start="2019-05-01" realtime_end="2020-05-01" order_by="vintage_date" sort_order="asc" count="2" offset="0" limit="10000">
<vintage_date>2019-07-26</vintage_date>
<vintage_date>2020-03-26</vintage_date>
</vintage_dates>
The FRED documentation is not clear about what this query should return, so we cannot say definitively whether the response is right or wrong. The response above appears to answer the question "What vintages were released between 2019-05-01 and 2020-05-01?". Of course, this is not the question we intended to ask. We know that this query is not useful for our purpose because it does not return the date of the vintage that was in effect on 2019-05-01.
Unfortunately, there is no way to query FRED for vintages that are effective within a real-time period. The only way to get the vintages we need is to request all vintages, manually or programmatically scan the list, and select the vintages that were in effect during the period of interest (aka real-time period).
https://api.stlouisfed.org/fred/series/vintagedates?series_id=GNPCA&api_key=123
<vintage_dates realtime_start="1776-07-04" realtime_end="9999-12-31" order_by="vintage_date" sort_order="asc" count="181" offset="0" limit="10000">
// snip
<vintage_date>2019-03-28</vintage_date>
<vintage_date>2019-07-26</vintage_date>
<vintage_date>2020-03-26</vintage_date>
// snip
</vintage_dates>
The first vintage in the list (2019-03-28) is included because it was in effect on the first day of the example real-time period (2019-05-01). Vintage 2020-03-26 was the last vintage to be released before the end of the example real-time period (2020-05-01).
These vintage dates give us an additional query parameter we can use:
vintage_dates=2019-03-28,2019-07-26,2020-03-26
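The scan described in this step can be done programmatically. The sketch below is one possible approach, assuming the full list of vintage dates has already been fetched as ISO date strings: it keeps the last vintage released on or before the period start (the one in effect on the first day) plus every vintage released up to the period end. The function name vintages_in_effect is hypothetical, and the first and last dates in the sample list are made-up placeholders standing in for the snipped portions of the response.

```python
from bisect import bisect_right


def vintages_in_effect(all_vintages, period_start, period_end):
    """Select vintages in effect at any time within [period_start, period_end].

    Dates are ISO strings (YYYY-MM-DD), which sort chronologically when
    compared as plain strings.
    """
    vintages = sorted(all_vintages)
    # Index of the first vintage released strictly after the period start.
    first_after_start = bisect_right(vintages, period_start)
    # Index of the first vintage released strictly after the period end.
    first_after_end = bisect_right(vintages, period_end)
    # Step back one position to include the vintage already in effect on the
    # first day of the period, if there is one.
    begin = max(first_after_start - 1, 0)
    return vintages[begin:first_after_end]


all_vintages = [
    "2000-01-01",  # placeholder for the snipped earlier vintages
    "2019-03-28",
    "2019-07-26",
    "2020-03-26",
    "2030-01-01",  # placeholder for the snipped later vintages
]

selected = vintages_in_effect(all_vintages, "2019-05-01", "2020-05-01")
print("vintage_dates=" + ",".join(selected))
# prints: vintage_dates=2019-03-28,2019-07-26,2020-03-26
```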
Step 2: Use vintage dates to construct a query
The following query can now be constructed:
https://api.stlouisfed.org/fred/series/observations?series_id=GNPCA&vintage_dates=2019-03-28,2019-07-26,2020-03-26&observation_start=2018-01-01&observation_end=2018-01-01&output_type=3&api_key=123
<observations realtime_start="2019-03-28" realtime_end="2020-03-26" observation_start="2018-01-01" observation_end="2018-01-01" units="lin" output_type="3" file_type="xml" order_by="observation_date" sort_order="asc" count="1" offset="0" limit="100000">
<observation date="2018-01-01" GNPCA_20190328="18815.882" GNPCA_20190726="18897.8"/>
</observations>
The response above returns the correct vintage of 2019-03-28, which indicates when the information in effect at the start of the example real-time period was released. The vintage when the value was revised is also correctly reported. You will need to deserialize the XML or JSON by hand or write code to do it, since most deserializers cannot parse this format into a statically defined object.
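Writing that code is not difficult. The sketch below shows one way to do it in Python with the standard xml.etree.ElementTree module: it reads the dynamically named attributes of each observation element and turns every SERIESID_YYYYMMDD attribute into a vintage date and value pair. The function name parse_vintage_columns is hypothetical, and the sketch assumes every such attribute holds a numeric value, as in the response above.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# The output_type=3 response shown above, reproduced verbatim.
RESPONSE = """<observations realtime_start="2019-03-28" realtime_end="2020-03-26" observation_start="2018-01-01" observation_end="2018-01-01" units="lin" output_type="3" file_type="xml" order_by="observation_date" sort_order="asc" count="1" offset="0" limit="100000">
<observation date="2018-01-01" GNPCA_20190328="18815.882" GNPCA_20190726="18897.8"/>
</observations>"""


def parse_vintage_columns(xml_text, series_id):
    """Map observation date -> {vintage date -> value} (hypothetical helper)."""
    prefix = series_id + "_"
    result = {}
    for obs in ET.fromstring(xml_text).iter("observation"):
        by_vintage = {}
        for name, raw in obs.attrib.items():
            if not name.startswith(prefix):
                continue  # skip fixed attributes such as "date"
            # The attribute suffix is the vintage date in YYYYMMDD form.
            vintage = datetime.strptime(name[len(prefix):], "%Y%m%d").date()
            by_vintage[vintage.isoformat()] = float(raw)
        # Keep vintages in chronological order.
        result[obs.get("date")] = dict(sorted(by_vintage.items()))
    return result


values = parse_vintage_columns(RESPONSE, "GNPCA")
for vintage, value in values["2018-01-01"].items():
    print(f"Vintage {vintage} - {value}")
```

The nested dictionary keeps the vintage dates as data rather than as field names, which is what lets the result be handled by ordinary, statically typed code downstream.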