Commit 3d7106d

qunderriner and Finn Roberts authored
add geographic_instances to time series (#103)
* add geographic_instances to time series
* fixed geographic_instances docstring
* Update geog extent docs for TSTs
* Black

---------

Co-authored-by: Finn Roberts <robe2037@umn.edu>
1 parent 1791bf9 commit 3d7106d

4 files changed: +46 -37 lines changed


docs/source/change-log.rst

Lines changed: 7 additions & 0 deletions
@@ -10,6 +10,13 @@ This project adheres to `Semantic Versioning`_.
 
 .. _Semantic Versioning: http://semver.org/
 
+0.6.1
+-----
+
+* The IPUMS API now supports geographic extent selection for NHGIS time series tables.
+  Accordingly, a `geographic_instances` attribute has been added to the
+  :py:class:`~ipumspy.api.metadata.TimeSeriesTableMetadata` class to store retrieved
+  geographic extent metadata for a given time series table.
 
 0.6.0
 -----

docs/source/ipums_api/ipums_api_aggregate/index.rst

Lines changed: 32 additions & 32 deletions
@@ -151,37 +151,6 @@ The returned object will contain the metadata for the requested dataset. For exa
 
 You can also request metadata for individual data tables using the same workflow with the :class:`DataTableMetadata <ipumspy.api.metadata.DataTableMetadata>` data class.
 
-Geographic Extent Selection
-+++++++++++++++++++++++++++
-
-When working with small geographies it can be computationally intensive to work with
-nationwide data. To avoid this problem, you can request data from a specific geographic area
-using the ``geographic_extents`` argument.
-
-The following extract requests ACS 5-year sex-by-age counts at the census block group level, but
-only includes block groups that fall within Alabama and Arkansas (identified by their FIPS codes with
-a trailing 0):
-
-.. code:: python
-
-    extract = AggregateDataExtract(
-        collection="nhgis",
-        description="Extent selection example",
-        datasets=[
-            Dataset(name="2018_2022_ACS5a", data_tables=["B01001"], geog_levels=["blck_grp"]),
-            Dataset(name="2017_2021_ACS5a", data_tables=["B01001"], geog_levels=["blck_grp"])
-        ],
-        geographic_extents=["010", "050"]
-    )
-
-.. tip::
-    You can see available extent selection API codes, if any, in the ``geographic_instances`` attribute of
-    a submitted :class:`DatasetMetadata <ipumspy.api.metadata.DatasetMetadata>` object.
-
-Note that extent selection is *not* a dataset-specific parameter. That is, the selected extents
-are applied to all datasets in the extract. It is not possible to request different extents for different
-datasets in a single extract.
-
 Time Series Tables
 ------------------

@@ -237,6 +206,37 @@ into separate files (by default, time is arranged across columns).
 As with datasets and data tables, you can request metadata about the available specification options
 for a specific time series table using the :class:`TimeSeriesTableMetadata <ipumspy.api.metadata.TimeSeriesTableMetadata>` class.
 
+Geographic Extent Selection
+---------------------------
+
+When working with small geographies it can be computationally intensive to work with
+nationwide data. To avoid this problem, you can request data from a specific geographic area
+using the ``geographic_extents`` argument
+
+The following extract requests ACS 5-year sex-by-age counts at the census block group level, but
+only includes block groups that fall within Alabama and Arkansas (identified by their FIPS codes with
+a trailing 0):
+
+.. code:: python
+
+    extract = AggregateDataExtract(
+        collection="nhgis",
+        description="Extent selection example",
+        datasets=[
+            Dataset(name="2018_2022_ACS5a", data_tables=["B01001"], geog_levels=["blck_grp"]),
+            Dataset(name="2017_2021_ACS5a", data_tables=["B01001"], geog_levels=["blck_grp"])
+        ],
+        geographic_extents=["010", "050"]
+    )
+
+.. tip::
+    You can see available extent selection API codes, if any, in the ``geographic_instances`` attribute of
+    a submitted :class:`DatasetMetadata <ipumspy.api.metadata.DatasetMetadata>` or
+    :class:`TimeSeriesTableMetadata <ipumspy.api.metadata.TimeSeriesTableMetadata>` object.
+
+Note that the selected extents are applied to all datasets and time series tables in an extract.
+It is not possible to request different extents for different data sources in a single extract.
+
 Shapefiles
 ----------
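
Reviewer sketch (not part of the commit): the documentation hunk above describes extent codes as state FIPS codes with a trailing 0. The helper below is a minimal illustration of that convention, assuming two-digit state FIPS strings as input. The function name `state_fips_to_extents` is hypothetical and is not an ipumspy API.

```python
from typing import Iterable, List


def state_fips_to_extents(state_fips: Iterable[str]) -> List[str]:
    """Convert two-digit state FIPS codes to extent codes by appending "0".

    Hypothetical helper for illustration only; ipumspy does not provide it.
    """
    extents = []
    for code in state_fips:
        # Reject anything that is not a two-digit numeric string.
        if len(code) != 2 or not code.isdigit():
            raise ValueError(f"expected a two-digit state FIPS code, got {code!r}")
        extents.append(code + "0")
    return extents


# Alabama is FIPS 01 and Arkansas is FIPS 05, matching the docs example.
print(state_fips_to_extents(["01", "05"]))  # ['010', '050']
```

The resulting list has the same shape as the value passed to ``geographic_extents`` in the documented extract. Whether a given code is actually available should still be checked against the ``geographic_instances`` metadata.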

@@ -317,7 +317,7 @@ Many NHGIS supplemental data files can be found under the "Supplemental Data" he
 for all supported supplemental data endpoints and advice on how to convert file URLs found on the website into
 acceptable API request URLs.
 
-Once you've identified a file's location, you can use the``ipumspy`` :py:meth:`.get` method to download it. For
+Once you've identified a file's location, you can use the ipumspy :py:meth:`.get` method to download it. For
 instance, to download a state-level NHGIS crosswalk file, we could use the following:
 
 .. code:: python

src/ipumspy/api/extract.py

Lines changed: 3 additions & 3 deletions
@@ -712,9 +712,9 @@ def __init__(
             shapefiles: list of shapefile names
             description: short description of your extract
             data_format: desired format of the extract data file. One of ``"csv_no_header"``, ``"csv_header"``, or ``"fixed_width"``.
-            geographic_extents: Geographic extents to use for all ``datasets`` in the extract definition (for instance, to
-                to obtain data within a particular state). Use ``['*']`` to select all available extents. Note that
-                not all geographic levels support extent selection.
+            geographic_extents: Geographic extents to use for all ``datasets`` and ``time_series_tables`` in the extract definition (for instance, to
+                to obtain data within a particular state). Note that geographic extent selection is only supported for geographies
+                that nest within states.
             tst_layout: desired data layout for all ``time_series_tables`` in the extract definition.
                 One of ``"time_by_column_layout"`` (default), ``"time_by_row_layout"``, or ``"time_by_file_layout"``.
             breakdown_and_data_type_layout: desired layout of any ``datasets`` that have multiple data types or breakdown values. Either

src/ipumspy/api/metadata.py

Lines changed: 4 additions & 2 deletions
@@ -81,8 +81,8 @@ class DatasetMetadata(IpumsMetadata):
     """
     geographic_instances: Optional[List[Dict]] = field(default=None, init=False)
     """
-    Dictionary containing names and descriptions for all valid geographic extents
-    for the dataset
+    Dictionary containing names and descriptions for the geographic extents available for the
+    dataset, if any
     """
     breakdowns: Optional[List[Dict]] = field(default=None, init=False)
     """
@@ -129,6 +129,8 @@ class TimeSeriesTableMetadata(IpumsMetadata):
     """Dictionary containing information on the available data years for the time series table"""
     geog_levels: Optional[List[Dict]] = field(default=None, init=False)
     """Dictionary containing names and descriptions for the geographic levels available for the time series table"""
+    geographic_instances: Optional[List[Dict]] = field(default=None, init=False)
+    """Dictionary containing names and descriptions for all valid geographic extents available for any year in the time series table"""
 
     def __post_init__(self):
         self._path = f"metadata/time_series_tables/{self.name}"
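
Reviewer sketch (not part of the commit): the new ``geographic_instances`` field is declared with ``field(default=None, init=False)``, so it stays ``None`` until metadata has been retrieved. The stand-alone example below mirrors only that field declaration to show how a caller might read extent codes from it. The class `TimeSeriesTableMetadataSketch`, the function `extent_codes`, the table name `"EXAMPLE_TST"`, and the assumption that each entry is a dict with ``"name"`` and ``"description"`` keys are all illustrative and are not confirmed by the diff.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TimeSeriesTableMetadataSketch:
    """Hypothetical stand-in mirroring only the fields shown in this hunk."""

    name: str
    geog_levels: Optional[List[Dict]] = field(default=None, init=False)
    geographic_instances: Optional[List[Dict]] = field(default=None, init=False)


def extent_codes(metadata: TimeSeriesTableMetadataSketch) -> List[str]:
    """Return the extent codes stored on the metadata object, or [] if none."""
    instances = metadata.geographic_instances or []
    # Assumption: each entry is a dict whose "name" key holds the API code.
    return [entry["name"] for entry in instances]


tst = TimeSeriesTableMetadataSketch(name="EXAMPLE_TST")
print(extent_codes(tst))  # [] because nothing has been populated yet

# Simulate populated metadata; real values would come from the IPUMS API.
tst.geographic_instances = [
    {"name": "010", "description": "Alabama"},
    {"name": "050", "description": "Arkansas"},
]
print(extent_codes(tst))  # ['010', '050']
```

Treating ``None`` as an empty list keeps callers from having to special-case tables that do not support extent selection.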
