About Us
The Department of City Planning (DCP) is New York City’s primary land use agency and is instrumental in designing the City’s physical and socioeconomic framework. DCP’s ambition is to make all of New York a better place to live, to maintain what works, and improve what doesn’t.
Data Engineering (DE) is a team within DCP’s Geographic Data and Engineering (GDE) group. We design, produce, and publish data using open-source technology and cloud infrastructure.
People need good data to plan for the future of New York City. As the Data Engineering team at the Department of City Planning, we create and publish data to inform decisions and public policy, both within and beyond the agency.
We think that we spend more time than we should on routine data operations, that people would like their data work to be modernized and more centralized, and that high-impact planning work is hindered by the current state of data complexity, usability, and access.
We want to minimize the time and effort spent on data updates, empower people to do advanced data analytics, and facilitate agency and citywide initiatives.
We plan to simplify, modernize, and expand DCP’s data operations and analytics.
These are some of the principles most important to our team. They describe how we work and what we make.
- We have fun.
- We help each other.
- We are ambitious.
- We are a part of urban planning.
- Good data is useful.
- Good data is understandable.
- Good data is reproducible.
- Good code is readable.
- Good code is easy to change.
- Good code is as little code as possible.
Before the Data Engineering team was created, various complex datasets were produced, published, and maintained by different technologists, planners, and analysts throughout the agency. In 2018, it was argued and decided that a team was needed specifically for maintaining infrastructure around transforming data, and Data Engineering at DCP began.
The data we produce are used by planners and civic technologists alike in analyses, which help inform decisions that ultimately shape NYC. Our data products also power downstream applications, such as NYC’s Zoning and Land Use Map, used by planners, policy makers, and the public. Therefore, it’s imperative that we publish data and supporting documentation of the highest quality. At this point, we produce 11 primary datasets (or rather data products), and the production, QA, and distribution of these products are our primary focus and responsibility.
While producing data is our primary responsibility, it breaks down into several more discrete tasks. On the technical side, we are responsible for:
- Maintenance/creation of infrastructure
  - for long-term storage of data, both produced by DE and ingested from external sources
  - for transformation of data
  - for QA of data, codifying knowledge of domain experts and making it easier to explore datasets for irregularities
  - for packaging and distribution of data products
- Operations
  - building of data products
  - running internal QA/facilitating QA by stakeholders
  - packaging and distributing data products
At a slightly higher level, our mission breaks down into four categories:
- Create and release high quality public datasets about NYC
- Build highly transparent and automated data pipelines using open source technologies
- Offer more than just data, but also comprehensive documentation and metadata
- Bring people together across teams and agencies to share data and to learn from each other