Open
Description
User story / feature request
As a transit ridership data consumer,
I want a single dataset of stop-level ridership data covering multiple transit agencies across California,
So that I can quickly analyze ridership across agencies without having to obtain data from each agency separately.
Acceptance Criteria
A release is produced that contains at a minimum the following files:
- An aggregated stop-level ridership dataset that includes at a minimum a single tabular file with a list of stops and ridership. Each record should have the following columns (matching column names in GTFS-RIDE ridership.txt where possible):
  - A primary key column combining the dataset ID and the stop ID (values must be unique for every row)
  - A column with the dataset ID (cannot be empty)
  - A column for the stop ID (from the ridership dataset)
  - A column for the GTFS stop ID (from the matched GTFS dataset)
  - A column for the stop name
  - A column for the stop latitude
  - A column for the stop longitude
  - A column for average daily boardings (can be empty as long as the average total ridership column is populated)
  - A column for average daily alightings (can be empty as long as the average total ridership column is populated)
  - A column for average total ridership (can be empty as long as both the boardings and alightings columns are populated)
  - A column for the start date (where possible, the date range should include some time in 2025)
  - A column for the end date
- A single file describing the datasets that were collected and joined together, with the following columns:
  - A column with the dataset ID (values must be unique for every row)
  - A column with the name of the dataset (cannot be empty)
  - A column with the name of the organization that provided the dataset (cannot be empty)
  - A column with the year in which the data was collected (collection time at the transit agency, cannot be empty)
  - A column with any relevant notes (such as more information about the time period of data collection or any other nuances worth sharing, can be empty)
  - A column with the URL of the GTFS dataset used for matching and obtaining stop lat/lon (only for ridership datasets without a lat/lon)
  - A column with the date that the GTFS dataset was collected (only for ridership datasets without a lat/lon)
- A README-like document that has documentation about the meaning of each file and the meaning of the values in each column
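
The row-level rules for the ridership file could be checked before a release with something like the sketch below. It is only an illustration: the column names (`record_id`, `dataset_id`, `avg_daily_boardings`, `avg_daily_alightings`, `avg_total_ridership`) are placeholders assumed for this example, not names fixed by this issue or taken from GTFS-RIDE, and the function name is hypothetical.

```python
def _is_blank(value):
    """Treat None and whitespace-only strings as empty cells."""
    return value is None or str(value).strip() == ""


def validate_ridership_rows(rows):
    """Return a list of human-readable problems found in the ridership rows.

    `rows` is an iterable of dicts, e.g. as produced by csv.DictReader.
    Column names are assumed placeholders, not part of the issue.
    """
    errors = []
    seen_keys = set()
    for i, row in enumerate(rows, start=1):
        # The primary key must be present and unique for every row.
        key = row.get("record_id")
        if _is_blank(key):
            errors.append(f"row {i}: record_id is empty")
        elif key in seen_keys:
            errors.append(f"row {i}: duplicate record_id {key!r}")
        else:
            seen_keys.add(key)

        # The dataset ID cannot be empty.
        if _is_blank(row.get("dataset_id")):
            errors.append(f"row {i}: dataset_id is empty")

        # Either the total is populated, or both boardings and alightings are.
        has_boardings = not _is_blank(row.get("avg_daily_boardings"))
        has_alightings = not _is_blank(row.get("avg_daily_alightings"))
        has_total = not _is_blank(row.get("avg_total_ridership"))
        if not has_total and not (has_boardings and has_alightings):
            errors.append(
                f"row {i}: needs avg_total_ridership, or both "
                "avg_daily_boardings and avg_daily_alightings"
            )
    return errors
```

An empty return value means every row passed these three checks; the other columns (names, coordinates, dates) are not validated here.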
Notes
We do not need to aggregate ridership data at transit stops that are at the same physical location but in different ridership datasets.
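
One way to read this note, sketched below under assumptions: datasets are simply concatenated, and the composite primary key keeps co-located stops from different datasets as separate rows. The function name, the `:` separator in the key, the column names, and the sample dataset IDs and values are all made up for illustration.

```python
def combine_datasets(datasets):
    """Concatenate per-dataset stop rows without merging co-located stops.

    `datasets` maps a dataset ID to a list of row dicts, each with a
    `stop_id` key. Returns a new list of rows with `dataset_id` and a
    composite `record_id` added. The ":" separator is an assumption.
    """
    combined = []
    for dataset_id, rows in datasets.items():
        for row in rows:
            record = dict(row)  # copy so the input rows are left untouched
            record["dataset_id"] = dataset_id
            record["record_id"] = f"{dataset_id}:{row['stop_id']}"
            combined.append(record)
    return combined


# Illustrative input: two hypothetical datasets reporting a stop at the
# same coordinates. The values are invented for this example.
example = {
    "agency_a": [{"stop_id": "100", "stop_lat": 34.05, "stop_lon": -118.25}],
    "agency_b": [{"stop_id": "7", "stop_lat": 34.05, "stop_lon": -118.25}],
}
combined_rows = combine_datasets(example)
```

Both stops survive as their own records, so no spatial matching or summing across datasets is needed to satisfy the acceptance criteria.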