This repository was archived by the owner on Aug 18, 2025. It is now read-only.
Replies: 2 comments
-
More catalogues for feeds. So far I haven't seen many examples of archiving GTFS-RT data.
-
Some people have used GitHub + GitHub Actions to scrape feeds and store the data as a git repository.
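The scrape-into-a-git-repo approach described above can be sketched as a small script run on a schedule (e.g. by a GitHub Actions cron trigger). This is only an illustrative sketch: the feed URL and the date-partitioned directory layout are assumptions, not something specified in this discussion.

```python
# Hypothetical GTFS-RT snapshot scraper, intended to run on a schedule
# (e.g. a GitHub Actions cron job) with a `git commit` step afterwards.
import datetime
import pathlib
import urllib.request

# Illustrative placeholder; substitute a real GTFS-RT endpoint.
FEED_URL = "https://example.com/gtfs-rt/vehicle-positions.pb"


def snapshot_path(base: pathlib.Path, now: datetime.datetime) -> pathlib.Path:
    """Date-partition snapshots (YYYY/MM/DD/HHMMSS.pb) so git history stays browsable."""
    return base / now.strftime("%Y/%m/%d") / f"{now:%H%M%S}.pb"


def fetch_snapshot(base: pathlib.Path) -> pathlib.Path:
    """Download the current feed message and write it under a timestamped path."""
    now = datetime.datetime.now(datetime.timezone.utc)
    path = snapshot_path(base, now)
    path.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(FEED_URL) as resp:
        path.write_bytes(resp.read())
    return path
```

A workflow would then commit the new file, so the repository itself becomes the archive and GitHub absorbs the storage and bandwidth.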
-
As part of my research for this project, I started looking at prior work in the GTFS transit data space. First, the GTFS documentation has a great list of resources to start from.
One resource stuck out to me: Transitland, an open-data platform built on thousands of public-transit data feeds from around the world. It is the largest and most feature-rich aggregator of GTFS, GTFS Realtime, and GBFS data feeds, and it is operated by Interline. They already offer free GTFS archive downloads for hobbyists and academics here. Here's an example feed with historical archives. It's unclear whether they archive only static GTFS data or GTFS-RT data as well: the MTA realtime bus feed, for example, does not list any archived data. Perhaps they do archive it but make downloads available only to paid users. In any case, simply maintaining a catalogue of GTFS feeds from around the world is useful on its own. They also have a git repository of tools for handling GTFS and GTFS-RT data.
One challenge they highlight in offering free transit archive downloads is the cost of bandwidth. They write: "Some users are taking advantage of this openness by scraping the entire contents of Transitland's feed version archive. These mass downloads put load on Transitland servers and bandwidth, increasing our operating expenses." For this project, one way to reduce hosting costs would be to publish the datasets for free on machine-learning dataset sites like Kaggle or Hugging Face, with an update once a month. That would limit the cost of maintaining the archive to our own storage, without paying bandwidth for others to download the data.
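The monthly-publishing idea above needs a packaging step before uploading to a dataset host. A minimal sketch, assuming the date-partitioned snapshot layout and the `.pb` file extension are as described (both illustrative choices, not specified in this discussion):

```python
# Bundle one month of GTFS-RT snapshots into a single zip for upload
# to a dataset host such as Kaggle or Hugging Face.
import pathlib
import zipfile


def pack_month(snapshot_dir: pathlib.Path, out_zip: pathlib.Path) -> int:
    """Collect every .pb snapshot under snapshot_dir into out_zip.

    Archive paths are kept relative to snapshot_dir so the day/time
    directory structure survives inside the zip. Returns the file count.
    """
    count = 0
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(snapshot_dir.rglob("*.pb")):
            zf.write(path, path.relative_to(snapshot_dir).as_posix())
            count += 1
    return count
```

The upload itself would use whichever API the chosen host provides; the point is that the host, not us, then serves the download bandwidth.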