Commit 91fccdb

add statistics and remove the old ones
1 parent 9e06931 commit 91fccdb

63 files changed: +1778207 -3082 lines changed

.github/workflows/monthly_data_release.yml

Lines changed: 5 additions & 4 deletions
@@ -13,10 +13,11 @@ jobs:
     steps:
       - uses: actions/checkout@v3
 
-      - name: Setup uv
-        run: |
-          curl -LsSf https://astral.sh/uv/install.sh | sh
-          uv sync --python 3.13
+      - name: Install uv
+        uses: astral-sh/setup-uv@v7
+
+      - name: Install the project
+        run: uv sync --python 3.13
 
       - name: Download raw data from Hugging Face
         env:
Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
+name: Update Statistics
+
+on:
+  schedule:
+    # Runs at 12:00 UTC on the 3rd day of every month
+    - cron: '0 12 3 * *'
+  workflow_dispatch:
+
+jobs:
+  update-statistics:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout Code
+        uses: actions/checkout@v4
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v7
+
+      - name: Install the project
+        run: uv sync --python 3.13
+
+      - name: Download Monthly Processed Data (Last 3 Months)
+        run: |
+          REPO_ID="piebro/deutsche-bahn-data"
+
+          # Calculate the last 3 months
+          MONTH_1=$(date -d "last month" +"%Y-%m")
+          MONTH_2=$(date -d "2 months ago" +"%Y-%m")
+          MONTH_3=$(date -d "3 months ago" +"%Y-%m")
+
+          echo "Downloading monthly processed data for: $MONTH_3, $MONTH_2, $MONTH_1"
+
+          uv run --with huggingface_hub hf download "$REPO_ID" \
+            --repo-type=dataset \
+            --include "monthly_processed_data/data-$MONTH_1.parquet" \
+            --include "monthly_processed_data/data-$MONTH_2.parquet" \
+            --include "monthly_processed_data/data-$MONTH_3.parquet" \
+            --local-dir .
+
+          echo "Download complete!"
+          ls -la monthly_processed_data/
+
+      - name: Run Notebooks and Generate HTML
+        run: uv run python notebooks/src/nb_to_html.py --run
+
+      - name: Commit and Push
+        run: |
+          git config --local user.email "noreply@github.com"
+          git config --local user.name "GitHub Actions Bot"
+          git add stats/ notebooks/
+          git diff --staged --quiet || git commit -m "Monthly statistics update $(date +'%Y-%m')"
+          git push
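
Note on the month calculation in the download step: the workflow derives the three `YYYY-MM` strings with GNU `date -d`. On the scheduled day (the 3rd) this is unambiguous, but the workflow can also be started manually via `workflow_dispatch`, and relative month arithmetic on a late day of the month may roll over into a neighbouring month. The following is a minimal Python sketch of the same calculation using plain month arithmetic. The function names `previous_months` and `include_patterns` are illustrative assumptions and are not part of this commit.

```python
from datetime import date


def previous_months(today: date, count: int = 3) -> list[str]:
    """Return the `count` months before `today` as YYYY-MM strings, newest first."""
    # Index months continuously so that crossing a year boundary is plain arithmetic.
    index = today.year * 12 + (today.month - 1)
    months = []
    for offset in range(1, count + 1):
        year, month0 = divmod(index - offset, 12)
        months.append(f"{year:04d}-{month0 + 1:02d}")
    return months


def include_patterns(today: date) -> list[str]:
    """Build the parquet paths in the shape the workflow passes to --include."""
    return [f"monthly_processed_data/data-{m}.parquet" for m in previous_months(today)]


if __name__ == "__main__":
    # Print the patterns for the current date, one per line.
    for pattern in include_patterns(date.today()):
        print(pattern)
```

Because only the year and month of `today` are used, the result does not depend on the day of the month on which the job runs.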

README.md

Lines changed: 27 additions & 1 deletion
@@ -1,7 +1,9 @@
 # Deutsche Bahn Data
 
 This project saves public historical data from "Deutsche Bahn", the biggest German train company, and makes it [accessible](https://huggingface.co/datasets/piebro/deutsche-bahn-data) for everyone to use.
-It includes train schedules, delays, and cancellations from stations across Germany.
+It includes train schedules, delays, and cancellations from stations across Germany.
+
+There is also a small website that shows some statistics about Deutsche Bahn: [piebro.github.io/deutsche-bahn-data](https://piebro.github.io/deutsche-bahn-data/stats/allgemein.html)
 
 The data can be used to validate the [official statistics](https://www.deutschebahn.com/de/konzern/konzernprofil/zahlen_fakten/puenktlichkeitswerte-6878476) and to create many other statistics.
 
@@ -84,14 +86,38 @@ uv sync --python 3.13
 uv run pre-commit install
 ```
 
+## Generating HTML from Notebooks
+
+```bash
+uv run python notebooks/src/nb_to_html.py                  # Convert all notebooks (no execution)
+uv run python notebooks/src/nb_to_html.py --run            # Run and convert all notebooks
+uv run python notebooks/src/nb_to_html.py --run allgemein  # Run only allgemein, convert all
+```
+
 ## Contributing
 
 Contributions are welcome. Open an issue if you want to report a bug, have an idea, or want to propose a change.
 
+## Related Deutsche Bahn and Open Data Websites
+
+There are a few other projects that look at similar data.
+- [Video](https://www.youtube.com/watch?v=0rb9CfOvojk): BahnMining - Pünktlichkeit ist eine Zier (David Kriesel) [2019]
+- [www.deutschebahn.com](https://www.deutschebahn.com/de/konzern/konzernprofil/zahlen_fakten/puenktlichkeitswerte-6878476#): official statistics from Deutsche Bahn
+- [bahn.expert](https://bahn.expert): look at the departure monitor of train stations in real time
+- [next.bahnvorhersage.de](https://next.bahnvorhersage.de): a tool to calculate the probability that a train connection works using historical data
+- [www.zugfinder.net](https://www.zugfinder.net/de/start): multiple maps of current train positions and statistics for long-distance trains in Germany, Austria, BeNeLux, Denmark, Italy and Slovenia
+- [strecken-info.de](https://strecken-info.de/): a map of the German railroads with current construction sites and disruptions on the routes
+- [openrailwaymap.org](https://openrailwaymap.org/): a worldwide map with railway infrastructure using OpenStreetMap Data
+- [zugspaet.de](https://zugspaet.de): a website where you can enter your train and see how often it was late or on time in the past
+
 ## License
 
 All code in this project is licensed under the MIT License. The [data](https://huggingface.co/datasets/piebro/deutsche-bahn-data) is licensed under [Attribution 4.0 International (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/) by Deutsche Bahn.
 
+## Disclaimer
+
+This website is developed by Piet Brömmel. It has no affiliation with Deutsche Bahn or any other transportation company. This website is my personal project and everything stated here is provided without warranty, but is maintained by me to the best of my ability.
+
 ## Acknowledgments
 
 Data sourced from Deutsche Bahn's public APIs. Special thanks to Deutsche Bahn for providing open access to this data.
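
Note on the `nb_to_html.py` invocations added to the README in this hunk: the source of the script is not shown in this view, so the following is only a hypothetical sketch of how a `--run` flag with an optional list of notebook names could be parsed with `argparse`. The helper names `build_parser` and `notebooks_to_run`, and the notebook name `other`, are assumptions made for illustration and may not match the actual script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the three documented invocations."""
    parser = argparse.ArgumentParser(description="Convert notebooks to HTML (sketch).")
    parser.add_argument(
        "--run",
        nargs="*",       # zero or more notebook names may follow the flag
        default=None,    # None means the flag was not given at all
        metavar="NOTEBOOK",
        help="execute notebooks before converting; with names, execute only those",
    )
    return parser


def notebooks_to_run(args: argparse.Namespace, all_notebooks: list[str]) -> list[str]:
    """Decide which notebooks to execute; conversion always covers all notebooks."""
    if args.run is None:
        return []                      # no --run: convert only
    if not args.run:
        return list(all_notebooks)     # bare --run: execute everything
    return [name for name in all_notebooks if name in args.run]


if __name__ == "__main__":
    names = ["allgemein", "other"]  # "other" is a made-up notebook name
    print(notebooks_to_run(build_parser().parse_args(), names))
```

Using `nargs="*"` with `default=None` is what lets the sketch distinguish "flag absent" from "flag given without names".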

index.html

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 <!DOCTYPE html>
 <html>
 <head>
-<meta http-equiv="refresh" content="0; url=stats/übersicht.html">
+<meta http-equiv="refresh" content="0; url=stats/allgemein.html">
 </head>
 <body>
 </body>
