A production-style RESTful API that ingests real-world GTFS transit data from Transport for NSW and exposes it via a secure, role-based, documented backend service.
GTFS: https://gtfs.org/getting-started/what-is-GTFS/
Built with Flask-RESTX, JWT authentication, and SQLite, this system supports transit planners, commuters, and administrators with real-world data ingestion, querying, visualisation, and export capabilities.
- Flask-RESTX API layer (Swagger-first design)
- JWT-based authentication & role-based authorization
- GTFS ingestion pipeline (ZIP → CSV → SQLite)
- Read-optimized relational schema
- Visualisation & export layer (PNG maps, CSV)
- python3.13 -m venv .venv
- source .venv/bin/activate
- Python 3.13
- Flask-RESTX — REST API + Swagger documentation
- SQLite — lightweight runtime persistence
- GTFS Schedule API (Transport for NSW)
- RapidFuzz — fuzzy string matching
- pandas / matplotlib — data processing & visualisation
- pip install -r requirements.txt
- Create your own API key from Transport for NSW Open Data
- transport_api_key.env: TRANSPORT_API_KEY=your_key_here
- Run the API: python z5494973.py
- Swagger UI available at root (`/`)
- Fully interactive
- Includes:
  - request/response schemas
  - role-based access notes
  - error cases
- Always synchronized with implementation
The system bootstraps with three users on first run:
| Username | Password | Role |
|---|---|---|
| admin | admin | Admin |
| planner | planner | Planner |
| commuter | commuter | Commuter |
- Admin
  - Full system access
  - User management: create, delete, activate, and deactivate users
  - GTFS imports
- Planner
  - Import and manage GTFS data
  - Full read/write access to transit data
- Commuter
  - Read-only access to transit data
  - Manage personal favourites and visualisations
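
The role rules above could be enforced with a small decorator. This is a minimal sketch, not the project's actual code: the permission names, the `ROLE_PERMISSIONS` map, the `Forbidden` exception, and `import_dataset` are all hypothetical and only illustrate the Admin / Planner / Commuter split.

```python
from functools import wraps

# Hypothetical permission map inferred from the role descriptions above.
ROLE_PERMISSIONS = {
    "Admin": {"manage_users", "import_gtfs", "write_transit", "read_transit"},
    "Planner": {"import_gtfs", "write_transit", "read_transit"},
    "Commuter": {"read_transit"},
}


class Forbidden(Exception):
    """Raised when the caller's role lacks the required permission."""


def requires(permission):
    """Reject the call unless the user's role grants `permission`."""
    def decorator(func):
        @wraps(func)
        def wrapper(user, *args, **kwargs):
            allowed = ROLE_PERMISSIONS.get(user["role"], set())
            if permission not in allowed:
                raise Forbidden(f"{user['role']} lacks {permission}")
            return func(user, *args, **kwargs)
        return wrapper
    return decorator


@requires("import_gtfs")
def import_dataset(user, dataset_id):
    # Placeholder body; a real handler would trigger the GTFS import.
    return f"{user['username']} imported {dataset_id}"
```

In a Flask-RESTX service the same check would typically wrap a resource method and translate `Forbidden` into a 403 response.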
- All protected endpoints require a JWT token
- Tokens are issued via: POST /auth/login
- Token must be sent in request headers: AUTH-TOKEN: <jwt_token>
- Signed using HS256
- Contains:
  - `username`
  - `role` (Admin | Planner | Commuter)
  - `exp` (1 hour expiry)
- Token validity is checked per request
- User existence and enabled status are re-validated against the database
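
To make the token format concrete, here is a standard-library sketch of issuing and checking an HS256 token with the three claims listed above. It is an illustration under assumptions: the project may well use a JWT library instead, and `SECRET`, `issue_token`, and `verify_token` are hypothetical names. The per-request database re-validation of the user is not shown.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"change-me"  # hypothetical signing key; load from config in practice


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str) -> bytes:
    return hmac.new(SECRET, signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(username, role, now=None, ttl=3600):
    """Build header.payload.signature with a one-hour expiry by default."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"username": username, "role": role, "exp": now + ttl}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input))}"


def verify_token(token, now=None):
    """Return the payload, or raise ValueError if the token is invalid."""
    now = int(time.time()) if now is None else now
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = _sign(f"{header_b64}.{payload_b64}")
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    payload = json.loads(_b64url_decode(payload_b64))
    if payload["exp"] <= now:
        raise ValueError("token expired")
    return payload
```

A client would then send the returned string in the `AUTH-TOKEN` request header on every protected call.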
POST /gtfs/datasets/{dataset_id}
- Imports GTFS Schedule data directly from Transport for NSW
- Supports Sydney Metro bus agencies (`GSBC*`, `SBSC*`)
- Parses `.zip` GTFS datasets into structured SQLite tables
- Existing data is fully replaced on re-import
- Enforces “data must be imported before use” guarantees
`agency`, `routes`, `calendar`, `calendar_dates`, `trips`, `stops`, `stop_times`, `shapes`, `notes`
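
The ZIP → CSV → SQLite flow can be sketched with the standard library alone. This is a simplified illustration, not the project's importer: `import_gtfs_zip` is a hypothetical name, every column is stored without a declared type, and the download from Transport for NSW is assumed to have already produced the ZIP bytes.

```python
import csv
import io
import sqlite3
import zipfile


def import_gtfs_zip(zip_bytes, conn, tables=("agency", "routes", "stops")):
    """Replace each listed table with the rows of its <table>.txt member."""
    counts = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        members = set(archive.namelist())
        with conn:  # commits the inserts when the block exits cleanly
            for table in tables:
                member = f"{table}.txt"
                if member not in members:
                    continue  # optional GTFS files may be absent
                with archive.open(member) as raw:
                    # utf-8-sig strips the byte-order mark some feeds include
                    text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                    reader = csv.reader(text)
                    header = next(reader)
                    rows = list(reader)
                columns = ", ".join(f'"{name}"' for name in header)
                marks = ", ".join("?" for _ in header)
                # Dropping first gives the "fully replaced on re-import" behaviour
                conn.execute(f'DROP TABLE IF EXISTS "{table}"')
                conn.execute(f'CREATE TABLE "{table}" ({columns})')
                conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', rows)
                counts[table] = len(rows)
    return counts
```

The returned dictionary maps each imported table to its row count, which is the kind of figure a dataset status endpoint could report.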
GET /gtfs/datasets
Returns:
- active dataset ID
- agency name
- import timestamp
- per-table row counts
All users can:
- Retrieve routes, trips, and stops by ID
- Browse:
  - routes for an agency
  - trips for a route
  - stops for a trip
- Navigate large datasets using REST-friendly pagination patterns
GET /routes
GET /routes/{route_id}
- Paginated route listing
- Supports large datasets safely
- Deterministic ordering
GET /trips?route_id=...
GET /trips/{trip_id}
- Lists all trips for a route
- Pagination supported
- Validates route existence
GET /stops?trip_id=...
GET /stops/{stop_id}
- Stops are returned in correct sequence order
- Joined with stop metadata (lat/lon, accessibility)
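
One way to return stops in sequence order with their metadata is a join between `stop_times` and `stops`. The sketch below uses the standard GTFS column names; the function name `stops_for_trip` is hypothetical, and the `CAST` assumes sequence values may have been stored as text during CSV import.

```python
import sqlite3


def stops_for_trip(conn, trip_id):
    """Return (sequence, stop_id, name, lat, lon, wheelchair) rows in visit order."""
    query = """
        SELECT st.stop_sequence, s.stop_id, s.stop_name,
               s.stop_lat, s.stop_lon, s.wheelchair_boarding
        FROM stop_times AS st
        JOIN stops AS s ON s.stop_id = st.stop_id
        WHERE st.trip_id = ?
        -- Cast so that text values sort as 2 < 10 rather than '10' < '2'
        ORDER BY CAST(st.stop_sequence AS INTEGER)
    """
    return conn.execute(query, (trip_id,)).fetchall()
```

Without the cast, a text column would sort lexicographically and place stop 10 before stop 2.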
GET /stops/search?query=...
- Case-insensitive text matching
- Partial matching (`"Quay"`, `"circula"`)
- Powered by RapidFuzz for fast, fuzzy search
- Configurable:
  - match threshold
  - result limits
  - trips per stop
- Returns:
  - matching stops
  - associated routes and trips
- Handles zero-result queries gracefully
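
The matching behaviour can be illustrated without extra dependencies. Note that this sketch deliberately swaps in the standard library's `difflib` as a stand-in for RapidFuzz, so scores will differ from the real service; `partial_score`, `search_stops`, and the default threshold and limit are hypothetical.

```python
from difflib import SequenceMatcher


def partial_score(query, text):
    """Best 0-100 similarity between query and any same-length window of text."""
    query, text = query.lower(), text.lower()  # case-insensitive comparison
    if not query or not text:
        return 0.0
    if len(query) >= len(text):
        return 100 * SequenceMatcher(None, query, text).ratio()
    width = len(query)
    return 100 * max(
        SequenceMatcher(None, query, text[start:start + width]).ratio()
        for start in range(len(text) - width + 1)
    )


def search_stops(query, stop_names, threshold=80, limit=5):
    """Return (name, score) pairs at or above threshold, best match first."""
    scored = [(name, partial_score(query, name)) for name in stop_names]
    hits = [item for item in scored if item[1] >= threshold]
    hits.sort(key=lambda item: (-item[1], item[0]))  # ties broken by name
    return hits[:limit]
```

A query with no hit above the threshold returns an empty list, which an endpoint can pass through as an empty result rather than an error.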
Each user may store up to 2 favourite routes.
- Each user can manage their own favourites
- Supports add / update / delete operations
- PUT /favourites/routes/{route_id}
- DELETE /favourites/routes/{route_id}
- POST /favourites/routes
- GET /favourites/routes
- Route data is denormalised
- Favourites survive dataset re-imports
- Enforced limits at API + DB level
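
Enforcing the limit at both layers might look like the following sketch, where an application check gives a clean rejection and a SQLite trigger acts as a backstop. The table layout, trigger name, and `add_favourite` helper are assumptions made for illustration.

```python
import sqlite3

MAX_FAVOURITES = 2

SCHEMA = """
CREATE TABLE favourites (
    username   TEXT NOT NULL,
    route_id   TEXT NOT NULL,
    route_name TEXT,  -- denormalised copy, so it survives dataset re-imports
    PRIMARY KEY (username, route_id)
);
-- Database-level guard: refuse a third favourite for the same user.
CREATE TRIGGER favourites_limit
BEFORE INSERT ON favourites
WHEN (SELECT COUNT(*) FROM favourites WHERE username = NEW.username) >= 2
BEGIN
    SELECT RAISE(ABORT, 'favourite limit reached');
END;
"""


def add_favourite(conn, username, route_id, route_name):
    """API-level guard: return False instead of inserting past the limit."""
    count = conn.execute(
        "SELECT COUNT(*) FROM favourites WHERE username = ?", (username,)
    ).fetchone()[0]
    if count >= MAX_FAVOURITES:
        return False
    with conn:
        conn.execute(
            "INSERT INTO favourites VALUES (?, ?, ?)",
            (username, route_id, route_name),
        )
    return True
```

The trigger matters when something bypasses the API check, for example a concurrent request or a direct database write.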
GET /favourites/routes/maps
- Generates a PNG map (no file writes) for favourite routes
- Distinct colours per route + headsign
- Uses real GTFS `shapes.txt` geometry
- Rendered server-side using Matplotlib
- Inline browser-renderable image
- Includes metadata headers:
  - trip count
  - shape count
  - headsign count
  - route IDs & names
GET /favourites/routes/csv
- Dataset ID
- Route metadata
- Trip headsigns
- Shape geometry (lat/lon sequence)
- Sorted deterministically for downstream analysis
Designed for:
- GIS tools
- Data science pipelines
- External analytics
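
A deterministic export of this kind can be built in memory with the `csv` module. The sketch below assumes a flat row layout and a sort key of route, headsign, shape, and point sequence; the actual column set and ordering used by the endpoint are not stated here, and `favourites_to_csv` is a hypothetical name.

```python
import csv
import io

FIELDS = [
    "dataset_id", "route_id", "route_name", "trip_headsign",
    "shape_id", "shape_pt_sequence", "shape_pt_lat", "shape_pt_lon",
]


def favourites_to_csv(dataset_id, rows):
    """Serialise shape points to CSV text in a stable, repeatable order."""
    ordered = sorted(
        rows,
        key=lambda r: (
            r["route_id"], r["trip_headsign"],
            r["shape_id"], r["shape_pt_sequence"],
        ),
    )
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in ordered:
        # Stamp every row with the dataset it was exported from
        writer.writerow({"dataset_id": dataset_id, **row})
    return buffer.getvalue()
```

Because the order depends only on the data, two exports of the same favourites produce identical files, which keeps diffs and downstream joins predictable.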
- Default page size with hard max cap
- Consistent pagination metadata: page, total, total_pages, has_prev / has_next
- Prevents accidental large scans
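
The pagination rules above can be captured in a small helper. The default of 20 and the cap of 100 are hypothetical values, as are the extra `limit` and `offset` fields, which are included only to show how the metadata maps onto a SQL query. Clamping an out-of-range page is one possible design; returning a 400 or 404 is another.

```python
import math

DEFAULT_PAGE_SIZE = 20   # hypothetical default
MAX_PAGE_SIZE = 100      # hypothetical hard cap


def paginate(total, page=1, page_size=DEFAULT_PAGE_SIZE):
    """Compute pagination metadata plus LIMIT/OFFSET values for a query."""
    page_size = max(1, min(page_size, MAX_PAGE_SIZE))  # enforce the hard cap
    total_pages = max(1, math.ceil(total / page_size))
    page = max(1, min(page, total_pages))              # clamp to a valid page
    return {
        "page": page,
        "total": total,
        "total_pages": total_pages,
        "has_prev": page > 1,
        "has_next": page < total_pages,
        "limit": page_size,
        "offset": (page - 1) * page_size,
    }
```

Capping `page_size` before computing the offset is what prevents a single request from scanning an entire large table.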
This system mirrors real backend systems used in transport platforms, data infrastructure, and public-sector APIs:
- Secure RESTful API design (JWT + RBAC)
- Clean, resource-oriented endpoint design
- Correct HTTP verbs and status codes
- Clear separation of concerns
- Real-world data ingestion pipelines
- Designed to scale to large datasets
- Defensive API engineering (input validation and error handling)
- Non-trivial visualisation
- Production-quality documentation
- Automated test validation of endpoints: z5494973_tests.py
- Future work includes adding rate limiting and a recommendation system for the best time to leave
For demonstration and educational purposes only

