A hybrid cloud sports analytics platform built on a microservices architecture, deployed to a self-hosted Kubernetes cluster with Azure managed services.
The platform aggregates and analyzes sports data: it ingests data from multiple external sources (ESPN, CBS Sports, Yahoo!, sportsData.io), transforms it into canonical domain models, and exposes it through a unified API. It currently focuses on NCAA Football, with planned expansion to NFL, MLB, and PGA. ESPN is the only active data provider; the CBS, Yahoo!, and sportsData.io integrations are planned.
- Multi-source data aggregation - Unified interface for ESPN (active), CBS, Yahoo!, and sportsData.io (planned) via a provider/producer pipeline
- Event-driven architecture - RabbitMQ (local) / Azure Service Bus (production) messaging for decoupled domain communication
- Hybrid cloud deployment - Self-hosted Kubernetes on bare metal, leveraging Azure managed services for specific workloads
- Domain-driven design - Clear bounded contexts with dedicated HTTP clients per domain
- Microservices - Independent services with per-service databases, Dockerfiles, and deployment manifests
- Cost-optimized architecture - Pragmatic mix of on-premises compute and cloud-managed services
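To illustrate the decoupling that the event-driven bullet describes, here is a minimal sketch using an in-memory bus. It is not the platform's messaging code: the real system uses RabbitMQ or Azure Service Bus, and the `InMemoryBus` class, the `contest.finalized` topic, and the `ContestFinalized` event shape are all hypothetical names chosen for this example.

```typescript
// Minimal in-memory stand-in for a message bus. The platform itself uses
// RabbitMQ or Azure Service Bus; this class is purely illustrative.
type Handler<T> = (event: T) => void;

class InMemoryBus {
  private handlers: Record<string, Handler<unknown>[]> = {};

  // Register a handler for a topic.
  subscribe<T>(topic: string, handler: Handler<T>): void {
    const list = this.handlers[topic] ?? [];
    list.push(handler as Handler<unknown>);
    this.handlers[topic] = list;
  }

  // Deliver an event to every handler registered for the topic.
  publish<T>(topic: string, event: T): void {
    for (const handler of this.handlers[topic] ?? []) {
      handler(event);
    }
  }
}

// Hypothetical event shape; not taken from the repository.
interface ContestFinalized {
  contestId: string;
  homeScore: number;
  awayScore: number;
}

const bus = new InMemoryBus();
const notifications: string[] = [];

// The notification side only knows the event contract, not the publisher.
bus.subscribe<ContestFinalized>("contest.finalized", (e) => {
  notifications.push(`Contest ${e.contestId} final: ${e.homeScore}-${e.awayScore}`);
});

// The contest side publishes without knowing who is listening.
bus.publish<ContestFinalized>("contest.finalized", {
  contestId: "c-1",
  homeScore: 27,
  awayScore: 24,
});

console.log(notifications);
```

The point of the sketch is that publisher and subscriber share only a topic name and an event contract, so either side can change or be redeployed without touching the other.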
- Backend: .NET 10, ASP.NET Core, Entity Framework Core
- Infrastructure: Kubernetes (on-prem), Docker, Azure (Service Bus, Cosmos DB, App Configuration, Key Vault, Static Web Apps, DevOps)
- Frontend: React (sd-ui), Expo/React Native (sd-mobile), TypeScript
- Auth: Firebase Authentication
- CI/CD: Azure Pipelines, GitHub Actions, GitOps
- Monitoring: OpenTelemetry, Seq, Prometheus
- Data: PostgreSQL, Redis, Cosmos DB, Azure Blob Storage
Each domain service runs independently with its own database and API. Cross-service communication uses dedicated HTTP clients (ContestClient, FranchiseClient, VenueClient, etc.) defined in the shared Core library.
Design Principles:
- Clear domain boundaries - Each domain has its own client interface and can be developed independently
- Configuration-driven routing - Client base URLs are configurable per environment
- Development velocity - Shared Core library reduces boilerplate while maintaining service independence
- Pragmatic evolution - Services scale independently based on actual needs, not speculation
All domain clients currently point to the Producer API. ContestClient, FranchiseClient, and VenueClient are separate HTTP clients with distinct interfaces, but configuration resolves each of them to the Producer service's API endpoints.
Migration path:
- Each client has its own configuration key (e.g., ContestClientConfig:ApiUrl, FranchiseClientConfig:ApiUrl)
- Currently all configs point to the same Producer API URL
- When a domain needs independent scaling:
  - Deploy that domain as a separate service
  - Update the configuration to point to the new service URL
- No code changes required - the abstraction boundary already exists
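The migration path above can be sketched in a few lines. This is an illustration in TypeScript rather than the project's .NET code: the configuration keys follow the pattern named in the text, while the `DomainClient` base class, the `urlFor` helper, and the `.internal` URLs are assumptions made up for the example.

```typescript
// Configuration keys follow the pattern named in the text; the URLs are made up.
type Config = Record<string, string>;

const today: Config = {
  "ContestClientConfig:ApiUrl": "http://producer.internal/api",
  "FranchiseClientConfig:ApiUrl": "http://producer.internal/api",
};

// After extracting Contest into its own service, only its key changes.
const afterSplit: Config = {
  ...today,
  "ContestClientConfig:ApiUrl": "http://contest.internal/api",
};

// Hypothetical base class: each domain client reads its own key.
class DomainClient {
  readonly baseUrl: string;

  constructor(config: Config, configKey: string) {
    const url = config[configKey];
    if (!url) {
      throw new Error(`Missing configuration key: ${configKey}`);
    }
    this.baseUrl = url;
  }

  // Builds the request URL; a real client would issue the HTTP call here.
  urlFor(path: string): string {
    return `${this.baseUrl}/${path.replace(/^\/+/, "")}`;
  }
}

class ContestClient extends DomainClient {
  constructor(config: Config) {
    super(config, "ContestClientConfig:ApiUrl");
  }
}

class FranchiseClient extends DomainClient {
  constructor(config: Config) {
    super(config, "FranchiseClientConfig:ApiUrl");
  }
}

console.log(new ContestClient(today).urlFor("contests/42"));
console.log(new ContestClient(afterSplit).urlFor("contests/42"));
console.log(new FranchiseClient(afterSplit).urlFor("franchises/7"));
```

Callers depend on `ContestClient` and `FranchiseClient` only, so moving Contest to its own service changes one configuration value and leaves calling code untouched.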
External APIs (ESPN*) → Provider Service → Azure Blob Storage (raw JSON)
↓
Producer Service → Canonical Domain Models → PostgreSQL
↓
Message Bus (RabbitMQ / Azure Service Bus)
↓
Domain Services (Contest, Franchise, Venue, etc.)
↓
API Gateway → Clients (Web, Mobile)
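As a rough sketch of the Producer step in this flow, the following shows raw provider JSON being mapped to a canonical model. The README does not show the real ESPN payload or the real domain types, so `RawVenueDocument`, `CanonicalVenue`, `toCanonicalVenue`, and the sample document are all hypothetical.

```typescript
// Hypothetical raw payload shape; the real ESPN JSON is not shown in this README.
interface RawVenueDocument {
  id: string;
  fullName: string;
  address?: { city?: string; state?: string };
  capacity?: number;
}

// Hypothetical canonical model owned by the platform.
interface CanonicalVenue {
  sourceProvider: string;
  sourceId: string;
  name: string;
  city: string | null;
  state: string | null;
  capacity: number | null;
}

// Producer step: parse the stored raw JSON and map it to the canonical shape.
function toCanonicalVenue(rawJson: string, provider: string): CanonicalVenue {
  const raw = JSON.parse(rawJson) as RawVenueDocument;
  if (!raw.id || !raw.fullName) {
    throw new Error("Raw venue document is missing id or fullName");
  }
  return {
    sourceProvider: provider,
    sourceId: raw.id,
    name: raw.fullName.trim(),
    // Missing optional fields become explicit nulls in the canonical model.
    city: raw.address?.city ?? null,
    state: raw.address?.state ?? null,
    capacity: raw.capacity ?? null,
  };
}

// Invented sample document standing in for a blob read from storage.
const stored = '{"id":"v-100","fullName":" Example Stadium ","address":{"city":"Springfield"}}';
const venue = toCanonicalVenue(stored, "ESPN");
console.log(venue);
```

Keeping the provider name and source ID on the canonical record is one way to let additional providers map into the same model later; whether the project does this is not stated in the README.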
| Project/Service | Purpose |
|---|---|
| core | Shared services, components, and middleware consumed by all services |
| api | API Gateway - unified entry point for all client applications, with Firebase auth middleware. |
| contest | Games, scores, and statistics. Domain boundary via ContestClient. |
| franchise | Teams, rosters, and metadata. Domain boundary via FranchiseClient. |
| notification | User notifications and alerts. |
| player | Athlete profiles and statistics. |
| producer | Transforms external JSON into canonical domain objects. Publishes integration events via MassTransit. |
| provider | Ingests data from external sources (ESPN active; CBS, Yahoo!, sportsData.io planned). Stores raw JSON in Azure Blob Storage. Schedules sourcing runs via Hangfire. |
| season | Season schedules and calendars. |
| venue | Stadium and location data. Domain boundary via VenueClient. |
| jobs-dashboard | Hangfire dashboard for monitoring background job processing. |
| processor-gen | Source generator for DocumentProcessorBase implementations. |
| sd-ui | React web frontend (TypeScript). |
| sd-mobile | Expo/React Native mobile app (TypeScript). |
| Repository | Purpose |
|---|---|
| sports-data-core | This repository - application source code |
| sports-data-config | Kubernetes cluster definitions & GitOps configuration |
| sports-data-provision | Infrastructure as Code - Azure resource definitions |
High-Level Architecture
flowchart TD
PV[Provider]
BLOB[(Blob Storage)]
PV --> BLOB
PD[Producer]
PD <--> BLOB
MSG[RabbitMQ / Azure Service Bus]
N[Notification]
C[Contest]
S[Season]
V[Venue]
PL[Player]
FR[Franchise]
API[API Gateway]
AUTH[Firebase Auth]
API <--> AUTH
WCWEB[Web App]
WCMOB[Mobile App]
PV <--> ESPN[ESPN]
PV -.-> CBS[CBS]
PV -.-> YAHOO[Yahoo!]
PV -.-> SDIO[sportsData.io]
PV <--> MSG
PD <--> MSG
N <--> MSG
C <--> MSG
S <--> MSG
V <--> MSG
PL <--> MSG
FR <--> MSG
API --> C
API --> S
API --> V
API --> PL
API --> FR
API --> N
WCWEB --> API
WCMOB --> API
Detailed Service Architecture
flowchart BT
subgraph Provider
PV[svc]
PVDB[(DB)]
PV-->PVDB
PVAPI[API]
PVAPI-->PVDB
end
BLOB[(Blob Storage)]
PV --> BLOB
subgraph Producer
PD[svc]
PDDB[(DB)]
PDAPI[API]
PD-->PDDB
PDAPI-->PDDB
end
PD <--> BLOB
M[RabbitMQ / Azure Service Bus]
subgraph Notification
N[svc]
NDB[(DB)]
NAPI[API]
end
subgraph Contest
C[svc]
CDB[(DB)]
CAPI[API]
end
subgraph Season
S[svc]
SDB[(DB)]
SAPI[API]
end
subgraph Venue
V[svc]
VDB[(DB)]
VAPI[API]
end
subgraph Player
PL[svc]
PLDB[(DB)]
PLAPI[API]
end
subgraph Franchise
FR[svc]
FRDB[(DB)]
FRAPI[API]
end
API[API Gateway]
AUTH[Firebase Auth]
API --> AUTH
WCWEB[Web App]
WCMOB[Mobile App]
WCAPP[Code]
Provider --> ESPN[ESPN]
Provider -.-> CBS[CBS]
Provider -.-> YAHOO[Yahoo!]
Provider -.-> SDIO[sportsData.io]
PV-->M
PD-->M
N-->M
C-->M
S-->M
V-->M
PL-->M
FR-->M
API-->Provider
API-->Contest
API-->Season
API-->Venue
API-->Player
API-->Franchise
API-->Notification
WCWEB-->API
WCMOB-->API
WCAPP-->API
Architecture:
- Production - Self-hosted Kubernetes cluster on bare metal
- 4-node cluster, each node: AMD Ryzen 5 7640HS (6-core/12-thread, up to 5.0GHz boost), ~31.5GB usable RAM (BIOS tuned, iGPU allocation minimized), 1TB NVMe PCIe 4.0 SSD
- Cluster totals: 24 cores, 48 threads, ~126GB RAM, 4TB NVMe storage
- Dedicated PostgreSQL node: Same specs, isolated for database workload
- Dual 2.5GbE networking per node
- Azure Managed Services - Service Bus, Cosmos DB, App Configuration, Key Vault, Static Web Apps
- Hybrid approach - On-premises compute, Azure for managed services (born from Azure credit constraints, evolved into a pragmatic architecture)
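The cluster totals listed above follow directly from the per-node figures multiplied by four nodes, as this small check shows. Because the totals match exactly four nodes, the dedicated PostgreSQL node appears not to be counted in them; that reading is an inference from the arithmetic, not something the README states.

```typescript
// Per-node figures as listed above.
const node = { cores: 6, threads: 12, ramGb: 31.5, nvmeTb: 1 };
const nodeCount = 4;

// Multiply each per-node figure by the node count.
const totals = {
  cores: node.cores * nodeCount,
  threads: node.threads * nodeCount,
  ramGb: node.ramGb * nodeCount,
  nvmeTb: node.nvmeTb * nodeCount,
};

console.log(totals); // { cores: 24, threads: 48, ramGb: 126, nvmeTb: 4 }
```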
CI/CD Pipeline:
- GitHub Actions - PR validation, automated testing (mobile CI, code review)
- Azure Pipelines - Build, containerization, deployment
- GitOps - Kubernetes manifests managed in sports-data-config
Configuration:
- Azure App Configuration - centralized configuration management
- Azure Key Vault - secrets and sensitive configuration
- Environment-specific overrides via Azure DevOps pipelines
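One way to picture how these three sources could combine is a layered merge in which later layers win. This is only a sketch: the README does not document the actual precedence, so the ordering below is an assumption, and the `mergeLayers` function and every key and value shown are invented for illustration.

```typescript
type Settings = Record<string, string>;

// Later layers win. The ordering is an assumption, not documented behavior.
function mergeLayers(...layers: Settings[]): Settings {
  const merged: Settings = {};
  for (const layer of layers) {
    for (const key of Object.keys(layer)) {
      merged[key] = layer[key];
    }
  }
  return merged;
}

// All keys and values below are invented for illustration.
const appConfiguration: Settings = {
  "ContestClientConfig:ApiUrl": "http://producer.internal/api",
  "Logging:Level": "Information",
};
const keyVaultSecrets: Settings = {
  "ConnectionStrings:Postgres": "<secret from Key Vault>",
};
const pipelineOverrides: Settings = {
  "Logging:Level": "Debug",
};

const effective = mergeLayers(appConfiguration, keyVaultSecrets, pipelineOverrides);
console.log(effective["Logging:Level"]); // "Debug"
```

Under this assumed ordering, shared defaults live in one place, secrets stay out of plain configuration, and an environment only needs to declare the values that differ.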
This project is licensed under the GNU General Public License v3.0 (GPL-3.0).
Any derivative works must also be open source and distributed under the same GPL-3.0 license. This ensures the community benefits from improvements and contributions.
Note: This is an active development project. Architecture and implementation details are subject to change as requirements evolve.