Academic Project | Principles of Big Data Management Course
An automated data pipeline for analyzing Kansas City 311 service requests
This project implements an end-to-end data engineering pipeline that extracts, transforms, and loads (ETL) civic service request data from Kansas City's Open Data Portal into a cloud data warehouse for analysis and visualization.
Key Objectives:
- Demonstrate proficiency in cloud-based data engineering
- Automate data ingestion using CI/CD practices
- Enable real-time municipal service analytics
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                     KC 311 DATA PIPELINE ARCHITECTURE                       │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   ┌──────────────┐      ┌──────────────┐      ┌──────────────┐              │
│   │   KC Open    │      │   Python/    │      │    Google    │              │
│   │   Data API   │ ───▶ │   Pandas     │ ───▶ │   BigQuery   │              │
│   │   (Socrata)  │      │   ETL        │      │    (DWH)     │              │
│   └──────────────┘      └──────────────┘      └──────────────┘              │
│                               │                      │                      │
│                               │                      │                      │
│                               ▼                      ▼                      │
│                        ┌──────────────┐      ┌──────────────┐               │
│                        │   GitHub     │      │   Tableau    │               │
│                        │   Actions    │      │   Public     │               │
│                        │  (Scheduler) │      │    (Viz)     │               │
│                        └──────────────┘      └──────────────┘               │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```
| Feature | Description |
|---|---|
| 🔄 Automated Ingestion | GitHub Actions runs the pipeline on a configurable schedule (weekly by default) |
| 🧹 Data Quality | Automated deduplication prevents duplicate records |
| ☁️ Cloud-Native | Leverages Google BigQuery for scalable analytics |
| 📊 Visual Analytics | Interactive Tableau dashboard for insights |
| 🔐 Secure Credentials | GCP service account keys managed via GitHub Secrets |
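The deduplication step is not spelled out here; a minimal pandas sketch, assuming requests are uniquely keyed by `issue_id` and the most recently ingested record should win (the function name is illustrative, not the project's actual code):

```python
import pandas as pd

def deduplicate(df: pd.DataFrame) -> pd.DataFrame:
    """Keep one row per issue_id, preferring the most recent ingest."""
    return (
        df.sort_values("ingest_timestamp")           # oldest first
          .drop_duplicates("issue_id", keep="last")  # newest record wins
          .reset_index(drop=True)
    )
```

In practice this runs after fetching from the API and before loading to BigQuery, so repeated pipeline runs never insert the same request twice.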
The interactive Tableau dashboard provides two views:
🔗 View Live Dashboard on Tableau Public
| Layer | Technology |
|---|---|
| Data Source | Kansas City Open Data (Socrata API) |
| Processing | Python 3.9+, Pandas |
| Data Warehouse | Google BigQuery |
| Orchestration | GitHub Actions |
| Visualization | Tableau Public |
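A `requirements.txt` matching this stack might look like the following; the package choices and version pins here are assumptions, so check the actual file in the repo:

```text
pandas>=1.5
pandas-gbq>=0.19
python-dotenv>=1.0
requests>=2.28
```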
- Python 3.9 or higher
- Google Cloud Platform account with BigQuery enabled
- GCP Service Account with BigQuery permissions
1. **Clone the repository**

   ```bash
   git clone https://github.com/YOUR_USERNAME/311-KC-Dashboard.git
   cd 311-KC-Dashboard
   ```

2. **Create a virtual environment**

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. **Install dependencies**

   ```bash
   pip install -r requirements.txt
   ```

4. **Configure environment variables**

   ```bash
   cp .env.example .env
   ```

   Edit `.env` with your GCP project details.

5. **Add your GCP credentials**

   - Download your service account JSON key from GCP Console
   - Save it as `gbq_key.json` in the project root
   - ⚠️ Never commit this file (it's in `.gitignore`)
Run the pipeline:

```bash
cd src
python pipeline.py
```

Project structure:

```
311-KC-Dashboard/
├── README.md               # This file
├── requirements.txt        # Python dependencies
├── .env.example            # Environment variable template
├── .gitignore              # Git ignore rules
│
├── src/                    # Source code
│   ├── __init__.py
│   ├── config.py           # Configuration loader
│   └── pipeline.py         # Main ETL script
│
├── docs/                   # Documentation & visuals
│   ├── architecture.png    # Architecture diagram
│   └── dashboard_preview.png
│
└── .github/
    └── workflows/
        └── hourly_scheduler.yml   # GitHub Actions workflow
```
The pipeline runs automatically via GitHub Actions. To configure:
1. **Add Repository Secret**

   - Go to: Settings → Secrets → Actions
   - Add a secret named `GCP_SA_KEY` containing your service account JSON

2. **Adjust Schedule (optional)**

   - Edit `.github/workflows/hourly_scheduler.yml`
   - Modify the cron expression as needed
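For reference, a scheduled trigger in a GitHub Actions workflow takes this shape; the cron value below is illustrative (weekly, Monday 06:00 UTC), not necessarily the project's actual setting:

```yaml
on:
  schedule:
    - cron: "0 6 * * 1"   # min hour day-of-month month day-of-week (UTC)
  workflow_dispatch:       # also allow manual runs from the Actions tab
```

Note that GitHub Actions cron schedules always run in UTC.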
| Column | Type | Description |
|---|---|---|
| `issue_id` | STRING | Unique identifier for each request |
| `current_status` | STRING | Status (open, closed, etc.) |
| `category` | STRING | Service category |
| `open_date_time_ymd` | DATE | Request creation date |
| `resolved_date_ymd` | DATE | Resolution date (if applicable) |
| `ingest_timestamp` | INTEGER | Pipeline run timestamp |
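The transform step that shapes raw API records into this schema could be sketched as below; the function name is hypothetical, and the raw field names are assumed to match the schema's column names (the Socrata dataset may use different ones):

```python
import time

# Columns of the BigQuery table described above.
SCHEMA_COLUMNS = [
    "issue_id", "current_status", "category",
    "open_date_time_ymd", "resolved_date_ymd", "ingest_timestamp",
]

def to_schema_rows(raw_records, run_ts=None):
    """Normalize raw records into schema rows, dropping duplicate issue_ids."""
    run_ts = int(run_ts if run_ts is not None else time.time())
    seen, rows = set(), []
    for rec in raw_records:
        issue_id = rec.get("issue_id")
        if not issue_id or issue_id in seen:
            continue  # skip records without an id, and duplicates
        seen.add(issue_id)
        rows.append({
            "issue_id": issue_id,
            "current_status": rec.get("current_status"),
            "category": rec.get("category"),
            "open_date_time_ymd": rec.get("open_date_time_ymd"),
            "resolved_date_ymd": rec.get("resolved_date_ymd"),
            "ingest_timestamp": run_ts,  # INTEGER: epoch seconds of this run
        })
    return rows
```

Stamping every row with the same `ingest_timestamp` makes it easy to audit which pipeline run produced which records.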
This project is for educational purposes as part of academic coursework.


