PollutionMap is a distributed pipeline that processes air quality readings from sensors across multiple cities in parallel, producing a city-level pollution summary.
The air quality dataset is submitted to a central scheduler, which splits it into chunks and distributes them across multiple worker nodes that process them in parallel. This scheduler-worker approach scales horizontally: adding workers reduces processing time roughly in proportion, until queueing and aggregation overhead begins to dominate. For a small dataset the difference is negligible, but for a dataset with millions of sensor readings from thousands of cities, splitting the work across multiple workers can be many times faster than running a single sequential script.
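The chunk-splitting step can be sketched in a few lines of Python. This is an illustrative helper, not PollutionMap's actual code; the function name and the even-slice strategy are assumptions:

```python
def split_into_chunks(readings, num_chunks):
    """Split a list of sensor readings into num_chunks roughly equal
    slices, one per worker. Illustrative sketch only."""
    size, rem = divmod(len(readings), num_chunks)
    chunks, start = [], 0
    for i in range(num_chunks):
        # The first `rem` chunks take one extra reading each.
        end = start + size + (1 if i < rem else 0)
        chunks.append(readings[start:end])
        start = end
    return chunks

readings = [{"city": "Oslo", "pm25": 12}, {"city": "Lima", "pm25": 48},
            {"city": "Pune", "pm25": 61}, {"city": "Oslo", "pm25": 15}]
chunks = split_into_chunks(readings, 3)
# 3 chunks; every reading lands in exactly one chunk
```

Any partitioning that covers each reading exactly once works here; even slices keep worker load balanced when readings are uniform in cost.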
- A client submits a dataset to the scheduler
- The scheduler partitions it into chunks and pushes tasks to a Redis queue
- Multiple workers poll the queue and process their assigned chunks in parallel
- Results are aggregated into a final output by a reduce stage
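The steps above can be simulated in-process with plain Python, using a deque as a stand-in for the Redis queue. Everything here (record shape, chunk size, the per-city mean computed in the reduce) is illustrative, not the project's actual schema:

```python
from collections import defaultdict, deque

# Stand-in for the Redis task queue; the real pipeline pushes task
# payloads to Redis and workers pull them with a blocking pop.
queue = deque()

readings = [
    {"city": "Oslo", "pm25": 12}, {"city": "Oslo", "pm25": 15},
    {"city": "Lima", "pm25": 48}, {"city": "Pune", "pm25": 61},
]

# 1. Scheduler: partition the dataset and enqueue one task per chunk.
for i in range(0, len(readings), 2):
    queue.append(readings[i:i + 2])

# 2. Workers: poll the queue and compute per-city partial sums/counts.
partials = []
while queue:
    chunk = queue.popleft()
    acc = defaultdict(lambda: [0, 0])  # city -> [sum, count]
    for r in chunk:
        acc[r["city"]][0] += r["pm25"]
        acc[r["city"]][1] += 1
    partials.append(acc)

# 3. Reduce: merge the partials into a city-level mean PM2.5 summary.
totals = defaultdict(lambda: [0, 0])
for acc in partials:
    for city, (s, n) in acc.items():
        totals[city][0] += s
        totals[city][1] += n
summary = {city: s / n for city, (s, n) in totals.items()}
# summary == {"Oslo": 13.5, "Lima": 48.0, "Pune": 61.0}
```

Because each worker emits sums and counts rather than means, the reduce stage can merge partial results in any order without losing accuracy.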
- Python + FastAPI — scheduler API
- Redis — task queue and distributed state
- Docker — each worker runs in its own container, making it easy to scale horizontally
- HTTP REST — communication between client, scheduler, and workers
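A minimal `docker-compose.yml` for this layout might look like the following. The service names, build contexts, and port mapping are assumptions inferred from the commands below, not the repository's actual file:

```yaml
services:
  redis:
    image: redis:7
  scheduler:
    build: ./scheduler        # hypothetical build context
    ports:
      - "8000:8000"           # FastAPI scheduler API
    depends_on: [redis]
  worker:
    build: ./worker           # hypothetical build context
    depends_on: [redis]       # scale with: docker compose up --scale worker=3
```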
Prerequisites: Docker must be installed and running.
Start the scheduler and 3 workers:
```bash
docker compose up --scale worker=3
```

Submit a job:

```bash
PYTHONPATH=. python -m client.submit_job dataset/sample_dataset.json 3
```

Check live metrics (active workers, pending and running tasks):

```bash
curl http://localhost:8000/metrics
```