🎧 Spotify Data Pipeline using AWS & Snowflake

This project builds a scalable and automated ETL pipeline that extracts playlist data from the Spotify API, transforms it using AWS Glue (Apache Spark), and loads it into Snowflake using Snowpipe. The data is finally visualized using Power BI to generate meaningful insights.

📌 Project Overview

Objective: Automate the extraction, transformation, and loading of Spotify playlist data.
Architecture: Event-driven, serverless, and cloud-native.
Outcome: A production-ready data pipeline that supports near-real-time ingestion (the extract runs every 5 minutes) and dashboarding.

🚀 Tech Stack

| Component | Tool/Service |
| --- | --- |
| Data Source | Spotify Web API |
| Extraction | AWS Lambda + CloudWatch |
| Storage | Amazon S3 |
| Transformation | AWS Glue (Apache Spark) |
| Data Loading | Snowpipe (Snowflake) |
| Dashboard | Power BI |

🔄 Workflow

1. Extract

A Lambda function fetches playlist data from the Spotify API.
The job is scheduled using CloudWatch Events to run every 5 minutes.
Raw JSON data is saved to an S3 folder: s3://<bucket>/raw_data/to_processed/.
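
The extract step can be sketched roughly as below. This is a minimal illustration, not the repository's `spotify_api_data_extract.py`: the environment variable names (`PLAYLIST_ID`, `SPOTIFY_TOKEN`, `RAW_BUCKET`), the object-key naming scheme, and the helper names are all assumptions, and a real deployment would obtain the access token through Spotify's OAuth flow rather than reading it from an environment variable.

```python
import json
import os
import urllib.request
from datetime import datetime, timezone
from typing import Optional

RAW_PREFIX = "raw_data/to_processed/"    # raw folder named in the Extract step
SCHEDULE_EXPRESSION = "rate(5 minutes)"  # schedule expression for a 5-minute rule


def build_raw_key(now: datetime, prefix: str = RAW_PREFIX) -> str:
    """Build a timestamped key so each scheduled run writes a new object (assumed naming)."""
    return f"{prefix}spotify_raw_{now.strftime('%Y%m%dT%H%M%SZ')}.json"


def fetch_playlist(playlist_id: str, token: str) -> dict:
    """GET one playlist from the Spotify Web API using a bearer token."""
    request = urllib.request.Request(
        f"https://api.spotify.com/v1/playlists/{playlist_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def save_raw(s3_client, bucket: str, playlist: dict,
             now: Optional[datetime] = None) -> str:
    """Write the untouched API response to S3 and return the object key."""
    key = build_raw_key(now or datetime.now(timezone.utc))
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=json.dumps(playlist).encode("utf-8"),
    )
    return key


def lambda_handler(event, context):
    """Entry point invoked by the scheduled rule (hypothetical env var names)."""
    import boto3  # imported lazily so the helpers above can be tested without AWS

    playlist = fetch_playlist(os.environ["PLAYLIST_ID"], os.environ["SPOTIFY_TOKEN"])
    key = save_raw(boto3.client("s3"), os.environ["RAW_BUCKET"], playlist)
    return {"statusCode": 200, "key": key}
```

Passing the S3 client into `save_raw` keeps the write logic testable with a stand-in object; only `lambda_handler` touches real AWS resources.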

2. Transform

An AWS Glue Job (Spark) picks up the raw data from S3.
Data is cleaned and split into structured tables: Album, Artist, Song.
Transformed data is saved in:
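
The split into Album, Artist, and Song tables can be illustrated in plain Python. The Glue job itself runs on Spark, so this is only a sketch of the flattening and de-duplication logic, not the job's code. The output column names are assumptions, and the input shape assumed here is the Spotify playlist response, where each entry in `tracks.items` wraps a `track` object with nested `album` and `artists`.

```python
def split_playlist(raw: dict) -> dict:
    """Flatten one raw playlist response into album, artist, and song rows.

    Rows are keyed by their Spotify id, so repeated entries collapse to one row.
    """
    albums, artists, songs = {}, {}, {}
    for item in raw.get("tracks", {}).get("items", []):
        track = item.get("track")
        if not track:  # skip entries whose track is missing
            continue

        album = track["album"]
        albums[album["id"]] = {
            "album_id": album["id"],
            "album_name": album["name"],
            "release_date": album.get("release_date"),
        }

        for artist in track["artists"]:
            artists[artist["id"]] = {
                "artist_id": artist["id"],
                "artist_name": artist["name"],
            }

        songs[track["id"]] = {
            "song_id": track["id"],
            "song_name": track["name"],
            "duration_ms": track.get("duration_ms"),
            "popularity": track.get("popularity"),
            "album_id": album["id"],
            # Assumption: the first listed artist is treated as the primary one.
            "artist_id": track["artists"][0]["id"] if track["artists"] else None,
            "added_at": item.get("added_at"),
        }

    return {
        "album": list(albums.values()),
        "artist": list(artists.values()),
        "song": list(songs.values()),
    }
```

In a Spark job the same result would typically come from exploding the `items` array and calling `dropDuplicates` on each id column; the dictionaries here play that role.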

3. Load

Transformed files trigger Snowpipe using S3 event notifications (via SQS).
Data is loaded into Snowflake tables automatically.
Storage integration is used to securely connect Snowflake with S3.
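
The load step relies on three Snowflake objects: a storage integration, an external stage, and a pipe with auto-ingest enabled. The sketch below generates the corresponding SQL from Python so the three tables can share one template. It is not the contents of `Snowpipe.sql`: the object names, the CSV file format, and the function itself are assumptions, and the stage URL and IAM role ARN are left as parameters because the README does not state them.

```python
def snowpipe_statements(table: str, stage_url: str, role_arn: str,
                        integration: str = "s3_spotify_int") -> list:
    """Return the SQL needed to auto-ingest one table's files (hypothetical names)."""
    stage = f"{table}_stage"
    pipe = f"{table}_pipe"
    return [
        # 1. Storage integration: lets Snowflake assume an IAM role instead of
        #    storing AWS keys.
        f"CREATE STORAGE INTEGRATION {integration}\n"
        "  TYPE = EXTERNAL_STAGE\n"
        "  STORAGE_PROVIDER = 'S3'\n"
        "  ENABLED = TRUE\n"
        f"  STORAGE_AWS_ROLE_ARN = '{role_arn}'\n"
        f"  STORAGE_ALLOWED_LOCATIONS = ('{stage_url}');",
        # 2. External stage pointing at the transformed files (CSV is assumed).
        f"CREATE STAGE {stage}\n"
        f"  URL = '{stage_url}'\n"
        f"  STORAGE_INTEGRATION = {integration}\n"
        "  FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);",
        # 3. Pipe: AUTO_INGEST makes Snowflake load files as S3 events arrive.
        f"CREATE PIPE {pipe} AUTO_INGEST = TRUE AS\n"
        f"  COPY INTO {table} FROM @{stage};",
    ]


if __name__ == "__main__":
    # Placeholder URL and role ARN for illustration only.
    for statement in snowpipe_statements(
        "song",
        "s3://example-bucket/example-prefix/song/",
        "arn:aws:iam::123456789012:role/example-role",
    ):
        print(statement, end="\n\n")
```

Once a pipe exists, `SHOW PIPES` lists its `notification_channel`, which is the SQS queue ARN to set as the destination of the S3 event notification.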

4. Visualize

Data from Snowflake is imported into Power BI.
Dashboards provide near-real-time insights into song trends, artist performance, and more.
