An efficient data pipeline for scraping and extracting car listings from Moroccan used car marketplaces, storing the results in an AWS RDS PostgreSQL database, paired with a dynamic dashboard for visualizing key statistics, trends, and market insights.

L1xus/CarsDash

Used Cars Market end-to-end Project

This repository contains a personal project focused on building an end-to-end data pipeline for the used car market in Morocco. The pipeline automates data collection, cleaning, transformation, storage in an AWS RDS PostgreSQL database, and visualization through a Grafana dashboard.

Important

Feel free to submit a pull request if you believe any changes are necessary.

Table of Contents

Architecture diagram

Architecture

Overview

1. Data Scraping

The data scraping phase is managed by scripts in avito/ and moteur/.

  • Web Scraping: Use Python's BeautifulSoup to extract listings from the two used car marketplaces.
  • Attributes Collected: Scrape key attributes such as price, car_model, car_company, year, and km, among others.
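
The scraping step above can be sketched as follows. This is a minimal illustration, not the repository's actual scraper: the HTML snippet, the CSS class names, and the helpers `digits_to_int`, `field`, and `parse_listings` are all hypothetical, and the real marketplaces use different markup. Only the attribute names (price, car_model, car_company, year, km) come from the list above.

```python
from bs4 import BeautifulSoup

# Hypothetical listing markup; the real marketplace pages are structured differently.
SAMPLE_HTML = """
<div class="listing">
  <span class="company">Dacia</span>
  <span class="model">Logan</span>
  <span class="year">2018</span>
  <span class="km">85 000 km</span>
  <span class="price">95 000 DH</span>
</div>
<div class="listing">
  <span class="company">Renault</span>
  <span class="model">Clio</span>
  <span class="year">2015</span>
  <span class="km">120 000 km</span>
  <span class="price">78 000 DH</span>
</div>
"""


def digits_to_int(text):
    """Keep only the digits of a string such as '95 000 DH'; return None if there are none."""
    if not text:
        return None
    digits = "".join(ch for ch in text if ch.isdigit())
    return int(digits) if digits else None


def field(card, css_class):
    """Return the stripped text of the span with the given class, or None if it is missing."""
    node = card.find("span", class_=css_class)
    return node.get_text(strip=True) if node else None


def parse_listings(html):
    """Build one dict per listing card found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    listings = []
    for card in soup.find_all("div", class_="listing"):
        listings.append({
            "car_company": field(card, "company"),
            "car_model": field(card, "model"),
            "year": digits_to_int(field(card, "year")),
            "km": digits_to_int(field(card, "km")),
            "price": digits_to_int(field(card, "price")),
        })
    return listings


rows = parse_listings(SAMPLE_HTML)
```

Returning plain dicts keyed by the attribute names keeps the scraper independent of the database layer; missing fields come back as None so the cleaning phase can decide what to do with them.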

2. Data Cleaning & Transformation

The data cleaning and transformation phase is handled by functions in db/.

  • Data Cleaning: Remove rows with a null car_company, normalize car_company and car_model, and delete inappropriate data using SQL.
  • Transformation: Structure the data for relational storage and ensure compatibility with the PostgreSQL schema.
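
The cleaning rules listed above could look roughly like the sketch below. It is an assumption-laden illustration: it runs against an in-memory SQLite database as a stand-in for PostgreSQL so that it is self-contained, the table name `listings` and its columns are hypothetical, and the thresholds for "inappropriate" rows are invented for the example. The SQL functions used (`TRIM`, `UPPER`) behave the same way in PostgreSQL.

```python
import sqlite3

# Illustrative cleaning rules; the project's real SQL and thresholds may differ.
CLEANING_STATEMENTS = [
    # 1. Drop rows that have no manufacturer.
    "DELETE FROM listings WHERE car_company IS NULL OR TRIM(car_company) = ''",
    # 2. Normalize casing and surrounding whitespace.
    "UPDATE listings SET car_company = UPPER(TRIM(car_company)), "
    "car_model = UPPER(TRIM(car_model))",
    # 3. Delete implausible rows (hypothetical thresholds).
    "DELETE FROM listings WHERE price IS NULL OR price <= 0 OR year < 1950",
]


def clean(conn):
    """Apply the cleaning statements in order and commit."""
    cur = conn.cursor()
    for statement in CLEANING_STATEMENTS:
        cur.execute(statement)
    conn.commit()


# SQLite stands in for the PostgreSQL database here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE listings ("
    "car_company TEXT, car_model TEXT, year INTEGER, km INTEGER, price INTEGER)"
)
conn.executemany(
    "INSERT INTO listings VALUES (?, ?, ?, ?, ?)",
    [
        (" dacia ", "logan", 2018, 85000, 95000),
        (None, "Clio", 2015, 120000, 78000),      # removed: null car_company
        ("Renault", " Clio", 2015, 120000, 0),    # removed: price of zero
        ("RENAULT", "megane ", 2012, 190000, 60000),
    ],
)

clean(conn)
cleaned = conn.execute(
    "SELECT car_company, car_model, price FROM listings ORDER BY car_company"
).fetchall()
```

Keeping the rules as an ordered list of statements makes it easy to add or reorder a rule without touching the function that runs them.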

3. Data Storage

The data storage phase is executed by the insert script in db/.

  • AWS RDS Integration: Load cleaned and structured data into an AWS RDS PostgreSQL database.
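
As a rough sketch of the load step, the function below inserts rows through any Python DB-API connection with a parameterized query. Everything named here is an assumption: the `listings` table, the `COLUMNS` tuple, and the `insert_listings` helper are hypothetical, and the README does not say which PostgreSQL driver the project uses. The demo runs against in-memory SQLite so it is self-contained; with a driver such as psycopg2, the connection would instead point at the RDS endpoint and the default `%s` placeholder would apply.

```python
import sqlite3

# Hypothetical column order for the target table.
COLUMNS = ("car_company", "car_model", "year", "km", "price")


def insert_listings(conn, rows, placeholder="%s"):
    """Insert dict rows through a DB-API connection; return how many rows were sent.

    `placeholder` is the driver's parameter marker: "%s" for psycopg2, "?" for sqlite3.
    """
    marks = ", ".join([placeholder] * len(COLUMNS))
    sql = f"INSERT INTO listings ({', '.join(COLUMNS)}) VALUES ({marks})"
    # Values are passed as parameters, never formatted into the SQL string.
    params = [tuple(row.get(col) for col in COLUMNS) for row in rows]
    cur = conn.cursor()
    cur.executemany(sql, params)
    conn.commit()
    return len(params)


# Demo against SQLite as a stand-in for the RDS PostgreSQL database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE listings ("
    "car_company TEXT, car_model TEXT, year INTEGER, km INTEGER, price INTEGER)"
)
inserted = insert_listings(
    conn,
    [
        {"car_company": "DACIA", "car_model": "LOGAN", "year": 2018, "km": 85000, "price": 95000},
        {"car_company": "RENAULT", "car_model": "MEGANE", "year": 2012, "km": 190000, "price": 60000},
    ],
    placeholder="?",
)
stored = conn.execute("SELECT COUNT(*) FROM listings").fetchone()[0]
```

Passing values as query parameters rather than building the SQL by string formatting avoids quoting bugs and SQL injection from scraped text.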

4. Data Visualization

The data visualization phase is implemented through grafana/.

  • Grafana Integration: Configure an interactive dashboard to visualize market trends and pricing for every car_company.
  • Grafana Cloud: Use the cloud-hosted service for dynamic, real-time data exploration.

dashboard_01 dashboard_02

Tech Stack

Python badge Amazon Cloud badge Docker badge Grafana badge

Prerequisites

Run the project

  1. Clone the repository:
  git clone https://github.com/L1xus/CarsDash.git
  cd CarsDash
  2. Run docker-compose:
  docker-compose up --build
  3. Create the dashboard:
  cd grafana
  docker-compose up --build
  4. Import the Cars Dashboard JSON file into Grafana.

Future Enhancements

  • Add support for filtering by car_model and year.
  • Automate the pipeline to run every month.
  • Enhance data quality by creating a machine learning model to filter car listings.
  • Consider a better option to make the dashboard public.
