code-with-qasim/data_engineer_dbt

data_engineer_dbt

Learn the basics of data engineering by working with dbt.

Remember to add your local database details to your profiles.yml file. You can also set up dbt for BigQuery, Snowflake, Redshift, and Databricks.

Project Overview

This project demonstrates how to use dbt (Data Build Tool) for data engineering workflows, including setting up a local development environment, seeding raw data, and building views in a local Postgres database.


Prerequisites

  • Python 3.7+ (recommended: Python 3.11; recent dbt-core releases have dropped support for Python 3.7)
  • PostgreSQL (local instance running and accessible)
  • pip (Python package manager)

Setup Instructions

1. Create and Activate a Python Virtual Environment

A Python virtual environment is already present in the dbt_env/ directory. If you need to recreate it, run:

python3 -m venv dbt_env

Activate the virtual environment:

  • macOS/Linux:
    source dbt_env/bin/activate
  • Windows:
    dbt_env\Scripts\activate

You should see (dbt_env) in your terminal prompt when the environment is active.
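
If the prompt is customized and does not show the prefix, the environment can also be checked from the shell. The following is a minimal sketch: it relies on the `VIRTUAL_ENV` variable that the activate script exports, and `in_venv` is a hypothetical helper name, not something shipped with this project.

```shell
# Succeeds only when a virtual environment has been activated in this shell.
# The activate script exports VIRTUAL_ENV; in_venv is a hypothetical helper name.
in_venv() {
  [ -n "${VIRTUAL_ENV:-}" ]
}

if in_venv; then
  echo "virtual environment active: $VIRTUAL_ENV"
else
  echo "no virtual environment active; run: source dbt_env/bin/activate"
fi
```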


2. Install dbt and Required Packages

With the virtual environment activated, install dbt and the Postgres adapter:

pip install dbt-core dbt-postgres

3. Configure dbt

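  • Edit your profiles.yml (usually at ~/.dbt/profiles.yml) to point to your local Postgres database.
  • Set the host, port, user, password, database name, and schema to match your environment.
  • Make sure the top-level profile name matches the profile key in the project's dbt_project.yml.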
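
As a rough illustration of what such a profile might look like, the sketch below writes a minimal Postgres profile to a scratch directory. The profile name `jaffle_shop`, the target name `dev`, and every connection value are assumptions to replace with your own; `write_profile` is a hypothetical helper, not part of dbt.

```shell
# Write a minimal Postgres profile into the given directory (hypothetical helper).
# Assumed values: profile name "jaffle_shop", target "dev", and every connection
# setting below; replace them with your own.
write_profile() {
  dir="$1"
  mkdir -p "$dir"
  if [ -e "$dir/profiles.yml" ]; then
    echo "refusing to overwrite $dir/profiles.yml" >&2
    return 1
  fi
  cat > "$dir/profiles.yml" <<'EOF'
jaffle_shop:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      port: 5432
      user: postgres
      password: change_me
      dbname: postgres
      schema: public
      threads: 4
EOF
}

# Write to a scratch directory first so an existing ~/.dbt/profiles.yml is untouched.
scratch_dir="$(mktemp -d)"
write_profile "$scratch_dir"
cat "$scratch_dir/profiles.yml"
```

After reviewing the file, copy it to ~/.dbt/profiles.yml, or point dbt at the directory through the `DBT_PROFILES_DIR` environment variable or the `--profiles-dir` flag. Running `dbt debug` from the project directory then checks whether the connection works.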

4. Seed Raw Data

To load the raw data from the CSV files in jaffle_shop/seeds/ into your Postgres database, run the following from the directory that contains dbt_project.yml:

dbt seed

By default, dbt creates one table per CSV file in the schema configured in your profile.


5. Build Models (Create Views)

To build the dbt models (e.g., create views from the seeded data), run:

dbt run

This will execute the SQL models in jaffle_shop/models/ and create the corresponding views in your database.


Example Workflow

# Activate the virtual environment
source dbt_env/bin/activate

# Install dbt and Postgres adapter
pip install dbt-core dbt-postgres

# Edit ~/.dbt/profiles.yml to configure your Postgres connection (skip if already configured)

# Seed the raw data
dbt seed

# Build the models (create views)
dbt run
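
The same steps can be wrapped so that a failure stops the sequence early. This is only a sketch under a few assumptions: `require` and `run_dbt_workflow` are hypothetical helper names, the virtual environment is already active, and the function is called from the directory containing dbt_project.yml. It adds `dbt debug` before seeding so that connection problems surface first.

```shell
# Succeed only if the given command is on PATH (hypothetical helper).
require() {
  command -v "$1" >/dev/null 2>&1 || {
    echo "missing command: $1" >&2
    return 1
  }
}

# Run the dbt steps in order, stopping at the first failure (hypothetical helper).
# Assumes the current directory contains dbt_project.yml.
run_dbt_workflow() {
  require dbt || return 1
  dbt debug && dbt seed && dbt run
}

# Report whether dbt is available without running anything against the database.
if require dbt; then
  echo "dbt found; call run_dbt_workflow from the project directory"
fi
```

Calling `run_dbt_workflow` then checks the connection, loads the seeds, and builds the models in one go.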
