Learn the basics of data engineering by working with dbt.
Remember to add your local database details to the profiles.yml file. You can also set up dbt for BigQuery, Snowflake, Redshift, or Databricks.
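Each warehouse has its own adapter package on PyPI; installing the matching adapter alongside dbt-core is enough to target it. For example:
# Adapter packages for other warehouses (install only the one you need)
pip install dbt-bigquery
pip install dbt-snowflake
pip install dbt-redshift
pip install dbt-databricks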
This project demonstrates how to use dbt (Data Build Tool) for data engineering workflows, including setting up a local development environment, seeding raw data, and building views in a local Postgres database.
- Python 3.7+ (recommended: Python 3.11)
- PostgreSQL (local instance running and accessible)
- pip (Python package manager)
A Python virtual environment is already present in the dbt_env/ directory. If you need to recreate it, run:
python3 -m venv dbt_env
Activate the virtual environment:
- macOS/Linux:
source dbt_env/bin/activate
- Windows:
dbt_env\Scripts\activate
You should see (dbt_env) in your terminal prompt when the environment is active.
With the virtual environment activated, install dbt and the Postgres adapter:
pip install dbt-core dbt-postgres
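To verify the installation, check the version; dbt reports dbt-core along with each installed adapter:
dbt --version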
- Edit your profiles.yml (usually at ~/.dbt/profiles.yml) to point to your local Postgres database.
- Update connection details as needed for your environment (a sample configuration is sketched below).
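For reference, a minimal profiles.yml for a local Postgres instance might look like the following sketch. The top-level profile name must match the profile: entry in your dbt_project.yml (assumed here to be jaffle_shop), and every connection value is a placeholder to replace with your own:
jaffle_shop:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost        # assumption: Postgres running locally
      port: 5432             # assumption: default Postgres port
      user: postgres         # placeholder credentials
      password: postgres
      dbname: jaffle_shop    # placeholder database name
      schema: public
      threads: 4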
To load the raw data from the CSV files in jaffle_shop/seeds/ into your Postgres database, run:
dbt seed
This creates one table in your database per CSV file.
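For example, a hypothetical jaffle_shop/seeds/raw_customers.csv with a header row would be loaded as a raw_customers table whose columns match the header:
id,first_name,last_name
1,Ada,Lovelace
2,Grace,Hopper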
To build the dbt models (e.g., create views from the seeded data), run:
dbt run
This will execute the SQL models in jaffle_shop/models/ and create the corresponding views in your database.
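As an illustration, each model is just a SELECT statement. A hypothetical jaffle_shop/models/stg_customers.sql that builds a view over the seeded raw_customers table could look like:
-- stg_customers.sql: hypothetical staging model over a seeded table
{{ config(materialized='view') }}

select
    id as customer_id,
    first_name,
    last_name
from {{ ref('raw_customers') }}
dbt resolves {{ ref('raw_customers') }} to the seeded table's fully qualified name and uses the reference to infer build order.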
# Activate the virtual environment
source dbt_env/bin/activate
# Install dbt and Postgres adapter
pip install dbt-core dbt-postgres
# (Optional) Edit ~/.dbt/profiles.yml to configure your Postgres connection
# Seed the raw data
dbt seed
# Build the models (create views)
dbt run