dkapitan/dagster-data-station

🧰 Dagster Data Station

Demonstration of an end-to-end data station with Dagster as the main orchestrator. The data station follows a lakehouse design and uses Polars and DuckDB so that it can process up to 1 TB of data on a single node. Data stations are envisaged to participate in a network for federated analytics or in a data mesh.

Getting started

Installing dependencies

Option 1: uv

Make sure uv is installed, following its official documentation.

Create a virtual environment and install the required dependencies with uv's sync command:

uv sync

Then, activate the virtual environment:

OS             Command
macOS / Linux  source .venv/bin/activate
Windows        .venv\Scripts\activate

Option 2: pip

Create a virtual environment with Python's built-in venv module:

python3 -m venv .venv

Then activate the virtual environment:

OS             Command
macOS / Linux  source .venv/bin/activate
Windows        .venv\Scripts\activate

Install the required dependencies:

pip install -e ".[dev]"
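
The two options above can be folded into one small helper that prefers uv when it is on the PATH and otherwise falls back to the venv and pip steps. This is only a sketch for a macOS or Linux shell. The function name install_cmd is hypothetical and not part of this repository, and it prints the commands rather than running them.

```shell
#!/usr/bin/env sh
# Hypothetical helper (not part of the repository): print the install
# command(s) from this README that match the tools found on the PATH.
install_cmd() {
    if command -v uv >/dev/null 2>&1; then
        # Option 1: uv creates the virtual environment and installs everything.
        echo "uv sync"
    else
        # Option 2: create the environment, activate it, then install with pip.
        echo 'python3 -m venv .venv && source .venv/bin/activate && pip install -e ".[dev]"'
    fi
}

install_cmd
```

Printing instead of executing keeps the helper safe to run anywhere; copy the printed line into the terminal from the repository root to perform the install.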

Running Dagster

Start the Dagster UI web server:

dg dev

Open http://localhost:3000 in your browser to see the project.
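
If the page does not load straight away, the server may still be starting. As a rough sketch, the helper below polls the URL until it answers. It assumes curl is installed, and the name wait_for_ui and its retry count are illustrative rather than anything provided by Dagster or this repository.

```shell
# Hypothetical helper: return 0 as soon as the URL answers,
# or 1 after the given number of one-second attempts.
wait_for_ui() {
    url="${1:-http://localhost:3000}"   # address from this README
    tries="${2:-30}"                    # assumed default, adjust as needed
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -f: treat HTTP errors as failure, -s: no progress output
        if curl -fs -o /dev/null "$url" 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}
```

Run it from a second terminal while dg dev is starting, for example: wait_for_ui http://localhost:3000 60 && echo "Dagster UI is up"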
