Description
Dagster implements a nice separation between project creation and project execution, and I believe this approach could be very valuable for Kedro to resolve our struggles with heavy dependencies that are not needed in production.
This separation is not only about keeping creation-time and runtime dependencies separate (which helps keep production environments lean), but also about improving the overall developer experience.
1. Project creation
A new Dagster project is created using uvx and a dedicated project-creation tool:
uvx create-dagster@latest project my-project
cd my-project
source .venv/bin/activate
Here, uvx runs the project generator in a temporary, isolated environment, so users do not need to install or manage the generator’s dependencies themselves.
We currently recommend doing something similar in our quickstart (using uvx with kedro new), but it makes less sense there, because afterwards I still need to create a proper .venv with the same kedro package installed in order to run the project.
create-dagster automatically asks whether you want to run uv sync. If you answer yes, it creates a .venv with all dependencies from pyproject.toml installed.
A `uv` installation was detected. Run `uv sync`? This will create a uv.lock file and the virtual environment you need to activate in order to work on this project. If you wish to use a non-uv package manager, choose "n". (y/n) [y]:
At this step, we recommend using uv run kedro run, which does not seem very clear to me: it creates a .venv inside the project but activates it only for the duration of the command. If I want to continue working on my new project, it feels better to activate the environment explicitly.
2. Project execution
Once the project is created and the .venv is activated, execution and orchestration are handled by the Dagster runtime:
dg dev
If you need to modify your Dagster project and add some assets, you should use commands from the main package, such as dg scaffold defs.
I think implementing the same approach would allow us to make core Kedro less heavy:
- Move kedro-new into a separate library, since it brings Cookiecutter with it. I think it is possible to do this without a major release if we add a thin wrapper around kedro new in the Kedro CLI, which would use the kedro-new library when it is installed or prompt the user to install it.
- Keep the rest, such as kedro pipeline create, inside kedro.
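To make the thin-wrapper idea more concrete, here is a rough sketch of how kedro new could delegate to an optional generator package. Everything named here is an assumption for illustration only: the module name kedro_new, its main(args) entry point, and the install hint are hypothetical and do not exist in Kedro today.

```python
import importlib
import importlib.util
import sys

# Hypothetical import name of the extracted generator library.
GENERATOR_MODULE = "kedro_new"

# Hypothetical install hint; the real distribution name would be decided later.
INSTALL_HINT = (
    "The project generator is not installed. "
    "Install it with `pip install kedro-new`, or run it via uvx."
)


def generator_available(module_name: str = GENERATOR_MODULE) -> bool:
    """Return True if the optional generator package can be imported."""
    return importlib.util.find_spec(module_name) is not None


def run_new(args: list[str], module_name: str = GENERATOR_MODULE) -> int:
    """Delegate `kedro new` to the generator, or tell the user how to get it."""
    if not generator_available(module_name):
        print(INSTALL_HINT, file=sys.stderr)
        return 1
    generator = importlib.import_module(module_name)
    # Assumed entry point: the generator exposes main(args) returning an exit code.
    return int(generator.main(args) or 0)
```

With a wrapper like this, core Kedro would only import the generator (and therefore Cookiecutter) when kedro new is actually invoked, so production installs could skip it.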
I also think it would be nice to embed uv sync into kedro new, the same way Dagster does.
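As a minimal sketch of what that could look like, assuming kedro new gains a post-generation step: detect uv on the PATH, ask the user, and run uv sync in the new project directory. The function names are hypothetical; only shutil.which, subprocess.run, and the uv sync command itself are existing APIs.

```python
import shutil
import subprocess
from pathlib import Path


def uv_detected() -> bool:
    """Return True if a `uv` executable is found on the PATH."""
    return shutil.which("uv") is not None


def should_sync(answer: str, default: bool = True) -> bool:
    """Interpret a y/n answer; an empty answer falls back to the default."""
    answer = answer.strip().lower()
    if not answer:
        return default
    return answer in {"y", "yes"}


def maybe_uv_sync(project_dir: Path, answer: str) -> bool:
    """Run `uv sync` in project_dir if uv is present and the user agreed.

    Returns True if `uv sync` was run, False if the step was skipped.
    """
    if not uv_detected() or not should_sync(answer):
        return False
    # Creates uv.lock and .venv from the project's pyproject.toml.
    subprocess.run(["uv", "sync"], cwd=project_dir, check=True)
    return True
```

In a real CLI the answer would come from an interactive prompt defaulting to yes, mirroring the Dagster message quoted above; users of other package managers would answer n and set up their environment themselves.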