- Install prek: `uv tool install prek`
- Enable commit hooks: `prek install`
- Never run `pytest`, `python`, or `airflow` commands directly on the host — always use `breeze`.
- Place temporary scripts in `dev/` (mounted as `/opt/airflow/dev/` inside Breeze).
- Run a single test: `breeze run pytest path/to/test.py::TestClass::test_method -xvs`
- Run a test file: `breeze run pytest path/to/test.py -xvs`
- Run a Python script: `breeze run python dev/my_script.py`
- Run Airflow CLI: `breeze run airflow dags list`
- Type-check: `breeze run mypy path/to/code`
- Lint/format (runs on host): `prek run --all-files`
- Lint with ruff only: `prek run ruff --all-files`
- Format with ruff only: `prek run ruff-format --all-files`
- Build docs: `breeze build-docs`
SQLite is the default backend. Use `--backend postgres` or `--backend mysql` for integration tests that need those databases. If Docker networking fails, run `docker network prune`.
UV workspace monorepo. Key paths:

- `airflow-core/src/airflow/` — core scheduler, API, CLI, models
  - `models/` — SQLAlchemy models (DagModel, TaskInstance, DagRun, Asset, etc.)
  - `jobs/` — scheduler, triggerer, Dag processor runners
  - `api_fastapi/core_api/` — public REST API v2, UI endpoints
  - `api_fastapi/execution_api/` — task execution communication API
  - `dag_processing/` — Dag parsing and validation
  - `cli/` — command-line interface
  - `ui/` — React/TypeScript web interface (Vite)
- `task-sdk/` — lightweight SDK for Dag authoring and the task execution runtime
  - `src/airflow/sdk/execution_time/` — task runner, supervisor
- `providers/` — 100+ provider packages, each with its own `pyproject.toml`
- `airflow-ctl/` — management CLI tool
- `chart/` — Helm chart for Kubernetes deployment
- Users author Dags with the Task SDK (`airflow.sdk`).
- Dag Processor parses Dag files in isolated processes and stores serialized Dags in the metadata DB.
- Scheduler reads serialized Dags — never runs user code — and creates Dag runs / task instances.
- Workers execute tasks via Task SDK and communicate with the API server through the Execution API — never access the metadata DB directly.
- API Server serves the React UI and handles all client-database interactions.
- Triggerer evaluates deferred tasks/sensors in isolated processes.
- No `assert` in production code.
- `time.monotonic()` for durations, not `time.time()`.
- In `airflow-core`, functions with a `session` parameter must not call `session.commit()`. Use keyword-only `session` parameters.
- Imports at top of file. Valid exceptions: circular imports, lazy loading for worker isolation, `TYPE_CHECKING` blocks.
- Guard heavy type-only imports (e.g., `kubernetes.client`) with `TYPE_CHECKING` in multi-process code paths.
- Apache License header on all new files (prek enforces this).
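A minimal sketch tying a few of these rules together — the function names here are hypothetical illustrations, not Airflow APIs:

```python
from __future__ import annotations

import time
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Heavy, type-only import: never paid for at runtime in forked processes.
    from sqlalchemy.orm import Session


def measure(func) -> float:
    """Time a call with time.monotonic(), which is immune to wall-clock jumps."""
    start = time.monotonic()
    func()
    return time.monotonic() - start


def pause_dag(dag_id: str, *, session: Session) -> None:
    """`session` is keyword-only, and the function never calls session.commit();
    the caller owns the transaction."""
    # ... mutate ORM objects via `session` here, but do not commit ...
```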
- Add tests for new behavior — cover success, failure, and edge cases.
- Use pytest patterns, not `unittest.TestCase`.
- Use `spec`/`autospec` when mocking.
- Use `time_machine` for time-dependent tests.
- Use `@pytest.mark.parametrize` for multiple similar inputs.
- Test fixtures: `devel-common/src/tests_common/pytest_plugin.py`.
- Test location mirrors source: `airflow/cli/cli_parser.py` → `tests/cli/test_cli_parser.py`.
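To illustrate the `spec`/`autospec` rule (the class below is a stand-in, not real Airflow code): an autospec'd mock rejects any call that the real signature would reject, so tests fail fast when an interface changes.

```python
from unittest.mock import create_autospec


class DagLookup:
    """Hypothetical collaborator being mocked in a test."""

    def get_dag(self, dag_id: str):
        raise NotImplementedError


def test_lookup_called_once_with_id():
    lookup = create_autospec(DagLookup, instance=True)
    lookup.get_dag("example_dag")
    # The autospec'd mock validates both call count and arguments...
    lookup.get_dag.assert_called_once_with("example_dag")
    # ...and raises TypeError for calls the real signature would reject.
    try:
        lookup.get_dag()
    except TypeError:
        pass
    else:
        raise AssertionError("autospec should enforce the signature")
```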
Write commit messages focused on user impact, not implementation details.
- Good: `Fix airflow dags test command failure without serialized Dags`
- Good: `UI: Fix Grid view not refreshing after task actions`
- Bad: `Initialize DAG bundles in CLI get_dag function`
Add a newsfragment for user-visible changes:
```shell
echo "Brief description" > airflow-core/newsfragments/{PR_NUMBER}.{bugfix|feature|improvement|doc|misc|significant}.rst
```
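A concrete (entirely made-up) example — suppose PR #12345 fixed a UI bug, making `bugfix` the right category:

```shell
# Hypothetical example: PR #12345 fixes a UI bug, so the file is 12345.bugfix.rst.
# The directory exists in a real checkout; created here so the sketch is self-contained.
mkdir -p airflow-core/newsfragments
echo "Fix Grid view not refreshing after task actions" \
  > airflow-core/newsfragments/12345.bugfix.rst
```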
Always push to the user's fork, not to the upstream `apache/airflow` repo. Never push directly to `main`.
Before pushing, determine the GitHub username with `gh api user -q .login` and identify the user's fork remote from the existing remotes. Run `git remote -v` and look for a remote pointing to `github.com:<GITHUB_USER>/airflow.git` where `<GITHUB_USER>` is not `apache`. That is the user's fork remote. If no such remote exists, create the fork and add it:

```shell
gh repo fork apache/airflow --remote --remote-name <GITHUB_USER>
```

Then push the branch to the user's fork remote and open the PR creation page in the browser with the body pre-filled (including the generative AI disclosure already checked):
```shell
git push -u <GITHUB_USER> <branch-name>
gh pr create --web --title "Short title (under 70 chars)" --body "$(cat <<'EOF'
Brief description of the changes.

closes: #ISSUE (if applicable)

---

##### Was generative AI tooling used to co-author this PR?

- [X] Yes — <Agent Name and Version>

Generated-by: <Agent Name and Version> following [the guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)
EOF
)"
```

The `--web` flag opens the browser so the user can review and submit. The `--body` flag pre-fills the PR template with the generative AI disclosure already completed.
Remind the user to:
- Review the PR title — keep it short (under 70 chars) and focused on user impact.
- Add a brief description of the changes at the top of the body.
- Reference related issues when applicable (`closes: #ISSUE` or `related: #ISSUE`).
- Ask first:
  - Large cross-package refactors.
  - New dependencies with broad impact.
  - Destructive data or migration changes.
- Never:
  - Commit secrets, credentials, or tokens.
  - Edit generated files by hand when a generation workflow exists.
  - Use destructive git operations unless explicitly requested.