Merged
23 changes: 23 additions & 0 deletions .github/workflows/export-to-airflow.yml
@@ -0,0 +1,23 @@
name: Export Repo to Airflow

on:
  push:
    branches: [ "main" ]

permissions:
  contents: read

jobs:
  send-to-airflow-s3-bucket:
    runs-on: ubuntu-latest

    steps:
      - name: send repository dispatch
        run: |
          curl -L \
            -X POST \
            -H "Accept: application/vnd.github+json" \
            -H "Authorization: Bearer ${{ secrets.INCLUDE_AF_PAT }}" \
            -H "X-GitHub-Api-Version: 2022-11-28" \
            https://api.github.com/repos/include-dcc/include-dbt-airflow-mirror/dispatches \
            -d '{"event_type":"export-to-airflow","client_payload":{"repo": "include-dbt-sandbox", "ref": "'${{ github.ref }}'", "run_id": "'${{ github.run_id }}'"}}'
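
For readers more comfortable with Python than curl, the dispatch call in the workflow above could be sketched roughly as follows. This is an illustration only: the helper name `build_dispatch_request` and the placeholder token are assumptions and not part of this PR, while the URL, headers, and payload fields are copied from the workflow. The request is built but deliberately not sent.

```python
import json
import urllib.request

# Endpoint copied from the workflow's curl command.
DISPATCH_URL = (
    "https://api.github.com/repos/include-dcc/"
    "include-dbt-airflow-mirror/dispatches"
)


def build_dispatch_request(token: str, ref: str, run_id: str) -> urllib.request.Request:
    """Build (but do not send) the repository_dispatch POST from the workflow.

    Hypothetical helper for illustration; it is not part of the PR.
    """
    payload = {
        "event_type": "export-to-airflow",
        "client_payload": {
            "repo": "include-dbt-sandbox",
            "ref": ref,        # the workflow passes github.ref here
            "run_id": run_id,  # the workflow passes github.run_id as a string
        },
    }
    return urllib.request.Request(
        DISPATCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "X-GitHub-Api-Version": "2022-11-28",
        },
        method="POST",
    )


if __name__ == "__main__":
    # "<token>" is a placeholder; the workflow reads the real value from a secret.
    req = build_dispatch_request("<token>", "refs/heads/main", "123")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Building the payload with `json.dumps` avoids the nested shell quoting that the curl version needs around `github.ref` and `github.run_id`.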
93 changes: 93 additions & 0 deletions dags/tutorial.py
@@ -0,0 +1,93 @@
"""
This DAG is a tutorial example DAG that demonstrates the basic structure
of an Airflow DAG and how to use the BashOperator. It's taken from the
Airflow documentation here https://airflow.apache.org/docs/apache-airflow/stable/tutorial/fundamentals.html.

This dag prints the current date, then sleeps for 5 seconds, and finally
runs a templated bash command that prints the execution date 5 times.
"""

import textwrap
from datetime import datetime, timedelta

# The DAG object; we'll need this to instantiate a DAG
from airflow.models.dag import DAG

# Operators; we need this to operate!
from airflow.operators.bash import BashOperator

with DAG(
    "tutorial",
    # These args will get passed on to each operator
    # You can override them on a per-task basis during operator initialization
    default_args={
        "depends_on_past": False,
        "email": ["[email protected]"],
        "email_on_failure": False,
        "email_on_retry": False,
        "retries": 1,
        "retry_delay": timedelta(minutes=5),
        # 'queue': 'bash_queue',
        # 'pool': 'backfill',
        # 'priority_weight': 10,
        # 'end_date': datetime(2016, 1, 1),
        # 'wait_for_downstream': False,
        # 'sla': timedelta(hours=2),
        # 'execution_timeout': timedelta(seconds=300),
        # 'on_failure_callback': some_function, # or list of functions
        # 'on_success_callback': some_other_function, # or list of functions
        # 'on_retry_callback': another_function, # or list of functions
        # 'sla_miss_callback': yet_another_function, # or list of functions
        # 'on_skipped_callback': another_function, #or list of functions
        # 'trigger_rule': 'all_success'
    },
    description="A simple tutorial DAG",
    schedule=timedelta(days=1),
    start_date=datetime(2021, 1, 1),
    catchup=False,
    tags=["example"],
) as dag:

    # t1, t2 and t3 are examples of tasks created by instantiating operators
    t1 = BashOperator(
        task_id="print_date",
        bash_command="date",
    )

    t2 = BashOperator(
        task_id="sleep",
        depends_on_past=False,
        bash_command="sleep 5",
        retries=3,
    )
    t1.doc_md = textwrap.dedent(
        """\
    #### Task Documentation
    You can document your task using the attributes `doc_md` (markdown),
    `doc` (plain text), `doc_rst`, `doc_json`, `doc_yaml` which gets
    rendered in the UI's Task Instance Details page.
    ![img](https://imgs.xkcd.com/comics/fixing_problems.png)
    **Image Credit:** Randall Munroe, [XKCD](https://xkcd.com/license.html)
    """
    )

    dag.doc_md = __doc__  # providing that you have a docstring at the beginning of the DAG; OR
    dag.doc_md = """
    This is a documentation placed anywhere
    """  # otherwise, type it like this
    templated_command = textwrap.dedent(
        """
    {% for i in range(5) %}
        echo "{{ ds }}"
        echo "{{ macros.ds_add(ds, 7)}}"
    {% endfor %}
    """
    )

    t3 = BashOperator(
        task_id="templated",
        depends_on_past=False,
        bash_command=templated_command,
    )

    t1 >> [t2, t3]
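
The comment in the DAG about `default_args` being passed to each operator and overridable per task can be pictured with a small toy model. The sketch below is plain Python, not Airflow code: `effective_args` is a hypothetical helper, and the dictionary holds only a subset of the DAG's defaults. It shows the general idea that an explicit task argument (such as `retries=3` on the `sleep` task) takes precedence over the DAG-level default.

```python
from datetime import timedelta

# A subset of the values from the DAG's default_args.
default_args = {
    "depends_on_past": False,
    "email_on_failure": False,
    "email_on_retry": False,
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}


def effective_args(defaults: dict, **task_kwargs) -> dict:
    """Toy model: explicit task arguments win over DAG-level defaults."""
    return {**defaults, **task_kwargs}


t1_args = effective_args(default_args)             # print_date: no overrides
t2_args = effective_args(default_args, retries=3)  # sleep: overrides retries

print(t1_args["retries"], t2_args["retries"])  # prints: 1 3
```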

Maybe add a comment explaining what this does.

Collaborator Author

I added a docstring at the top of the script indicating what it does. Good idea!

What does this BashOperator syntax do?

Collaborator Author

@chris-s-friedman chris-s-friedman Dec 19, 2025

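Sorry - are you asking about t1 >> [t2, t3]? That line sets the task dependencies: t2 and t3 both depend on t1, so with the default trigger rule neither runs until t1 has finished successfully. t2 and t3 do not depend on each other.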
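
To make the `>>` behaviour concrete, here is a toy model in plain Python. The `Task` class is a stand-in written for this illustration and is not Airflow's implementation; in Airflow itself, `t1 >> [t2, t3]` has the same effect as `t1.set_downstream([t2, t3])`.

```python
class Task:
    """Toy stand-in for an Airflow operator; it only tracks dependencies."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.upstream = set()    # task_ids that must run before this task
        self.downstream = set()  # task_ids that run after this task

    def __rshift__(self, other):
        """self >> other: make `other` (a task or list of tasks) downstream of self."""
        targets = other if isinstance(other, list) else [other]
        for task in targets:
            self.downstream.add(task.task_id)
            task.upstream.add(self.task_id)
        return other


t1, t2, t3 = Task("print_date"), Task("sleep"), Task("templated")
t1 >> [t2, t3]

print(sorted(t1.downstream))  # ['sleep', 'templated']
print(t2.upstream, t3.upstream)  # {'print_date'} {'print_date'}
```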

Otherwise, if you mean the BashOperator itself, this one:

    t3 = BashOperator(
        task_id="templated",
        depends_on_past=False,
        bash_command=templated_command,
    )

instructs Airflow to render templated_command as a Jinja template and run the result as a bash script. In this case the template is:

    {% for i in range(5) %}
        echo "{{ ds }}"
        echo "{{ macros.ds_add(ds, 7)}}"
    {% endfor %}
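
As a rough picture of what that template expands to, the sketch below reproduces the loop in plain Python. It is not how Airflow renders templates (Airflow uses Jinja): `ds_add` here is a local stand-in for Airflow's `macros.ds_add`, `rendered_commands` is a hypothetical helper, and the date passed in is just an example value for `ds`, the run's logical date in YYYY-MM-DD form.

```python
from datetime import datetime, timedelta


def ds_add(ds: str, days: int) -> str:
    """Local stand-in for Airflow's macros.ds_add: shift a YYYY-MM-DD date by `days`."""
    shifted = datetime.strptime(ds, "%Y-%m-%d") + timedelta(days=days)
    return shifted.strftime("%Y-%m-%d")


def rendered_commands(ds: str, repeats: int = 5) -> list:
    """Approximate the echo commands produced by the template's for loop."""
    lines = []
    for _ in range(repeats):
        lines.append(f'echo "{ds}"')
        lines.append(f'echo "{ds_add(ds, 7)}"')
    return lines


if __name__ == "__main__":
    # Example value only; Airflow supplies ds for each DAG run.
    for line in rendered_commands("2021-01-01"):
        print(line)
```

For an example `ds` of 2021-01-01, this yields ten echo commands that alternate between 2021-01-01 and 2021-01-08, so each date is printed five times.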