This is a sample project demonstrating how to implement automation in Kestra, with emphasis on fully containerized runtime execution and a GitOps-first approach:

- Executing commands in container images (OCI images) - e.g. running Python scripts inside a container image (with additional parameters)
- Building reusable subflows (modular approach) using the `io.kestra.plugin.core.flow.Subflow` plugin
- Building various types of workflows: parallel execution of tasks (using the `io.kestra.plugin.core.flow.Parallel` plugin) and parallel execution with sequential parts (combining the `io.kestra.plugin.core.flow.Parallel` and `io.kestra.plugin.core.flow.Sequential` plugins)
- Triggering another workflow when the first workflow finishes (successfully)
- Building a container image (OCI image) with an application (designed for applications written in scripting languages like Python, PowerShell, Bash, ...) - using Kestra as a CI/CD tool
- ...
About OCI images: https://opencontainers.org/

Sample Python Automation Boilerplate project used in the workflows: https://github.com/leonkosak/python-automation-boilerplate
It's important to understand that this project is fully standalone and "unaware" of the orchestrator that performs the actual automation (by executing shell commands inside `io.kestra.plugin.docker.Run` in (sub)workflows).
The described approach takes modularity even further: it makes it easier to replace the orchestrator in the future if needed.
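As a minimal sketch of what "orchestrator-unaware" means on the automation-project side: the script only receives plain CLI arguments and never imports any orchestrator API. The `--orchestrator` argument name comes from the subflow examples in this document; the rest of the script is a hypothetical illustration, not the boilerplate's actual code:

```python
import argparse

def parse_args(argv=None):
    # The script depends only on plain CLI arguments; it does not import or
    # call any Kestra API, so any orchestrator (or a human) can invoke it.
    parser = argparse.ArgumentParser(
        description="Orchestrator-unaware automation entry point (illustrative)")
    parser.add_argument("--orchestrator", default="NONE",
                        help="Name of the calling orchestrator, e.g. KESTRA (informational only)")
    return parser.parse_args(argv)

# Example invocation with explicit argv, as Kestra would pass it via `commands`:
args = parse_args(["--orchestrator", "KESTRA"])
print(f"Started by orchestrator: {args.orchestrator}")
```

Swapping Kestra for another orchestrator then only changes who builds this command line, not the script itself.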
- Even though this demo project for Kestra is based on a sample Python project, it can be reused with a minimal amount of trivial changes (especially for building and executing other projects written in scripting languages). This is possible due to the "fully containerized approach" - from development to deployment.
- For building container (OCI) images in languages that require a compilation process (e.g. Java, C#, C++, ...), more changes are needed in the workflow that provides the CI/CD functionality. Other workflows which execute automation require a minimal amount of changes (mostly just changed shell commands) inside the `io.kestra.plugin.docker.Run` part (in workflows and subflows).
Because of the GitOps-first approach to development, it's recommended that workflows are developed outside Kestra.
(It's possible to set up a workflow in Kestra that periodically commits workflows (.yml files) to a Git repository, but here the opposite approach is implemented - a dedicated workflow in Kestra periodically pulls workflow definitions (.yml files) from a specific Git repository.
Development of Kestra workflows is therefore done in Visual Studio Code with the Kestra extension installed, and the produced workflow definitions are pushed to the Git repo.)
- Visual Studio Code
- Kestra extension installed: https://marketplace.visualstudio.com/items?itemName=kestra-io.kestra
- YAML (by Red Hat) extension: https://marketplace.visualstudio.com/items?itemName=redhat.vscode-yaml
- Git
Important

When the Kestra extension is installed, open the command palette in Visual Studio Code (press the F1 function key) and start typing "Kestra". Select the option Kestra: Download Kestra schema when it appears.
Press "Enter" to download the schema from the official URL and wait for the notification about a successful download in the bottom-right corner of Visual Studio Code.
It's recommended to refresh (re-download) this schema periodically to stay aligned with newer versions of Kestra.
When a YAML file (.yml) is created and opened, the selected schema kestra:/flow-schema.json should be shown.
When a .yml file is open (active window) in Visual Studio Code, a purple (Kestra-colored) icon appears on the right side of the row where the tabs are. Clicking this icon opens a side-by-side panel where documentation appears.
This documentation panel changes actively based on the cursor position inside the .yml file: it shows documentation for the Kestra plugin defined on that row.
The documentation panel can also be opened by pressing the F1 function key and selecting the option Kestra: Open Kestra documentation (start typing "Kestra").
Because a dedicated Kestra workflow periodically fetches changes from the Git repository, that workflow has to be established inside Kestra before this periodic operation can be executed.
- It's highly recommended to establish a dedicated instance for each environment (e.g. test, production) in order to have full predictability when developing workflows and when upgrading Kestra itself.
- It's also highly recommended that each environment (and consequently each Kestra instance) is associated with a defined Git branch name (from which workflow definitions are taken/synchronized).
- To make Git operations easier (e.g. merging between branches, ...), it's important to recognize environment-specific values inside workflow definitions (.yml files) and "extract" them to the KV Store (Key-Value Store) and to Secrets (when the values are sensitive to handle).
- It's highly recommended that all workflow definitions are stored inside one folder, organized with a subfolder structure. The default folder name for workflows in Kestra is `_flows`.
- The development team should decide where "technically-related" workflows (such as syncing workflows (.yml files), building OCI images, ...) are stored.
- If the development team decides to leverage the predefined `system` namespace for such workflows, then it's recommended to create a folder named `_flows_system` alongside the `_flows` folder and place technically-related workflows inside it (and reference "system" as the namespace in the .yml files).
  It's important not to forget to write a synchronization workflow for these system workflows (or to add a task to the existing workflow which synchronizes the content-related workflows inside the `_flows` folder).
  If Kestra is used by multiple development teams, be aware that other teams may also create workflows in the `system` namespace; a workflow which synchronizes system workflows therefore SHOULD NOT delete other teams' workflow definitions inside this namespace!
  How the system namespace differs from other namespaces in Kestra: https://kestra.io/docs/concepts/system-flows
- If the development team decides to treat its own system workflows as regular content-related workflows, then it's recommended to create a folder named `system` somewhere inside the `_flows` folder structure and place system-related workflows there (and set the namespace in such workflows correctly).

Important

- When a new Kestra instance is established, manually copy the workflow which executes the Git synchronization and execute it to bootstrap the initial state of workflows from Git.
  (If this workflow has a periodic trigger defined, then automatic workflow synchronization is established.)
- It's also recommended to use entries from the KV Store and Secrets in synchronization-based system workflows, for easier GitOps operations.
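To illustrate the "extract values to the KV Store" recommendation above, a value that differs per environment (such as a branch name) can be replaced by a `kv()` reference, so the same .yml file merges cleanly between environment branches (the key name below is the example used later in this document):

```yaml
# Hardcoded (differs per environment, causes merge conflicts between branches):
branch: "test"

# Extracted to the KV Store (identical .yml file can live on every branch):
branch: "{{ kv('ENVIRONMENT_GIT_BRANCH_NAME', 'system') }}"
```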
When a Kestra instance is established, it's highly recommended to define and set some entries in the KV Store (Key-Value Store).
With the GitOps approach, these entries could be, for example:

- ENVIRONMENT_GIT_BRANCH_NAME: Git branch name associated with the current environment (e.g. "main", "master", "test", "prod", "staging", ...).
  This KV entry is useful when there are multiple phases in software (workflow) development and the development team wants this info stored in one place and referenced from multiple workflows.
  It's recommended to store this Key-Value item inside the `system` namespace for the sake of convention (a "technically-related" KV item).
- ENVIRONMENT_NAME: "Friendly" name for the environment (in case the Git branch name is not good enough, e.g. in some reports, ...).
- ENVIRONMENT_WORKFLOW_SYNC_USER_GIT_USERNAME: Username of the Git user that has access to the repository where workflow definitions are stored.
  Needed only if this Git repo is protected.
  If the development team prefers, the username can also be stored in Secrets.

Those are just examples; the real KV Store items should be defined based on the specific project.
Other example entries: URLs to Git repos, URLs to container images in container registries, ...
It's recommended to use highly descriptive key names in the KV Store to prevent confusion as the number of items grows (consider also DDD naming conventions: https://medium.com/unil-ci-software-engineering/clean-ddd-lessons-project-structure-and-naming-conventions-00d0b9c57610).
When a Kestra instance is established, it's highly recommended to define and set some entries in Secrets.
All recommendations written for the KV Store also apply to Secrets.
The development team should identify which items go to the KV Store and which to Secrets, based on data sensitivity.

Important

Secrets in Kestra behave differently from the KV Store.
KV Store items can also be referenced by naming the namespace in which they are stored (e.g. `{{ kv('MY_KV_STORE_KEY', 'namespace_where_item_is_stored') }}`), which makes a specific KV Store item globally available for referencing.
This is not the case for Secrets: inside a specific workflow, only secrets in the same namespace or in parent namespaces can be accessed.
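A sketch of this scoping difference (flow id, key names, and the use of the Log task are illustrative): a flow in `company.team.demo` can read a KV item stored in `system` by naming that namespace explicitly, while `secret()` takes no namespace argument and only resolves secrets defined in the flow's own namespace or one of its parents:

```yaml
id: secrets_vs_kv_scope_demo
namespace: company.team.demo
tasks:
  - id: show_values
    type: io.kestra.plugin.core.log.Log
    message: |
      KV item (cross-namespace lookup works): {{ kv('MY_KV_STORE_KEY', 'system') }}
      Secret (must exist in company.team.demo or a parent such as company.team): {{ secret('MY_SECRET_KEY') }}
```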
- Start planning the workflow namespace (folder) structure early, because later changes may break workflows (namespace naming changes in subflows, ...).
  For larger projects it's highly recommended to strictly follow DDD naming conventions when defining namespace nesting.
  Following DDD conventions, namespace names should go from general to more specific.
  Example: com.mycompany.department.team. ...
  Defining the namespace structure this way also helps with filtering workflows and other related operations.
- Use subflows (the `io.kestra.plugin.core.flow.Subflow` plugin) and create minimal, fully rounded (content-wise) execution units that can be executed standalone if needed.
  Use these subflows inside tasks of bigger workflows as modular units.
- If possible, pack all automation application logic into an OCI (Docker) image and then use the `io.kestra.plugin.docker.Run` plugin to execute commands inside the container.
  (Build and save OCI images locally - there is no need to push those images to a container registry if the source code and the Dockerfile definition allow rebuilding quickly from scratch.)
  Executing logic inside a container also makes the runtime more predictable and portable (replacing the orchestrator, ...).
- For "serious" workflow and automation development, it's recommended that Kestra is used just as an orchestration and monitoring platform. Use the GitOps-first approach: develop workflows and application logic with other developer tools (e.g. Visual Studio Code) and commit to Git from there.
- Workflows Synchronization from Git

```yaml
id: sync_flows_from_git
namespace: company.team
tasks:
  - id: sync_flows
    type: io.kestra.plugin.git.SyncFlows
    gitDirectory: _flows # optional; set to _flows by default
    targetNamespace: "system" # required
    includeChildNamespaces: true # optional; by default, it's set to false to force explicit definition
    delete: true # optional; by default, it's set to false to avoid destructive behavior
    url: "https://<url_to_git_repository>" # required
    branch: "{{ kv('ENVIRONMENT_GIT_BRANCH_NAME', 'system') }}"
    username: "{{ kv('ENVIRONMENT_WORKFLOW_SYNC_USER_GIT_USERNAME', 'system') }}" # if Git repo is protected
    password: "{{ secret('ENVIRONMENT_WORKFLOW_SYNC_USER_GIT_TOKEN') }}" # if Git repo is protected
    dryRun: false # if true, the task only logs which flows from Git would be added/modified/deleted in Kestra, without making any changes to the Kestra backend
triggers:
  - id: every_two_minutes
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "*/2 * * * *" # every 2 minutes
```
- Build (Create) an OCI (container) image based on the sample Python project

```yaml
id: build_python_oci_image_deploy_multistage
namespace: company.team.system
tasks:
  - id: workspace
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
      - id: generate_build_version
        type: io.kestra.plugin.core.debug.Return
        format: "{{ now() | date('yyyyMMdd-HHmmss') }}"
      - id: clone_repo
        type: io.kestra.plugin.git.Clone
        url: "https://<url_to_git_repository>"
        branch: "{{ kv('ENVIRONMENT_GIT_BRANCH_NAME', 'system') }}"
        username: "{{ kv('ENVIRONMENT_WORKFLOW_SYNC_USER_GIT_USERNAME', 'system') }}" # if Git repo is protected
        password: "{{ secret('ENVIRONMENT_WORKFLOW_SYNC_USER_GIT_TOKEN') }}" # if Git repo is protected
        directory: repo
      - id: get_commit_hash
        type: io.kestra.plugin.scripts.shell.Commands
        containerImage: alpine/git:latest
        commands:
          - |
            cd repo
            HASH=$(git rev-parse HEAD)
            echo $HASH > $HASH
            mv $HASH ..
        outputFiles:
          - "*"
      - id: read_commit_hash
        type: io.kestra.plugin.core.debug.Return
        format: "{{ outputs.get_commit_hash.outputFiles | keys | first }}"
      - id: copy_requirements
        type: io.kestra.plugin.scripts.shell.Commands
        commands:
          - cp repo/requirements.txt repo/docker/deploy
      - id: copy_src
        type: io.kestra.plugin.scripts.shell.Commands
        commands:
          - cp -r repo/src repo/docker/deploy
      - id: build_image
        type: io.kestra.plugin.docker.Build
        dockerfile: repo/docker/deploy/Dockerfile.multistage
        push: false
        tags:
          - "local/project-python-deploy-{{ kv('ENVIRONMENT_GIT_BRANCH_NAME', 'system') }}:latest"
          - "local/project-python-deploy-{{ kv('ENVIRONMENT_GIT_BRANCH_NAME', 'system') }}:{{ outputs.generate_build_version.value }}"
        buildArgs:
          BUILD_VERSION: "{{ outputs.generate_build_version.value }}"
          BASE_BUILD_VERSION: "{{ outputs.generate_build_version.value }}"
          SVC_COMMIT_INFO: "{{ outputs.read_commit_hash.value }}"
      - id: inspect_image_final
        type: io.kestra.plugin.docker.Run
        containerImage: "local/project-python-deploy-{{ kv('ENVIRONMENT_GIT_BRANCH_NAME', 'system') }}:latest"
        commands:
          - sh
          - -c
          - "echo BUILD_VERSION=$BUILD_VERSION && echo BASE_BUILD_VERSION=$BASE_BUILD_VERSION && echo SVC_COMMIT_INFO=$SVC_COMMIT_INFO && ls -R /app"
```
- Example Subflow YAML definition (based on the sample Python project)

Use the additional parameter named `credentials` inside the `io.kestra.plugin.docker.Run` plugin if the container image referenced by the `containerImage` attribute requires authentication to the container registry where the image is stored.
Usually, authentication to a container registry is done via username and password/token. Example of a `credentials` object definition:

```yaml
credentials:
  username: "{{ kv('ENVIRONMENT_CONTAINER_REGISTRY_USER_USERNAME', 'system') }}"
  password: "{{ secret('ENVIRONMENT_CONTAINER_REGISTRY_USER_PWD') }}"
```

Consider "composing" the `containerImage` value from KV Store items to make it globally configurable, for instance when the image is moved to a different location (container registry).
Example:

```yaml
containerImage: "{{ kv('ENVIRONMENT_CONTAINER_REGISTRY_BASE_URL', 'system') }}/project-python-deploy-{{ kv('ENVIRONMENT_GIT_BRANCH_NAME', 'system') }}:latest"
```

```yaml
id: script_demo_subflow_f1
namespace: company.team.demo
tasks:
  - id: wd_01
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
      - id: run_01
        type: io.kestra.plugin.docker.Run
        containerImage: "local/project-python-deploy-{{ kv('ENVIRONMENT_GIT_BRANCH_NAME', 'system') }}:latest"
        commands:
          - python3
          - /app/src/<path_to_file>/<filename>.py
          - --orchestrator
          - KESTRA
```
- Example workflow that demonstrates parallel execution of four subflows

```yaml
id: parallel_four_features
namespace: company.team.demo
tasks:
  - id: all_parallel
    type: io.kestra.plugin.core.flow.Parallel
    tasks:
      - id: run_01
        type: io.kestra.plugin.core.flow.Subflow
        flowId: script_demo_subflow_f1
        namespace: company.team.demo
      - id: run_02
        type: io.kestra.plugin.core.flow.Subflow
        flowId: script_demo_subflow_f2
        namespace: company.team.demo
      - id: run_03
        type: io.kestra.plugin.core.flow.Subflow
        flowId: script_demo_subflow_f3
        namespace: company.team.demo
      - id: run_04
        type: io.kestra.plugin.core.flow.Subflow
        flowId: script_demo_subflow_f4
        namespace: company.team.demo
```
- Example workflow where features F1 and F3 execute in parallel in the first batch, and if both succeed, the next batch with features F2 and F4 is also executed in parallel

```yaml
id: parallel_scripts_in_batches_subtasks
namespace: company.team.demo
tasks:
  - id: sequential_batches
    type: io.kestra.plugin.core.flow.Sequential
    tasks:
      - id: batch_01_03
        type: io.kestra.plugin.core.flow.Parallel
        tasks:
          - id: run_01
            type: io.kestra.plugin.core.flow.Subflow
            flowId: script_demo_subflow_f1
            namespace: company.team.demo
          - id: run_03
            type: io.kestra.plugin.core.flow.Subflow
            flowId: script_demo_subflow_f3
            namespace: company.team.demo
      - id: batch_02_04
        type: io.kestra.plugin.core.flow.Parallel
        tasks:
          - id: run_02
            type: io.kestra.plugin.core.flow.Subflow
            flowId: script_demo_subflow_f2
            namespace: company.team.demo
          - id: run_04
            type: io.kestra.plugin.core.flow.Subflow
            flowId: script_demo_subflow_f4
            namespace: company.team.demo
```
- Example of how to trigger another workflow when the first one finishes (with conditions)

(The definition below should be placed inside the second workflow (the workflow which is called) as one of its triggers.)
With the states listed below, the second workflow is triggered even if the first workflow fails.

```yaml
triggers:
  - id: trigger_second_flow
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      - type: io.kestra.plugin.core.condition.ExecutionFlow
        namespace: company.team.demo
        flowId: first_workflow_id
        states:
          - SUCCESS
          - FAILED
          - WARNING
```
- Example of how to set and pass environment variables inside the `io.kestra.plugin.docker.Run` plugin for (sub)workflows

(This is a good solution for passing secrets as runtime-defined variables from the Kestra Secrets store, i.e. variables that are not "baked into" the OCI (Docker) image.)

```yaml
id: script_demo_passing_environment_variables
namespace: company.team.demo
tasks:
  - id: wd_01
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
      - id: run_01
        type: io.kestra.plugin.docker.Run
        containerImage: "local/project-python-deploy-{{ kv('ENVIRONMENT_GIT_BRANCH_NAME', 'system') }}:latest"
        env:
          MY_ENV_VAL1: "{{ kv('MY_DESIRED_KEY', 'system') }}"
          MY_ENV_VAL2: "{{ secret('MY_DESIRED_SECRET_KEY') }}"
        commands:
          - python3
          - /app/src/<path_to_file>/<filename>.py
          - --orchestrator
          - KESTRA
```

(In programming languages, these environment variables can be read by key (for this example, MY_ENV_VAL1 and MY_ENV_VAL2).)
Python
- https://docs.python.org/3/library/os.html#os.getenv
- https://docs.python.org/3/library/os.html#os.environ
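A minimal sketch of reading the variables from the example above in Python; the variable names match the `env` block, while the fallback default is an assumption for illustration:

```python
import os

# Read an environment variable set via io.kestra.plugin.docker.Run's `env` block;
# os.getenv returns the given default (or None) when the key is missing.
val1 = os.getenv("MY_ENV_VAL1", "default-value")

# os.environ[...] raises KeyError when the key is missing, which fails fast
# for required configuration such as secrets.
try:
    val2 = os.environ["MY_ENV_VAL2"]
except KeyError:
    val2 = None  # handle the missing secret explicitly

print(val1, val2)
```

`os.getenv` suits optional settings with sensible defaults; `os.environ[...]` suits values (like secrets) whose absence should stop the run immediately.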
PowerShell