Workflow definitions (2.0)

Workflows are defined using YAML. The schema for a workflow can be found in the engine's repository (workflow/workflow-schema.yaml). Editors that support YAML schemas can load it from a suitable URL. As an example, here's the URL for this version: -

https://raw.githubusercontent.com/InformaticsMatters/squonk2-data-manager-workflow-engine/refs/heads/2.0/workflow/workflow-schema.yaml
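How the schema is attached depends on the editor. As a minimal sketch, assuming your editor uses the Red Hat YAML language server (an assumption, other editors have their own mechanisms), a modeline comment near the top of the workflow file points the editor at the URL above: -

```yaml
---
# Assumes an editor backed by yaml-language-server; the URL is the 2.0 schema given above
# yaml-language-server: $schema=https://raw.githubusercontent.com/InformaticsMatters/squonk2-data-manager-workflow-engine/refs/heads/2.0/workflow/workflow-schema.yaml
kind: DataManagerWorkflow
kind-version: "2025.2"
name: nop-fail
```

This fragment contains only the information properties, so the editor will probably report missing properties until the rest of the definition is added.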

What follows is a discussion of workflows for version 2.0 of the workflow engine. You can find a number of example workflows in the engine repository's test/workflow-definitions directory.

In general, a workflow definition requires a declaration of the following: -

Information
Variables
Variable Mapping
Steps

Information

All workflows need an information section, typically found at the start of the definition. The supported root-level properties are described in the following table: -

Property	Type	Description
kind	String	A constant that must be set to `DataManagerWorkflow`. The value will not change in future engine versions
kind-version	String	An enum string value that needs to be set to `2025.2` for version 2.0
name	String	The workflow name. An RFC1035 label name compliant string. Essentially up to 64 lower-case letters, digits and hyphens (not ending with `-`)
description	String	An optional free-format text property that provides the user with a high-level description of the workflow

Here's an example: -

---
kind: DataManagerWorkflow
kind-version: "2025.2"
name: nop-fail
description: >-
  A workflow with one step that fails

Variables

The optional (although almost always present) root-level `variables` property declares all the inputs, outputs, and options a user is expected to provide when they execute the workflow.

The workflow's schema does not cover this section. Its structure is defined by the Data Manager Job definitions schema located in our squonk2-data-manager-job-decoder repository. The workflow engine currently does not use this section. It is present to simplify UI development by allowing the UI to reuse the variable logic it already uses when launching Jobs.

Variable Mapping

A workflow that contains variables (and in practice all of them do) must contain a root-level `variable-mapping` property. This is a structure that defines the mapping of workflow variables to the variables of the individual Jobs within the workflow. There are sections for inputs, outputs, and options.

Inputs

We refer to inputs when we discuss files expected by Jobs.

`inputs` is an array that simply identifies the input variable names used somewhere within the workflow. It may seem to duplicate values from the variables section. It's a temporary structure in 2.0 to simplify UI development by separating workflow semantics from Job execution.

Here's an example: -

variable-mapping:
  inputs:
  - name: input-1.sdf
  - name: input-2.sdf

Outputs

We refer to outputs when we discuss files created by Jobs.

The outputs section of the variable-mapping block is an array that declares all the outputs of the workflow. These will typically be the outputs of one of the Jobs the workflow executes. Each output declaration contains the name of a workflow output variable and the Job and output variable that is its source.

Jobs in a workflow are defined in a step, which will be covered in the steps section below.

In the following example the workflow declares two outputs: output-a will be a copy of the file named in the Job variable output-file used in step-4, and output-b will be a copy of the file named in the Job variable output-file used in step-5: -

variable-mapping:
  outputs:
  - name: output-a
    from:
      step: step-4
      output: output-file
  - name: output-b
    from:
      step: step-5
      output: output-file

The name of a workflow output file is set by the variable in the variables section whose name is provided as the variable-mapping->outputs->name value.

Warning

The workflow engine does not currently read the variables section, so it cannot yet satisfy this part of the design. Instead, the name of the file will be the value of the corresponding step output variable.

When a workflow finishes (successfully) the outputs are copied (sym-linked) to the Project directory the workflow is executed in. Files not declared as outputs are left in their Job's execution directory, which is separate from the Project directory. This behaviour is the responsibility of the Data Manager.
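To show how the pieces fit together, here is a sketch of a single variable-mapping block that combines the inputs and outputs examples above. It assumes the workflow defines steps named step-4 and step-5, each running a Job with an output variable called output-file, and it leaves out the options section: -

```yaml
variable-mapping:
  # Input variable names used somewhere within the workflow
  inputs:
  - name: input-1.sdf
  - name: input-2.sdf
  # Each workflow output names the step and Job output variable that is its source
  outputs:
  - name: output-a
    from:
      step: step-4        # assumed step name
      output: output-file
  - name: output-b
    from:
      step: step-5        # assumed step name
      output: output-file
```

With this mapping, only the files behind output-a and output-b would be copied to the Project directory when the workflow succeeds. Any other files the steps create stay in their Job execution directories.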
