Event system

Alan B. Christie edited this page Aug 21, 2025 · 6 revisions

The workflow engine is distributed via PyPI. The DM uses it by importing the WorkflowEngine class: -

from workflow.workflow_engine import WorkflowEngine

The DM is responsible for providing an API adapter that the engine can call to access the DM database, and an InstanceLauncher that will launch Step Jobs (Instances). The DM provides these objects when instantiating the engine, exposed here as the variables _DataManagerWorkflowAPIAdapter and _DataManagerInstanceLauncher: -

_WorkflowEngine: WorkflowEngine = WorkflowEngine(
    wapi_adapter=_DataManagerWorkflowAPIAdapter,
    instance_launcher=_DataManagerInstanceLauncher,
)

Once initialised, the engine must respond to Events (messages) that the DM delivers by calling the engine's handle_message() method. This method must not block, and must complete with minimal delay.

Events

Events sent to the workflow engine are Protocol Buffer objects (proto3). For details of the protocol buffer framework refer to the ProtoDev documentation. The protocol buffers used by the DM (and AS) can be found in our squonk2-protobuf repository, which is distributed as a package on PyPI.

Only two protocol buffer message types are sent to the engine from the DM: -

  • datamanager.WorkflowMessage
  • datamanager.PodMessage
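
As a minimal sketch of how a handler might tell the two message types apart, the following uses stand-in dataclasses rather than the real generated protobuf classes. The fields mirror those described on this page, and classify_message is a hypothetical helper name, not part of the engine:

```python
from dataclasses import dataclass


# Stand-ins for the generated protobuf classes (assumption: the real
# classes come from the squonk2-protobuf package and carry more fields).
@dataclass
class WorkflowMessage:
    action: str = ""
    running_workflow: str = ""


@dataclass
class PodMessage:
    has_exit_code: bool = False
    exit_code: int = 0
    instance: str = ""


def classify_message(msg: object) -> str:
    """Return a label for the kind of message received (hypothetical helper)."""
    if isinstance(msg, WorkflowMessage):
        return "workflow"
    if isinstance(msg, PodMessage):
        return "pod"
    # Anything else is unexpected: the DM only sends the two types above.
    return "unknown"
```

Returning a label for unexpected types, rather than raising, is just one choice for this sketch. The real engine may treat unknown messages differently.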

WorkflowMessage

This message is sent by the Data Manager in response to an API call to either start or stop a Workflow. The DM is responsible for creating the RunningWorkflow database record before sending the message. The content of a typical start message, carrying the UUID of the running workflow to start, will look like this:

action = "START"
running_workflow = "r-workflow-00000000-0000-0000-0000-000000000000"

A stop message's action will be "STOP".
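
The two actions could be distinguished along the following lines. This is a sketch only: WorkflowMessage here is a stand-in dataclass with just the two fields shown above, and describe_workflow_message is a hypothetical helper, not the engine's own code:

```python
from dataclasses import dataclass


@dataclass
class WorkflowMessage:
    """Stand-in for datamanager.WorkflowMessage (only the fields shown above)."""
    action: str
    running_workflow: str


def describe_workflow_message(msg: WorkflowMessage) -> str:
    """Hypothetical helper: decide what the engine should do with the message."""
    if msg.action == "START":
        return f"start {msg.running_workflow}"
    if msg.action == "STOP":
        return f"stop {msg.running_workflow}"
    # Assumption: any other action is treated as an error in this sketch.
    raise ValueError(f"Unsupported action: {msg.action!r}")


start = WorkflowMessage(
    action="START",
    running_workflow="r-workflow-00000000-0000-0000-0000-000000000000",
)
print(describe_workflow_message(start))
```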

PodMessage

This message is sent by the DM when an Instance Pod, one that is part of a Workflow Step, completes, successfully or otherwise. Salient parts of the protocol buffer message are:

has_exit_code = (True if there is an exit code, ignore the message when False)
exit_code = (0 on success, any other value indicates failure)
instance = (UUID of the Data Manager Step/Job Instance)

Using the instance property (a UUID string identifying the Step's Instance) the engine can retrieve the database Instance object, which will contain a reference to the RunningWorkflowStep record. Using this, the engine should be able to advance the workflow, starting the next step or terminating the running workflow.
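
The handling described above might be sketched as follows. PodMessage is a stand-in dataclass, the dictionary is a hypothetical in-memory substitute for the database lookup normally made through the API adapter, and the identifiers in it are invented for illustration:

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PodMessage:
    """Stand-in for datamanager.PodMessage (only the salient fields)."""
    has_exit_code: bool
    exit_code: int
    instance: str


# Hypothetical substitute for the DM database:
# Instance UUID -> RunningWorkflowStep identity (both values invented).
INSTANCE_TO_STEP = {"instance-0001": "r-workflow-step-0001"}


def handle_pod_message(msg: PodMessage) -> Optional[Tuple[str, bool]]:
    """Return (step identity, succeeded), or None if the message is ignored."""
    if not msg.has_exit_code:
        # Without an exit code the message is ignored.
        return None
    step = INSTANCE_TO_STEP.get(msg.instance)
    if step is None:
        # Assumption: Instances with no known step are ignored in this sketch.
        return None
    # An exit code of 0 means success; any other value indicates failure.
    return step, msg.exit_code == 0
```

With the step identity and its outcome in hand, the caller can decide whether to start the next step or terminate the running workflow.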

Basic operation

The following simplified message sequence diagram illustrates the engine's basic operation. For clarity some components are not shown. These include the RabbitMQ messaging service, Job operator, and Kubernetes. Don't worry if you do not understand the role of these missing services - they're not needed for this topic.

A

A user "runs" a workflow using the DM UI or its REST API. After checking the legitimacy of the request, the API creates a new RunningWorkflow record and then initiates the workflow by dispatching a WorkflowMessage START containing the record identity. The API then returns a CREATED response to the caller, which also gives the user the running workflow record ID. The message is received by the DM Protocol Buffer Consumer (PBC) Pod, which passes it to an instance of the WorkflowEngine class via its handle_message() method. All messages sent to the workflow engine come from the PBC.

B

The workflow engine handles this START message by using the Workflow API object (given to the workflow engine, along with an instance launcher, during the engine's initialisation by the PBC) to obtain the RunningWorkflow record to _

D