Accessing files
Workflow Steps are simply Jobs. Steps, like Jobs, run in a subdirectory of a Project. Neither Jobs nor Steps have general access to Project files.
If a Step needs to access Project files, or files generated by other Steps, it declares this need in the Step's "plumbing".
For a Step to access a Project file, two declarations are required.
- The Workflow must declare an input Workflow Variable. When a user runs the Workflow they are required to provide a value for the input: a file in the Project volume. This is enforced by the DM, so the Workflow Engine can assume that all input variables have been declared.
- Any Step that wishes to use this named file must declare it in its plumbing.
Here's an example of a Workflow input variable: -
variables:
  inputs:
    type: object
    properties:
      candidateMolecules:
        title: Molecules
        type: file
When the user runs the workflow they will be required to provide a value for the variable candidateMolecules, the name of a file in the Project.
Project files are not presented to every step, so a step that expects a project file must also declare this. This is done in the plumbing section, illustrated in this workflow excerpt: -
step:
- name: step-1
  plumbing:
  - variable: inputFile
    from-workflow:
      variable: candidateMolecules
The plumbing reveals two important facts about the step: -
- The step's Job has a variable called inputFile
- The step expects the inputFile value to be set to the value of the workflow variable candidateMolecules
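The mapping described by the plumbing can be sketched in a few lines of Python. This is only an illustration, not the Workflow Engine's actual code: the function name resolve_from_workflow, the dictionary layout (a direct transcription of the YAML excerpt) and the file name molecules.sdf are all assumptions made for the example.

```python
from typing import Any


def resolve_from_workflow(
    step: dict[str, Any], workflow_inputs: dict[str, str]
) -> dict[str, str]:
    """Map each plumbed Job variable to the value of the workflow variable it names.

    Hypothetical helper: it only handles 'from-workflow' plumbing entries.
    """
    resolved: dict[str, str] = {}
    for entry in step.get("plumbing", []):
        source = entry.get("from-workflow")
        if source is None:
            continue  # other plumbing kinds are out of scope for this sketch
        workflow_variable = source["variable"]
        if workflow_variable not in workflow_inputs:
            raise KeyError(f"workflow variable '{workflow_variable}' has no value")
        resolved[entry["variable"]] = workflow_inputs[workflow_variable]
    return resolved


# The step from the excerpt above, written as a Python dictionary.
step_1 = {
    "name": "step-1",
    "plumbing": [
        {"variable": "inputFile", "from-workflow": {"variable": "candidateMolecules"}},
    ],
}

# The value the user supplied when running the workflow (an assumed file name).
job_variables = resolve_from_workflow(step_1, {"candidateMolecules": "molecules.sdf"})
print(job_variables)  # {'inputFile': 'molecules.sdf'}
```

The step's Job never sees the name candidateMolecules. It only receives inputFile, already set to the file the user chose.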
Steps can also use files that are expected to have been created by other (prior) steps. The step's plumbing is used to declare this relationship.
In the following workflow excerpt, step-2 uses a file from step-1 whose name is expected to be in step-1's outputFile variable. The from-step declaration tells the workflow which variable to use and the name of the step to get it from.
- name: step-2
  plumbing:
  - variable: inputFile
    from-step:
      name: step-1
      variable: outputFile
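A from-step declaration can be resolved in a similar way, by looking up the named variable in the variables of the prior step. The sketch below is a minimal illustration under stated assumptions: resolve_from_step is a hypothetical helper, the prior step's variables are assumed to be available as a plain dictionary keyed by step name, and filtered.sdf is an invented file name.

```python
from typing import Any


def resolve_from_step(
    step: dict[str, Any], prior_step_variables: dict[str, dict[str, str]]
) -> dict[str, str]:
    """Map each plumbed Job variable to a variable value taken from a prior step.

    Hypothetical helper: it only handles 'from-step' plumbing entries.
    """
    resolved: dict[str, str] = {}
    for entry in step.get("plumbing", []):
        source = entry.get("from-step")
        if source is None:
            continue  # not a step-to-step declaration
        prior_name = source["name"]
        if prior_name not in prior_step_variables:
            raise KeyError(f"step '{prior_name}' has not run or is unknown")
        # The value is the file name held in the prior step's variable.
        resolved[entry["variable"]] = prior_step_variables[prior_name][source["variable"]]
    return resolved


# step-2 from the excerpt above, written as a Python dictionary.
step_2 = {
    "name": "step-2",
    "plumbing": [
        {"variable": "inputFile", "from-step": {"name": "step-1", "variable": "outputFile"}},
    ],
}

# Variables of the steps that have already run (assumed values).
completed = {"step-1": {"outputFile": "filtered.sdf"}}

step_2_variables = resolve_from_step(step_2, completed)
print(step_2_variables)  # {'inputFile': 'filtered.sdf'}
```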
What process puts files into a step's instance directory?
It is the DM that places files into a step's instance directory, with help from the engine. The engine passes the values of the selected workflow files, and step instances, in a LaunchParameters object that is passed to the InstanceLauncher launch() command. Files are not copied. Dependent files (or directories) are hard-linked into a step's instance directory, illustrated by the following diagram: -

- Project files are hard-linked into the step's instance directory
- Prior step files are made available by linking the prior step's entire instance directory
By hard-linking, the DM saves on file space. Because these are input files (or directories), the step is not expected to modify them, but the file system does not prevent it. Modifying an input file is generally discouraged as a pattern, so the user must understand the consequences of doing so, and may even need to lock the file should multiple steps want to access it concurrently.
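The consequence of hard-linking can be demonstrated with Python's standard library. This is an illustrative sketch only, not the DM's implementation: the directory layout and file names are invented, and the example simply shows that a write made through the link in an instance directory is visible in the Project file, because both names refer to the same file.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    # Assumed layout: a project directory with one step instance directory inside it.
    project = Path(root) / "project"
    instance = project / "instance-1"
    instance.mkdir(parents=True)

    # A Project file, as it might exist before the workflow runs.
    original = project / "molecules.sdf"
    original.write_text("original\n")

    # Hard-link the Project file into the instance directory (no data is copied).
    linked = instance / "molecules.sdf"
    os.link(original, linked)

    # Both paths now refer to the same underlying file.
    same_file = os.path.samefile(original, linked)

    # A step that appends to its "input" also changes the Project file.
    with linked.open("a") as handle:
        handle.write("modified by step\n")

    project_content = original.read_text()

print(same_file)
print(project_content)
```

A step that needs a private, writable version of an input can copy it first (for example with shutil.copy) and work on the copy, leaving the linked file untouched.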