Commit 56afad7

Extend docs on workflows (#206)
* Add explainer and examples on the role of workflow2

Co-authored-by: ndaelman <ndaelman@physik.hu-berlin.de>
1 parent 638843d commit 56afad7

1 file changed

+53
-0
lines changed

docs/general.md

Lines changed: 53 additions & 0 deletions
@@ -110,3 +110,56 @@ class SUPERCODEParser:
# append `Simulation` as an `archive.data` section
archive.data.append(simulation)
```

## Homogenization and the role of Workflows

Workflows are a device for annotating the structure of a set of entries in a standardized way.
A community can define a workflow schema, i.e. its standout sections, without any knowledge of the underlying entries.
As such, workflows define a homogenized data format with rich semantics that acts as the starting point from which to explore the dataset.
The actual workflow entry instance, meanwhile, handles the coordination between tasks and the underlying data.
This mapping may be a mixture of (workflow) entries and sections.

Below are a few examples of actual mappings.
Note that these examples already presuppose a certain structure on the side of the referenced `archive.data` sections.
In reality, a workflow should be capable of hosting multiple underlying structures.

### Serial Updates to `ModelSystem`

Geometry optimizations or molecular dynamics simulations, for example, typically store their data in a single entry.
Tasks trace the updates to the system, i.e. calculation or time steps, respectively.
The actual modifications made at these steps are stored in the `model_system` and `outputs` sections.
<!-- @JFRudzinski pls double-check the order of the arrows -->

```
entry_x#workflow2.task[0].input -> entry_y1#data.model_system[0]
entry_x#workflow2.task[0].output -> entry_y2#data.outputs[0]
...
entry_x#workflow2.task[n].input -> entry_y1#data.model_system[n]
entry_x#workflow2.task[n].output -> entry_y2#data.outputs[n]
```

Note that `entry_x`, `entry_y1`, `entry_y2` will typically refer to the same entry, though this isn't a hard requirement.
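
The serial pattern can be enumerated programmatically. The following is a minimal sketch that only builds the reference strings in the notation used above; it does not use the actual schema classes. The `Link` dataclass, the `serial_links` helper and the default entry names are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """Hypothetical stand-in for one `source -> target` reference."""

    source: str
    target: str

    def __str__(self) -> str:
        return f"{self.source} -> {self.target}"


def serial_links(
    n_steps: int,
    workflow_entry: str = "entry_x",  # assumed entry names, as in the text
    system_entry: str = "entry_y1",
    outputs_entry: str = "entry_y2",
) -> list[Link]:
    """Pair task i with model_system[i] (input) and outputs[i] (output)."""
    links = []
    for i in range(n_steps):
        task = f"{workflow_entry}#workflow2.task[{i}]"
        links.append(Link(f"{task}.input", f"{system_entry}#data.model_system[{i}]"))
        links.append(Link(f"{task}.output", f"{outputs_entry}#data.outputs[{i}]"))
    return links


for link in serial_links(2):
    print(link)
```

Passing the same name for all three entry arguments reproduces the common case in which the workflow and its data live in one entry.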

#### Including SCF steps

!!! warning "Under construction"
    This section will be updated once its schema side is settled.

### Single-Point SCF Workflow

In a single-point calculation, the emphasis lies on the relaxation of the electronic structure.
Here, each `outputs` section can be dedicated entirely to following the SCF cycle.

```
entry_x#workflow2.task[0].output -> entry_y#data.outputs[0]
```

### Multi-step Electronic Workflows

There are two main design choices here:

1. The methodology and computed outputs are split into separate entries along major subroutines.
2. They are kept in a single entry. This is especially useful for legacy cases, where some subroutines were originally not distinguished.

In option 2, for any workflows that are not simply _serial_, there is no canonical way of ordering `outputs`.
That burden falls on `workflow2.tasks`.
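
To illustrate why the ordering has to live with the tasks, here is a minimal sketch of a branching (non-serial) workflow stored in a single entry. The task names (`scf`, `bands`, `dos`), the dependency graph and the `outputs` indices are all hypothetical; only the standard-library `graphlib.TopologicalSorter` is a real API. The flat `outputs` list carries no ordering information, so the task graph supplies it.

```python
from graphlib import TopologicalSorter

# Hypothetical positions of each subroutine's section in `data.outputs`.
# The list order is arbitrary and says nothing about execution order.
outputs_index = {"dos": 0, "scf": 1, "bands": 2}

# Hypothetical task graph: each task maps to the tasks it depends on.
# `bands` and `dos` both branch off `scf`, so the workflow is not serial.
depends_on = {"scf": set(), "bands": {"scf"}, "dos": {"scf"}}


def ordered_links(depends_on, outputs_index, entry="entry_x"):
    """List task -> outputs references in an order consistent with the task graph."""
    order = list(TopologicalSorter(depends_on).static_order())
    return [
        f"{entry}#workflow2.task[{i}].output -> {entry}#data.outputs[{outputs_index[name]}]"
        for i, name in enumerate(order)
    ]


for line in ordered_links(depends_on, outputs_index):
    print(line)
```

The first reference always points at the `scf` section, wherever it sits in `outputs`; the relative order of the two branches is not fixed by the graph.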
