|
| 1 | + |
| 2 | +.. _workflows: |
| 3 | + |
| 4 | +######### |
| 5 | +Workflows |
| 6 | +######### |
| 7 | + |
| 8 | + |
| 9 | +Flux is designed for scientific :term:`workflows <workflow>`. |
| 10 | +A workflow naturally maps to a Flux instance, with orchestration of its jobs |
| 11 | +by its :term:`initial program`. Since a Flux instance is also a job, Flux |
| 12 | +workflows are highly composable. Because Flux can run as a parallel job under |
| 13 | +other resource managers such as Slurm, the same Flux workflow can be moved |
| 14 | +between systems without being tied to the native resource manager. |
| 15 | + |
| 16 | +Workflow orchestration can be a simple batch script that runs jobs in |
| 17 | +sequence or with simple job dependencies, or it can be a more sophisticated |
| 18 | +workflow application like |
| 19 | +`Maestro <https://maestrowf.readthedocs.io/en/latest/Maestro/index.html>`_. |
| 20 | + |
| 21 | +A workflow orchestrator can make full use of Flux's distributed services |
| 22 | +such as its key-value store. |
| 23 | + |
| 24 | +************** |
| 25 | +KVS Guidelines |
| 26 | +************** |
| 27 | + |
| 28 | +The Flux KVS provides a distributed, *eventually consistent*, persistent |
| 29 | +data store. It supports atomic commits and synchronization via messages. |
| 30 | +The :doc:`kvs` design document describes it in more detail. |
| 31 | + |
| 32 | +The KVS is highly scalable for some use cases, such as sharing data with many |
| 33 | +processes in a large parallel job. However, it is not always the preferred |
| 34 | +way to store workflow data. The following guidelines may be helpful for |
| 35 | +understanding how to use it effectively. |
| 36 | + |
| 37 | +.. note:: |
| 38 | + |
| 39 | + This section is under construction and is currently sparse. |
| 40 | + For now, please open an issue or discussion in the flux-core GitHub |
| 41 | + repository if you have specific KVS questions or would like to discuss |
| 42 | + your workflow storage and synchronization requirements. |
| 43 | + |
| 44 | +- The default location for the KVS backing store is ``/tmp`` on the first |
| 45 | + node of the Flux instance's allocation. On some systems, this may be |
| 46 | + a ramdisk with limited space. See the description of ``statedir`` in |
| 47 | + :man7:`flux-broker-attributes` for information on redirecting the KVS backing |
| 48 | + store to another location. |
| 49 | + |
| 50 | +- By default, a batch job's KVS content is cleaned up when the instance |
| 51 | + terminates. Use the :option:`flux batch --dump` option to preserve KVS |
| 52 | + content. |
| 53 | + |
| 54 | +- Job data is stored in the KVS under the ``job`` directory. The jobs |
| 55 | + themselves may store data in the KVS directory assigned to their job. |
| 56 | + This convention is described in :doc:`rfc:spec_16`. |
| 57 | + |
| 58 | +- KVS backing storage requirements grow substantially if there is significant |
| 59 | + churn in content, since every change percolates to the root of the hash |
| 60 | + tree and all versions of the tree are preserved. Rewriting a key many |
| 61 | + times during execution may be considered an anti-pattern. |
| 62 | + |
| 63 | +- Keys with very large values can cause head-of-line blocking in the broker, |
| 64 | + where the transfer of a large KVS message through a shared channel |
| 65 | + delays other messages. A parallel file system is a better place to |
| 66 | + store large data sets. |
| 67 | + |
| 68 | +- :man1:`flux-archive` may be helpful in some use cases, especially where |
| 69 | + the data becomes input to subsequent jobs, because the ``stage-in`` |
| 70 | + job shell plugin can then be used. |
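
The guideline above about content churn can be illustrated with a small, self-contained model. The sketch below is not Flux code and does not use the Flux API. ``ToyHashTree`` and its methods are hypothetical names for a toy content-addressed tree that, like the behavior described above, rewrites every directory on the path to the root for each change and never discards old objects.

```python
import hashlib
import json


class ToyHashTree:
    """Toy content-addressed tree (an assumption for illustration, not the Flux KVS)."""

    def __init__(self):
        self.store = {}            # hash -> serialized object; nothing is ever deleted
        self.root = self._put({})  # reference to the initial, empty root directory

    def _put(self, obj):
        # Serialize deterministically and store the object under its SHA-1 hash.
        blob = json.dumps(obj, sort_keys=True).encode()
        ref = hashlib.sha1(blob).hexdigest()
        self.store[ref] = blob
        return ref

    def _get(self, ref):
        return json.loads(self.store[ref])

    def _set(self, dir_ref, parts, value):
        # Copy the directory, update one entry, and store the copy as a new object.
        directory = self._get(dir_ref) if dir_ref else {}
        if len(parts) == 1:
            directory[parts[0]] = self._put({"value": value})
        else:
            child = directory.get(parts[0])
            directory[parts[0]] = self._set(child, parts[1:], value)
        return self._put(directory)

    def commit(self, key, value):
        # Each commit produces a new root; older roots stay in the store.
        self.root = self._set(self.root, key.split("."), value)

    def stored_bytes(self):
        return sum(len(blob) for blob in self.store.values())


# Anti-pattern: rewrite the same key on every step.
churn = ToyHashTree()
for step in range(1000):
    churn.commit("a.b.c.progress", step)

# Alternative: write the final value once.
once = ToyHashTree()
once.commit("a.b.c.progress", 999)

print("objects stored:", len(churn.store), "vs", len(once.store))
print("bytes stored:  ", churn.stored_bytes(), "vs", once.stored_bytes())
print("same final tree:", churn.root == once.root)
```

In this toy model, each rewrite of the four-component key adds five new objects (one value plus four directories up to and including the root), even though both trees end with identical content. The real KVS differs in its object format and storage details, so treat the numbers only as an indication of the trend.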