
Commit 322f175

Merge pull request #781 from nipype/doc-string-update
Doc string and glossary update

2 parents e999fb1 + c3bb64b

File tree

7 files changed: +56 −211 lines

docs/source/index.rst

Lines changed: 4 additions & 3 deletions

@@ -27,11 +27,12 @@ Installation
 
 Pydra is implemented purely in Python and has a small number of dependencies.
 It is easy to install via pip for Python >= 3.11 (preferably within a
-`virtual environment`_):
+`virtual environment`_). To get the latest version you will need to explicitly specify
+a version greater than or equal to 1.0a, otherwise PyPI will install the last 0.* version:
 
 .. code-block:: bash
 
-   $ pip install pydra
+   $ pip install "pydra>=1.0a"
 
 Pre-designed tasks are available under the `pydra.tasks.*` namespace. These tasks
 are typically implemented within separate packages that are specific to a given
@@ -41,7 +42,7 @@ ANTs_ (*pydra-ants*), or a collection of related tasks/workflows, such as Niwork
 
 .. code-block:: bash
 
-   $ pip install pydra-fsl pydra-ants
+   $ pip install pydra-tasks-fsl pydra-tasks-ants
 
 Of course, if you use Pydra to execute commands within non-Python toolkits, you will
 need to either have those commands installed on the execution machine, or use containers
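One detail worth noting about the new install command: in a POSIX shell an unquoted `>` is parsed as output redirection, so `pip install pydra>=1.0a` would actually run `pip install pydra` and create a stray file named `=1.0a`. A quick sanity check, using `printf` as a harmless stand-in for `pip` (nothing here installs anything):

```shell
# Unquoted: the shell splits at '>', redirecting stdout to a file '=1.0a'.
printf '%s\n' pydra>=1.0a   # the word "pydra" ends up in the file, not on stdout
cat '=1.0a'                 # prints: pydra
rm '=1.0a'

# Quoted: the whole version specifier is passed as a single argument.
printf '%s\n' "pydra>=1.0a" # prints: pydra>=1.0a
```

For this reason version specifiers containing `>`, `<`, or `!` should always be quoted on the pip command line.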

docs/source/reference/glossary.rst

Lines changed: 38 additions & 30 deletions

@@ -4,62 +4,69 @@ Glossary
 .. glossary::
 
     Cache-root
-        The directory where cache directories for tasks to be executed are created.
-        Task cache directories are named within the cache root directory using a hash
-        of the task's parameters, so that the same task with the same parameters can be
-        reused.
+        The root directory in which separate cache directories for each job are created.
+        Job cache directories are named within the cache-root directory using a unique
+        checksum for the job based on the task's parameters and software environment,
+        so that if the same job is run again the outputs from the previous run can be
+        reused.
 
     Combiner
         A combiner is used to combine :ref:`State-array` values created by a split operation
         defined by a :ref:`Splitter` on the current node, upstream workflow nodes or
         stand-alone tasks.
 
     Container-ndim
-        The number of dimensions of the container object to be iterated over when using
-        a :ref:`Splitter` to split over an iterable value. For example, a list-of-lists
-        or a 2D array with `container_ndim=2` would be split over the elements of the
-        inner lists into a single 1-D state array. However, if `container_ndim=1`,
-        the outer list/2D would be split into a 1-D state array of lists/1D arrays.
+        The number of dimensions of the container object to be flattened into a single
+        state array when splitting over nested containers/multi-dimensional arrays.
+        For example, with `container_ndim=1` a list-of-lists-of-floats or a 2D numpy
+        array would be split into a 1-D state array consisting of lists-of-floats or
+        1D numpy arrays, respectively. Whereas with `container_ndim=2` they would be
+        split into a state array of floats consisting of all the elements of the
+        inner lists/array.
 
     Environment
         An environment refers to a specific software encapsulation, such as a Docker
-        or Singularity image, that is used to run a task.
+        or Singularity image, in which shell tasks are run. They are specified in the
+        Submitter object to be used when executing a task.
 
     Field
-        A field is a parameter of a task, or a task outputs object, that can be set to
-        a specific value. Fields are specified to be of any types, including objects
-        and file-system objects.
+        A field is a parameter of a task, or an output in a task outputs class.
+        Fields define the expected datatype of the parameter and other metadata
+        parameters that control how the field is validated and passed through to the
+        execution of the task.
 
     Hook
-        A hook is a user-defined function that is executed at a specific point in the task
-        execution process. Hooks can be used to prepare/finalise the task cache directory
+        A hook is a user-defined function that is executed at a specific point either before
+        or after a task is run. Hooks can be used to prepare/finalise the task cache directory
         or send notifications
 
     Job
-        A job is a discrete unit of work, a :ref:`Task`, with all inputs resolved
-        (i.e. not lazy-values or state-arrays) that has been assigned to a worker.
-        A task describes "what" is to be done and a submitter object describes
-        "how" it is to be done, a job combines both objects to describe a concrete unit
-        of processing.
+        A job consists of a :ref:`Task` with all inputs resolved
+        (i.e. not lazy-values or state-arrays) and a Submitter object. It therefore
+        represents a concrete unit of work to be executed, combining "what" is to be
+        done (Task) with "how" it is to be done (Submitter).
 
     Lazy-fields
         A lazy-field is a field that is not immediately resolved to a value. Instead,
-        it is a placeholder that will be resolved at runtime, allowing for dynamic
-        parameterisation of tasks.
+        it is a placeholder that will be resolved at runtime when a workflow is executed,
+        allowing for dynamic parameterisation of tasks.
 
     Node
-        A single task within the context of a workflow, which is assigned a name and
-        references a state. Note this task can be nested workflow task.
+        A single task within the context of a workflow. It is assigned a unique name
+        within the workflow and references a state object that determines the
+        state-array of jobs to be run, if present (if the state is None then a single
+        job will be run for the node).
 
     Read-only-caches
         A read-only cache is a cache root directory that was created by a previous
-        pydra runs, which is checked for matching task caches to be reused if present
-        but not written not modified during the execution of a task.
+        pydra run. The read-only caches are checked for matching job checksums, which
+        are reused if present. However, new job cache dirs are written to the cache root,
+        so the read-only caches are not modified during the execution.
 
     State
         The combination of all upstream splits and combines with any splitters and
-        combiners for a given node, it is used to track how many jobs, and their
-        parameterisations, need to be run for a given workflow node.
+        combiners for a given node. It is used to track how many jobs, and their
+        parameterisations, need to be run for a given workflow node.
 
     State-array
         A state array is a collection of parameterised tasks or values that were generated
@@ -84,8 +91,9 @@ Glossary
 
     Worker
         Encapsulation of a task execution environment. It is responsible for executing
-        tasks and managing their lifecycle. Workers can be local (e.g., a thread or
-        process) or remote (e.g., high-performance cluster).
+        tasks and managing their lifecycle. Workers can be local (e.g., debug and
+        concurrent-futures multiprocess) or orchestrated through a remote scheduler
+        (e.g., SLURM, SGE).
 
     Workflow
         A Directed-Acyclic-Graph (DAG) of parameterised tasks, to be executed in order.
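The reworded Container-ndim entry can be illustrated with a small stand-alone sketch. This is plain Python, not Pydra's implementation, and the function name is invented for illustration:

```python
def flatten_to_state_array(value, container_ndim=1):
    """Flatten `value` across its first `container_ndim` dimensions,
    mimicking how container_ndim controls splitting over nested containers."""
    if container_ndim == 1:
        # Split over the outer container only: each element (which may itself
        # be a list/array) becomes one entry in the state array.
        return list(value)
    # Recurse one level deeper, flattening inner containers into the state array.
    return [
        element
        for inner in value
        for element in flatten_to_state_array(inner, container_ndim - 1)
    ]

nested = [[1.0, 2.0], [3.0, 4.0]]  # a list-of-lists-of-floats

# container_ndim=1: state array of lists-of-floats
print(flatten_to_state_array(nested, container_ndim=1))  # [[1.0, 2.0], [3.0, 4.0]]

# container_ndim=2: state array of the individual floats
print(flatten_to_state_array(nested, container_ndim=2))  # [1.0, 2.0, 3.0, 4.0]
```

With `container_ndim=1` the split yields two jobs (one per inner list); with `container_ndim=2` it yields four (one per float).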

empty-docs/conf.py

Lines changed: 0 additions & 162 deletions
This file was deleted.

empty-docs/index.rst

Lines changed: 0 additions & 5 deletions
This file was deleted.

empty-docs/requirements.txt

Lines changed: 0 additions & 1 deletion
This file was deleted.

pydra/compose/base/task.py

Lines changed: 7 additions & 5 deletions

@@ -196,11 +196,13 @@ def __call__(
     readonly_caches : list[os.PathLike], optional
         Alternate cache locations to check for pre-computed results, by default None
     audit_flags : AuditFlag, optional
-        Auditing configuration, by default AuditFlag.NONE
-    messengers : list, optional
-        Messengers, by default None
-    messenger_args : dict, optional
-        Messenger arguments, by default None
+        Configure provenance tracking. Available flags: :class:`~pydra.utils.messenger.AuditFlag`.
+        Default is no provenance tracking.
+    messenger : :class:`Messenger` or :obj:`list` of :class:`Messenger` or None
+        Messenger(s) used by Audit. Saved in the `audit` attribute.
+        See available messengers at :class:`~pydra.utils.messenger.Messenger`.
+    messengers_args : dict[str, Any], optional
+        Argument(s) used by `messenger`. Saved in the `audit` attribute.
     **kwargs : dict
         Keyword arguments to pass on to the worker initialisation
 

pydra/engine/submitter.py

Lines changed: 7 additions & 5 deletions

@@ -64,11 +64,13 @@ class Submitter:
     max_concurrent : int | float, optional
         Maximum number of concurrent tasks to run, by default float("inf") (unlimited)
     audit_flags : AuditFlag, optional
-        Auditing configuration, by default AuditFlag.NONE
-    messengers : list, optional
-        Messengers, by default None
-    messenger_args : dict, optional
-        Messenger arguments, by default None
+        Configure provenance tracking. Available flags: :class:`~pydra.utils.messenger.AuditFlag`.
+        Default is no provenance tracking.
+    messenger : :class:`Messenger` or :obj:`list` of :class:`Messenger` or None
+        Messenger(s) used by Audit. Saved in the `audit` attribute.
+        See available messengers at :class:`~pydra.utils.messenger.Messenger`.
+    messengers_args : dict[str, Any], optional
+        Argument(s) used by `messenger`. Saved in the `audit` attribute.
     clean_stale_locks : bool, optional
         Whether to clean stale lock files, i.e. lock files that were created before the
         start of the current run. Don't set if using a global cache where there are
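Both docstring updates describe `audit_flags` as combinable provenance-tracking flags that default to "no tracking". A minimal stand-alone sketch of that flag-combination pattern follows; the class below only mirrors the names referenced in the docstrings and is not pydra's actual `AuditFlag` implementation:

```python
from enum import IntFlag


class AuditFlag(IntFlag):
    """Illustrative stand-in for an audit-flag enum (names are assumptions)."""
    NONE = 0               # no provenance tracking (the documented default)
    PROV = 1               # record provenance messages
    RESOURCE = 2           # record resource-usage messages
    ALL = PROV | RESOURCE  # track everything


# Flags combine with bitwise OR and are queried with bitwise AND.
flags = AuditFlag.PROV | AuditFlag.RESOURCE
print(flags == AuditFlag.ALL)        # True
print(bool(flags & AuditFlag.PROV))  # True
print(bool(AuditFlag.NONE))          # False: NONE is falsy, i.e. auditing disabled
```

This is the standard `enum.IntFlag` idiom: a caller passes e.g. `audit_flags=AuditFlag.ALL` and the implementation tests individual bits to decide which messages to emit.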
