|
6 | 6 | "source": [ |
7 | 7 | "# Getting started\n", |
8 | 8 | "\n", |
9 | | - "A *Task* is the basic runnable component in Pydra, and can execute either a Python function,\n", |
10 | | - "shell command or workflows consisting of combinations of all three types." |
| 9 | + "## Running your first task\n", |
| 10 | + "\n", |
| 11 | + "The basic runnable component of Pydra is a *task*. Tasks are conceptually similar to\n", |
| 12 | + "functions, in that they take inputs, process them and then return results. However,\n", |
| 13 | + "unlike functions, tasks are parameterised before they are executed in a separate step.\n", |
| 14 | + "This enables parameterised tasks to be linked together into workflows that are checked for\n", |
| 15 | + "errors before they are executed, and modular execution workers and environments to be\n", |
| 16 | + "specified independently of the task being performed.\n", |
| 17 | + "\n", |
| 18 | + "Pre-defined task definitions are installed under the `pydra.tasks.*` namespace by separate\n", |
| 19 | + "task packages (e.g. `pydra-fsl`, `pydra-ants`, ...). They are run by\n", |
| 20 | + "\n", |
| 21 | + "* importing the class from the `pydra.tasks.*` package it is in\n", |
| 22 | + "* instantiating the class with the parameters of the task\n", |
| 23 | + "* \"calling\" the resulting object to execute it as you would a function (i.e. `my_task(...)`)\n", |
| 24 | + "\n", |
| 25 | + "To demonstrate with a toy example of loading a JSON file with the `pydra.tasks.common.LoadJson` task, we first create an example JSON file" |
11 | 26 | ] |
12 | 27 | }, |
13 | 28 | { |
14 | 29 | "cell_type": "code", |
15 | | - "execution_count": 5, |
| 30 | + "execution_count": 6, |
16 | 31 | "metadata": {}, |
17 | | - "outputs": [ |
18 | | - { |
19 | | - "name": "stdout", |
20 | | - "output_type": "stream", |
21 | | - "text": [ |
22 | | - "Sample JSON file created at '0UAqFzWsDK4FrUMp48Y3tT3Q.json' with contents: {\"a\": true, \"b\": \"two\", \"c\": 3, \"d\": [7, 0.5598136790149003, 6]}\n", |
23 | | - "Loaded contents: {'a': True, 'b': 'two', 'c': 3, 'd': [7, 0.5598136790149003, 6]}\n" |
24 | | - ] |
25 | | - } |
26 | | - ], |
| 32 | + "outputs": [], |
27 | 33 | "source": [ |
28 | | - "from fileformats.application import Json\n", |
29 | | - "from pydra.tasks.common import LoadJson\n", |
| 34 | + "from pathlib import Path\n", |
| 35 | + "from tempfile import mkdtemp\n", |
| 36 | + "import json\n", |
30 | 37 | "\n", |
31 | | - "# Create a sample JSON file to test\n", |
32 | | - "json_file = Json.sample()\n", |
| 38 | + "JSON_CONTENTS = {'a': True, 'b': 'two', 'c': 3, 'd': [7, 0.5598136790149003, 6]}\n", |
33 | 39 | "\n", |
34 | | - "# Print the path of the sample JSON file and its contents for reference\n", |
35 | | - "print(f\"Sample JSON file created at {json_file.name!r} with contents: {json_file.read_text()}\")\n", |
| 40 | + "test_dir = Path(mkdtemp())\n", |
| 41 | + "json_file = test_dir / \"test.json\"\n", |
| 42 | + "with open(json_file, \"w\") as f:\n", |
| 43 | + " json.dump(JSON_CONTENTS, f)" |
| 44 | + ] |
| 45 | + }, |
| 46 | + { |
| 47 | + "cell_type": "markdown", |
| 48 | + "metadata": {}, |
| 49 | + "source": [ |
| 50 | + "Now we can load the JSON contents back from the file using the `LoadJson` task definition\n", |
| 51 | + "class" |
| 52 | + ] |
| 53 | + }, |
| 54 | + { |
| 55 | + "cell_type": "code", |
| 56 | + "execution_count": 7, |
| 57 | + "metadata": {}, |
| 58 | + "outputs": [], |
| 59 | + "source": [ |
| 60 | + "# Import the task definition\n", |
| 61 | + "from pydra.tasks.common import LoadJson\n", |
36 | 62 | "\n", |
37 | | - "# Parameterise the task specification to load the JSON file\n", |
| 63 | + "# Instantiate the task definition, providing the JSON file we want to load\n", |
38 | 64 | "load_json = LoadJson(file=json_file)\n", |
39 | 65 | "\n", |
40 | 66 | "# Run the task to load the JSON file\n", |
41 | 67 | "result = load_json()\n", |
42 | 68 | "\n", |
43 | | - "# Print the output interface of the of the task (LoadJson.Outputs)\n", |
44 | | - "print(f\"Loaded contents: {result.output.out}\")" |
| 69 | + "# Access the loaded JSON output contents and check they match original\n", |
| 70 | + "assert result.output.out == JSON_CONTENTS" |
| 71 | + ] |
| 72 | + }, |
| 73 | + { |
| 74 | + "cell_type": "markdown", |
| 75 | + "metadata": {}, |
| 76 | + "source": [ |
| 77 | + "## Iterating over inputs\n", |
| 78 | + "\n", |
| 79 | + "It is straightforward to apply the same operation over a set of inputs using the `split()`\n", |
| 80 | + "method. For example, if we wanted to re-grid all the NIfTI images stored in a directory,\n", |
| 81 | + "such as the sample ones generated by the code below" |
| 82 | + ] |
| 83 | + }, |
| 84 | + { |
| 85 | + "cell_type": "code", |
| 86 | + "execution_count": null, |
| 87 | + "metadata": {}, |
| 88 | + "outputs": [], |
| 89 | + "source": [ |
| 90 | + "from fileformats.medimage import Nifti\n", |
| 91 | + "\n", |
| 92 | + "nifti_dir = test_dir / \"nifti\"\n", |
| 93 | + "nifti_dir.mkdir()\n", |
| 94 | + "\n", |
| 95 | + "for i in range(10):\n", |
| 96 | + " Nifti.sample(nifti_dir, seed=i)" |
| 97 | + ] |
| 98 | + }, |
| 99 | + { |
| 100 | + "cell_type": "markdown", |
| 101 | + "metadata": {}, |
| 102 | + "source": [ |
| 103 | + "Then we can resample them by importing the `MrGrid` shell-command task from the `pydra-mrtrix3`\n", |
| 104 | + "package and splitting over the list of files in the directory" |
| 105 | + ] |
| 106 | + }, |
| 107 | + { |
| 108 | + "cell_type": "code", |
| 109 | + "execution_count": null, |
| 110 | + "metadata": {}, |
| 111 | + "outputs": [], |
| 112 | + "source": [ |
| 113 | + "from pydra.tasks.mrtrix3 import MrGrid\n", |
| 114 | + "\n", |
| 115 | + "# Instantiate the task definition, \"splitting\" over all NIfTI files in the test directory\n", |
| 116 | + "mrgrid = MrGrid(voxel=0.5).split(input=nifti_dir.iterdir())\n", |
| 117 | + "\n", |
| 118 | + "# Run the task to resample all NIfTI files\n", |
| 119 | + "result = mrgrid()\n", |
| 120 | + "\n", |
| 121 | + "# Print the locations of the output files\n", |
| 122 | + "print(\"\\n\".join(str(p) for p in result.output.output))" |
| 123 | + ] |
| 124 | + }, |
| 125 | + { |
| 126 | + "cell_type": "markdown", |
| 127 | + "metadata": {}, |
| 128 | + "source": [ |
| 129 | + "It is also possible to iterate over inputs in pairs. If, for example, you wanted to use\n", |
| 130 | + "a different voxel size for each image, both the list of images and the list of voxel sizes\n", |
| 131 | + "are passed to the `split()` method, and their pairing is specified by a tuple \"splitter\"\n", |
| 132 | + "(see [Splitting and combining](../explanation/splitting-combining.html) for more details\n", |
| 133 | + "on splitters)" |
| 134 | + ] |
| 135 | + }, |
| 136 | + { |
| 137 | + "cell_type": "code", |
| 138 | + "execution_count": null, |
| 139 | + "metadata": {}, |
| 140 | + "outputs": [], |
| 141 | + "source": [ |
| 142 | + "# Define a list of voxel sizes to resample the NIfTI files to, must be the same length\n", |
| 143 | + "# as the number of NIfTI files\n", |
| 144 | + "VOXEL_SIZES = [0.5, 0.5, 0.5, 0.75, 0.75, 0.75, 1.0, 1.0, 1.0, 1.25]\n", |
| 145 | + "\n", |
| 146 | + "mrgrid_varying_sizes = MrGrid().split(\n", |
| 147 | + " (\"input\", \"voxel\"),\n", |
| 148 | + " input=nifti_dir.iterdir(),\n", |
| 149 | + " voxel=VOXEL_SIZES\n", |
| 150 | + ")\n", |
| 151 | + "\n", |
| 152 | + "# Run the task to resample all NIfTI files with different voxel sizes\n", |
| 153 | + "result = mrgrid_varying_sizes()" |
| 154 | + ] |
| 155 | + }, |
| 156 | + { |
| 157 | + "cell_type": "markdown", |
| 158 | + "metadata": {}, |
| 159 | + "source": [ |
| 160 | + "## Cache directories\n", |
| 161 | + "\n", |
| 162 | + "When a task runs, a unique hash is generated from the combination of the task to be run and all of its inputs." |
45 | 163 | ] |
46 | 164 | }, |
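The idea behind input hashing can be sketched in plain Python. This is an illustrative sketch only, not Pydra's actual implementation (Pydra's hashing is internal and more sophisticated, e.g. hashing file contents rather than just paths); the `task_cache_key` helper and its serialisation scheme are hypothetical:

```python
import hashlib
import json

def task_cache_key(task_name: str, inputs: dict) -> str:
    """Derive a deterministic cache key from a task and its inputs.

    Illustrative only: the same task with the same inputs always maps
    to the same hash, so previously computed results can be reused from
    the cache directory instead of being recomputed.
    """
    # Serialise inputs deterministically (sorted keys) before hashing
    payload = json.dumps({"task": task_name, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

key1 = task_cache_key("LoadJson", {"file": "/data/test.json"})
key2 = task_cache_key("LoadJson", {"file": "/data/test.json"})
key3 = task_cache_key("LoadJson", {"file": "/data/other.json"})

assert key1 == key2  # identical task + inputs -> identical hash
assert key1 != key3  # different inputs -> different hash
```

Because the key depends only on the task and its inputs, re-running an unchanged task can be satisfied from the cache rather than by re-executing it.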
47 | 165 | { |
|