
Commit aec741f

extract/execute workflow subgraph (#949)

* Command to extract a subgraph from a workflow.
* New options: --print-subgraph, --print-targets and --target
* Extracting a step connects workflow inputs to satisfy dependencies.
  This means that in the extracted workflow, execution will start from the
  extracted step (provided dependencies are supplied in the input object).
* Reorganized README. Documentation for --target and related features.

Improve --print-targets

Hopefully fix tests.
1 parent 947bb58 commit aec741f

23 files changed, 1039 additions and 98 deletions

MANIFEST.in

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@ include tests/tmp1/tmp2/tmp3/.gitkeep
 include tests/wf/*
 include tests/override/*
 include tests/checker_wf/*
+include tests/subgraph/*
 include cwltool/schemas/v1.0/*.yml
 include cwltool/schemas/v1.0/*.yml
 include cwltool/schemas/v1.0/*.md

README.rst

Lines changed: 165 additions & 89 deletions
@@ -67,37 +67,9 @@ Remember, if co-installing multiple CWL implementations then you need to
 maintain which implementation ``cwl-runner`` points to via a symbolic file
 system link or `another facility <https://wiki.debian.org/DebianAlternatives>`_.
 
-Running tests locally
----------------------
-
-- Running basic tests ``(/tests)``:
-
-To run the basic tests after installing `cwltool` execute the following:
-
-.. code:: bash
-
-  pip install -rtest-requirements.txt
-  py.test --ignore cwltool/schemas/ --pyarg cwltool
-
-To run various tests in all supported Python environments we use `tox <https://github.com/common-workflow-language/cwltool/tree/master/tox.ini>`_. To run the test suite in all supported Python environments
-first download the complete code repository (see the ``git clone`` instructions above) and then run
-the following in the terminal:
-``pip install tox; tox``
-
-A list of all environments can be seen using:
-``tox --listenvs``
-A specific test environment can be run using:
-``tox -e <env name>``
-and a specific test can be run using this format:
-``tox -e py36-unit -- tests/test_examples.py::TestParamMatching``
-
-- Running the entire suite of CWL conformance tests:
-
-The GitHub repository for the CWL specifications contains a script that tests a CWL
-implementation against a wide array of valid CWL files using the `cwltest <https://github.com/common-workflow-language/cwltest>`_
-program
-
-Instructions for running these tests can be found in the Common Workflow Language Specification repository at https://github.com/common-workflow-language/common-workflow-language/blob/master/CONFORMANCE_TESTS.md
+=====
+Usage
+=====
 
 Run on the command line
 -----------------------
@@ -158,8 +130,8 @@ To use Singularity as the Docker container runtime, provide ``--singularity`` co
 
   cwltool --singularity https://raw.githubusercontent.com/common-workflow-language/common-workflow-language/master/v1.0/v1.0/v1.0/cat3-tool-mediumcut.cwl https://github.com/common-workflow-language/common-workflow-language/blob/master/v1.0/v1.0/cat-job.json
 
-Tool or workflow loading from remote or local locations
--------------------------------------------------------
+Running a tool or workflow from remote or local locations
+---------------------------------------------------------
 
 ``cwltool`` can run tool and workflow descriptions on both local and remote
 systems via its support for HTTP[S] URLs.
@@ -170,52 +142,117 @@ is referenced and that document isn't found in the current directory then the
 following locations will be searched:
 http://www.commonwl.org/v1.0/CommandLineTool.html#Discovering_CWL_documents_on_a_local_filesystem
 
+You can also use `cwldep <https://github.com/common-workflow-language/cwldep>`_
+to manage dependencies on external tools and workflows.
 
-Use with GA4GH Tool Registry API
---------------------------------
+Overriding workflow requirements at load time
+---------------------------------------------
 
-Cwltool can launch tools directly from `GA4GH Tool Registry API`_ endpoints.
+Sometimes a workflow needs additional requirements to run in a particular
+environment or with a particular dataset. To avoid the need to modify the
+underlying workflow, cwltool supports requirement "overrides".
 
-By default, cwltool searches https://dockstore.org/ . Use ``--add-tool-registry`` to add other registries to the search path.
+The format of the "overrides" object is a mapping of item identifier (workflow,
+workflow step, or command line tool) to the process requirements that should be applied.
 
-For example ::
+.. code:: yaml
 
-  cwltool quay.io/collaboratory/dockstore-tool-bamstats:develop test.json
+  cwltool:overrides:
+    echo.cwl:
+      requirements:
+        EnvVarRequirement:
+          envDef:
+            MESSAGE: override_value
 
-and (defaults to latest when a version is not specified) ::
+Overrides can be specified either on the command line, or as part of the job
+input document. Workflow steps are identified using the name of the workflow
+file followed by the step name as a document fragment identifier "#id".
+Override identifiers are relative to the toplevel workflow document.
 
-  cwltool quay.io/collaboratory/dockstore-tool-bamstats test.json
+.. code:: bash
 
-For this example, grab the test.json (and input file) from https://github.com/CancerCollaboratory/dockstore-tool-bamstats ::
+  cwltool --overrides overrides.yml my-tool.cwl my-job.yml
 
-  wget https://dockstore.org/api/api/ga4gh/v2/tools/quay.io%2Fbriandoconnor%2Fdockstore-tool-bamstats/versions/develop/PLAIN-CWL/descriptor/test.json
-  wget https://github.com/CancerCollaboratory/dockstore-tool-bamstats/raw/develop/rna.SRR948778.bam
-
+.. code:: yaml
 
-.. _`GA4GH Tool Registry API`: https://github.com/ga4gh/tool-registry-schemas
+  input_parameter1: value1
+  input_parameter2: value2
+  cwltool:overrides:
+    workflow.cwl#step1:
+      requirements:
+        EnvVarRequirement:
+          envDef:
+            MESSAGE: override_value
 
-Import as a module
-------------------
+.. code:: bash
 
-Add
+  cwltool my-tool.cwl my-job-with-overrides.yml
 
-.. code:: python
 
-  import cwltool
+Combining parts of a workflow into a single document
+----------------------------------------------------
 
-to your script.
+Use ``--pack`` to combine a workflow made up of multiple files into a
+single compound document. This operation takes all the CWL files
+referenced by a workflow and builds a new CWL document with all
+Process objects (CommandLineTool and Workflow) in a list in the
+``$graph`` field. Cross references (such as ``run:`` and ``source:``
+fields) are updated to internal references within the new packed
+document. The top level workflow is named ``#main``.
 
-The easiest way to use cwltool to run a tool or workflow from Python is to use a Factory
+.. code:: bash
 
-.. code:: python
+  cwltool --pack my-wf.cwl > my-packed-wf.cwl
 
-  import cwltool.factory
-  fac = cwltool.factory.Factory()
 
-  echo = fac.make("echo.cwl")
-  result = echo(inp="foo")
+Running only part of a workflow
+-------------------------------
+
+You can run a partial workflow with the ``--target`` (``-t``) option. This
+takes the name of an output parameter, workflow step, or input
+parameter in the top level workflow. You may provide multiple
+targets.
+
+.. code:: bash
+
+  cwltool --target step3 my-wf.cwl
+
+If a target is an output parameter, it will run only the steps
+that contribute to that output. If a target is a workflow step, it
+will run the workflow starting from that step. If a target is an
+input parameter, it will run only the steps that are connected to
+that input.
+
+Use ``--print-targets`` to get a listing of the targets of a workflow.
+To see exactly which steps will run, use ``--print-subgraph`` with
+``--target`` to get a printout of the workflow subgraph for the
+selected targets.
+
+.. code:: bash
+
+  cwltool --print-targets my-wf.cwl
+
+  cwltool --target step3 --print-subgraph my-wf.cwl > my-wf-starting-from-step3.cwl
+
+
+Visualizing a CWL document
+--------------------------
+
+The ``--print-dot`` option will print a file suitable for the Graphviz ``dot`` program. Here is a bash one-liner to generate a Scalable Vector Graphic (SVG) file:
+
+.. code:: bash
+
+  cwltool --print-dot my-wf.cwl | dot -Tsvg > my-wf.svg
+
+Modeling a CWL document as RDF
+------------------------------
+
+CWL documents can be expressed as RDF triple graphs.
+
+.. code:: bash
+
+  cwltool --print-rdf --rdf-serializer=turtle mywf.cwl
 
-  # result["out"] == "foo"
 
 Leveraging SoftwareRequirements (Beta)
 --------------------------------------
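
Reviewer note on the ``--target`` documentation added in the hunk above: the selection rules can be illustrated with a small, self-contained sketch. This is not cwltool's implementation. The workflow is modelled as a plain dictionary, and ``STEP_DEPS``, ``select_steps`` and the step names are hypothetical. It shows only the two graph walks the README text implies: keeping what a target depends on (the steps that contribute to it) and keeping what runs from a target onward.

```python
from collections import deque

# Hypothetical four-step workflow: each step maps to the steps it depends on.
STEP_DEPS = {
    "step1": [],
    "step2": ["step1"],
    "step3": ["step2"],
    "step4": ["step1"],
}


def reachable(start, edges):
    """Return every node reachable from `start` by following `edges` (breadth-first)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def invert(deps):
    """Turn a step -> dependencies mapping into a step -> dependents mapping."""
    dependents = {step: [] for step in deps}
    for step, requirements in deps.items():
        for req in requirements:
            dependents[req].append(step)
    return dependents


def select_steps(target, deps, direction):
    """Pick the steps to run for one target.

    direction="up" keeps the target plus everything it depends on;
    direction="down" keeps the target plus everything that depends on it.
    """
    edges = deps if direction == "up" else invert(deps)
    return sorted(reachable(target, edges))


print(select_steps("step3", STEP_DEPS, "up"))    # steps contributing to step3
print(select_steps("step2", STEP_DEPS, "down"))  # steps run from step2 onward
```

In this toy model an output-style target corresponds to the "up" walk and a step-style target to the "down" walk. How cwltool itself classifies targets and rewires workflow inputs is defined by the code in this commit, not by this sketch.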
@@ -425,48 +462,87 @@ at the following links:
 - `Specifications - Implementation <https://github.com/galaxyproject/galaxy/commit/81d71d2e740ee07754785306e4448f8425f890bc>`__
 - `Initial cwltool Integration Pull Request <https://github.com/common-workflow-language/cwltool/pull/214>`__
 
-Overriding workflow requirements at load time
----------------------------------------------
+Use with GA4GH Tool Registry API
+--------------------------------
 
-Sometimes a workflow needs additional requirements to run in a particular
-environment or with a particular dataset. To avoid the need to modify the
-underlying workflow, cwltool supports requirement "overrides".
+Cwltool can launch tools directly from `GA4GH Tool Registry API`_ endpoints.
 
-The format of the "overrides" object is a mapping of item identifier (workflow,
-workflow step, or command line tool) to the process requirements that should be applied.
+By default, cwltool searches https://dockstore.org/ . Use ``--add-tool-registry`` to add other registries to the search path.
 
-.. code:: yaml
+For example ::
 
-  cwltool:overrides:
-    echo.cwl:
-      requirements:
-        EnvVarRequirement:
-          envDef:
-            MESSAGE: override_value
+  cwltool quay.io/collaboratory/dockstore-tool-bamstats:develop test.json
 
-Overrides can be specified either on the command line, or as part of the job
-input document. Workflow steps are identified using the name of the workflow
-file followed by the step name as a document fragment identifier "#id".
-Override identifiers are relative to the toplevel workflow document.
+and (defaults to latest when a version is not specified) ::
 
-.. code:: bash
+  cwltool quay.io/collaboratory/dockstore-tool-bamstats test.json
 
-  cwltool --overrides overrides.yml my-tool.cwl my-job.yml
+For this example, grab the test.json (and input file) from https://github.com/CancerCollaboratory/dockstore-tool-bamstats ::
 
-.. code:: yaml
+  wget https://dockstore.org/api/api/ga4gh/v2/tools/quay.io%2Fbriandoconnor%2Fdockstore-tool-bamstats/versions/develop/PLAIN-CWL/descriptor/test.json
+  wget https://github.com/CancerCollaboratory/dockstore-tool-bamstats/raw/develop/rna.SRR948778.bam
 
-  input_parameter1: value1
-  input_parameter2: value2
-  cwltool:overrides:
-    workflow.cwl#step1:
-      requirements:
-        EnvVarRequirement:
-          envDef:
-            MESSAGE: override_value
+
+.. _`GA4GH Tool Registry API`: https://github.com/ga4gh/tool-registry-schemas
+
+===========
+Development
+===========
+
+Running tests locally
+---------------------
+
+- Running basic tests ``(/tests)``:
+
+To run the basic tests after installing `cwltool` execute the following:
 
 .. code:: bash
 
-  cwltool my-tool.cwl my-job-with-overrides.yml
+  pip install -rtest-requirements.txt
+  py.test --ignore cwltool/schemas/ --pyarg cwltool
+
+To run various tests in all supported Python environments we use `tox <https://github.com/common-workflow-language/cwltool/tree/master/tox.ini>`_. To run the test suite in all supported Python environments
+first download the complete code repository (see the ``git clone`` instructions above) and then run
+the following in the terminal:
+``pip install tox; tox``
+
+A list of all environments can be seen using:
+``tox --listenvs``
+A specific test environment can be run using:
+``tox -e <env name>``
+and a specific test can be run using this format:
+``tox -e py36-unit -- tests/test_examples.py::TestParamMatching``
+
+- Running the entire suite of CWL conformance tests:
+
+The GitHub repository for the CWL specifications contains a script that tests a CWL
+implementation against a wide array of valid CWL files using the `cwltest <https://github.com/common-workflow-language/cwltest>`_
+program
+
+Instructions for running these tests can be found in the Common Workflow Language Specification repository at https://github.com/common-workflow-language/common-workflow-language/blob/master/CONFORMANCE_TESTS.md
+
+Import as a module
+------------------
+
+Add
+
+.. code:: python
+
+  import cwltool
+
+to your script.
+
+The easiest way to use cwltool to run a tool or workflow from Python is to use a Factory
+
+.. code:: python
+
+  import cwltool.factory
+  fac = cwltool.factory.Factory()
+
+  echo = fac.make("echo.cwl")
+  result = echo(inp="foo")
+
+  # result["out"] == "foo"
 
 
 CWL Tool Control Flow
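
Reviewer note on the "overrides" section moved within README.rst: the identifier rule (workflow file name followed by the step name as a ``#`` fragment) and the nesting of the overrides object can be sketched in a few lines of Python. The helper names ``step_override_id`` and ``with_env_override`` are hypothetical and not part of cwltool. The sketch prints JSON, which is assumed here to be acceptable wherever a YAML job document is, since JSON is also valid YAML.

```python
import json


def step_override_id(workflow_file, step_name):
    """Build an override identifier: workflow file name plus a '#step' fragment."""
    return f"{workflow_file}#{step_name}"


def with_env_override(job_inputs, item_id, env):
    """Return a copy of the job inputs carrying an EnvVarRequirement override for item_id."""
    job = dict(job_inputs)
    overrides = dict(job.get("cwltool:overrides", {}))
    overrides[item_id] = {
        "requirements": {"EnvVarRequirement": {"envDef": dict(env)}}
    }
    job["cwltool:overrides"] = overrides
    return job


job = with_env_override(
    {"input_parameter1": "value1", "input_parameter2": "value2"},
    step_override_id("workflow.cwl", "step1"),
    {"MESSAGE": "override_value"},
)
print(json.dumps(job, indent=2))
```

The printed object has the same structure as the second YAML example in the README: ordinary input parameters at the top level, with the overrides mapping stored under the ``cwltool:overrides`` key.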

cwltool/argparser.py

Lines changed: 8 additions & 0 deletions
@@ -167,6 +167,10 @@ def arg_parser(): # type: () -> argparse.ArgumentParser
     exgroup.add_argument("--version", action="store_true", help="Print version and exit")
     exgroup.add_argument("--validate", action="store_true", help="Validate CWL document only.")
     exgroup.add_argument("--print-supported-versions", action="store_true", help="Print supported CWL specs.")
+    exgroup.add_argument("--print-subgraph", action="store_true",
+                         help="Print workflow subgraph that will execute "
+                         "(can combine with --target)")
+    exgroup.add_argument("--print-targets", action="store_true", help="Print targets (output parameters)")
 
     exgroup = parser.add_mutually_exclusive_group()
     exgroup.add_argument("--strict", action="store_true",
@@ -293,6 +297,10 @@ def arg_parser(): # type: () -> argparse.ArgumentParser
     parser.add_argument("--overrides", type=str,
                         default=None, help="Read process requirement overrides from file.")
 
+    parser.add_argument("--target", "-t", action="append",
+                        help="Only execute steps that contribute to "
+                        "listed targets (can provide more than once).")
+
     parser.add_argument("workflow", type=Text, nargs="?", default=None,
                         metavar='cwl_document', help="path or URL to a CWL Workflow, "
                         "CommandLineTool, or ExpressionTool. If the `inputs_object` has a "
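
Reviewer note on the option shapes added above: the behaviour of ``action="append"`` and of a mutually exclusive group can be checked with a stand-alone ``argparse`` sketch. The parser below is a toy, not cwltool's ``arg_parser()``. The program name ``toy-runner`` is made up, and the sketch assumes that the group which ``--print-subgraph`` and ``--print-targets`` join is mutually exclusive, as the ``exgroup`` name and the following context line suggest.

```python
import argparse


def build_parser():
    """Toy parser with the same option shapes as the hunk (not cwltool's full parser)."""
    parser = argparse.ArgumentParser(prog="toy-runner")

    # At most one of these flags may be given in a single invocation.
    exgroup = parser.add_mutually_exclusive_group()
    exgroup.add_argument("--print-subgraph", action="store_true")
    exgroup.add_argument("--print-targets", action="store_true")

    # action="append" collects every occurrence into a list; with no
    # occurrence the attribute stays at its default, None.
    parser.add_argument("--target", "-t", action="append")
    parser.add_argument("workflow", nargs="?", default=None)
    return parser


args = build_parser().parse_args(
    ["--print-subgraph", "--target", "step3", "-t", "step4", "my-wf.cwl"]
)
print(args.target, args.print_subgraph, args.workflow)
```

Because ``--target`` defaults to ``None`` rather than an empty list, code consuming it has to handle the "no targets given" case explicitly.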

cwltool/load_tool.py

Lines changed: 1 addition & 1 deletion
@@ -315,7 +315,7 @@ def validate_document(document_loader, # type: Loader
 def make_tool(document_loader, # type: Loader
               avsc_names, # type: schema.Names
               metadata, # type: Dict[Text, Any]
-              uri, # type: Text
+              uri, # type: Union[Text, CommentedMap, CommentedSeq]
               loadingContext # type: LoadingContext
              ): # type: (...) -> Process
     """Make a Python CWL object."""
