| 1 | +--- |
| 2 | +layout: page |
| 3 | +title: "Recommended Practices" |
| 4 | +permalink: /rec-practices/ |
| 5 | +--- |
| 6 | + |
| 7 | +Below is a set of recommended practices to keep in mind when writing a Common Workflow Language description for a tool or workflow. They are offered for consideration on a scale of usefulness: the more you follow the better, but not all are required. |
| 8 | + |
| 9 | +☐ No `type: string` parameters for names of input or reference files/directories; use `type: File` or `type: Directory` as appropriate. |
| 10 | + |
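As a minimal sketch of the item above, the tool below declares its input as a `File` rather than a `string` holding a path. The choice of `wc` and the identifiers `text_document` and `line_word_counts` are illustrative assumptions, not taken from this page.

```yaml
cwlVersion: v1.0
class: CommandLineTool
baseCommand: wc
inputs:
  text_document:        # hypothetical identifier
    type: File          # not `type: string`
    inputBinding:
      position: 1
outputs:
  line_word_counts:     # hypothetical identifier
    type: stdout
```

Declaring the parameter as a `File` lets the workflow runner stage the file and adjust its path (for example inside a container), which it cannot do for an opaque string.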
| 11 | +☐ Include a license that allows for re-use by anyone, e.g. [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0#apply). [Example of license inclusion](https://github.com/ProteinsWebTeam/ebi-metagenomics-cwl/blob/master/workflows/emg-assembly.cwl#L200). |
| 12 | + |
| 13 | +☐ Include [attribution information](https://github.com/ProteinsWebTeam/ebi-metagenomics-cwl/blob/master/workflows/emg-assembly.cwl#L200) for the CWL description author(s). |
| 14 | + |
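One possible way to record the license and attribution items above is with schema.org metadata fields at the top level of the description. This is a sketch only: the `s:` prefix, the author name, and the ORCID are placeholders assumed for illustration.

```yaml
$namespaces:
  s: https://schema.org/

s:license: https://www.apache.org/licenses/LICENSE-2.0
s:author:
  - class: s:Person
    s:name: Jane Example                                    # placeholder name
    s:identifier: https://orcid.org/0000-0000-0000-0000     # placeholder ORCID
```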
| 15 | +☐ In tool descriptions, list dependencies using short name(s) under `SoftwareRequirement`. |
| 16 | + |
| 17 | +☐ Include [SciCrunch](https://scicrunch.org) identifiers for dependencies in `https://identifiers.org/rrid/RRID:SCR_NNNNNN` format. |
| 18 | + |
| 19 | +☐ All `input` and `output` identifiers should reflect their conceptual identity. No `foo_input`, `foo_file`, `result`, `input`, `output`, or other uninformative names. |
| 20 | + |
| 21 | +☐ In tool descriptions, include a list of version(s) of the tool that are known to work. |
| 22 | + |
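The dependency, SciCrunch identifier, and known-versions items above can be combined in one `SoftwareRequirement` entry. The following is a sketch under assumptions: `exampletool` and the version numbers are hypothetical, and `SCR_NNNNNN` is the placeholder from the text, to be replaced with the tool's real RRID.

```yaml
hints:
  SoftwareRequirement:
    packages:
      exampletool:                          # hypothetical short name of the dependency
        specs: [ "https://identifiers.org/rrid/RRID:SCR_NNNNNN" ]   # placeholder RRID
        version: [ "1.2.0", "1.3.1" ]       # hypothetical versions known to work
```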
| 23 | +☐ `format` should be specified for all input and output `File`s. Bioinformatics tools should use format identifiers from [EDAM](http://edamontology.org/format_1915). See also `iana:text/plain`, `iana:text/tab-separated-values` with `$namespaces: { iana: "https://www.iana.org/assignments/media-types/" }`. [Full IANA media type list](http://www.iana.org/assignments/media-types/media-types.xhtml) (also known as MIME types). |
| 24 | + |
| 25 | +☐ Mark all input and output `File`s that are read from or written to in a streaming-compatible way (once, no random access) as `streamable: true`. |
| 26 | + |
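The `format` and `streamable` items above might look like the fragment below. It is a sketch, not a complete tool: the identifiers `aligned_reads` and `coverage_table` and the glob pattern are hypothetical, and `edam:format_2572` is the EDAM identifier for BAM.

```yaml
inputs:
  aligned_reads:                  # hypothetical identifier
    type: File
    format: edam:format_2572      # BAM
    streamable: true              # read once, front to back, no random access
outputs:
  coverage_table:                 # hypothetical identifier
    type: File
    format: iana:text/tab-separated-values
    outputBinding:
      glob: coverage.tsv          # hypothetical file name

$namespaces:
  edam: http://edamontology.org/
  iana: https://www.iana.org/assignments/media-types/
```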
| 27 | +☐ Each `CommandLineTool` description should focus on a single operation only, even if the (sub)command is capable of more. |
| 28 | + |
| 29 | +☐ Custom types should be defined with one external YAML file per type definition, for re-use. |
| 30 | + |
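A sketch of the custom-type item above: the type lives in its own file and is pulled in with `$import` under `SchemaDefRequirement`. The file name `sequencing_platform.yaml`, the type name, and the symbols are assumptions made for illustration.

```yaml
# sequencing_platform.yaml (hypothetical file holding exactly one type definition)
type: enum
name: sequencing_platform
symbols:
  - illumina
  - nanopore
  - pacbio
```

```yaml
# Fragment of a tool description that re-uses the type
requirements:
  SchemaDefRequirement:
    types:
      - $import: sequencing_platform.yaml
inputs:
  platform:                       # hypothetical identifier
    type: sequencing_platform.yaml#sequencing_platform
```

Keeping one definition per file means other tool and workflow descriptions can import the same type without copying it.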
| 31 | +☐ Include a short top-level `label` summarising the tool/workflow. |
| 32 | + |
| 33 | +☐ If useful, include a top-level `doc` as well. This should provide a longer, more detailed description than the top-level `label` (see above). |
| 34 | + |
| 35 | +☐ Use `type: enum` instead of `type: string` for elements with a fixed list of valid values. |
| 36 | + |
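The `label`, `doc`, and `enum` items above could be combined as in this fragment. It is a sketch only: the wording of the label and doc, the identifier `output_compression`, and the symbols are hypothetical.

```yaml
label: Sort alignments by coordinate
doc: |
  Sorts an alignment file by reference coordinate.
  A longer description of behaviour, caveats and expected inputs belongs here.
inputs:
  output_compression:             # hypothetical identifier
    type:
      type: enum                  # fixed list of valid values, so not `type: string`
      symbols: [none, gzip, bzip2]
```

With an `enum`, a value outside the listed symbols is rejected when the inputs are validated, before the tool is ever run.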
| 37 | +☐ Evaluate all use of JavaScript for possible elimination or replacement. One common example is manipulating `File` names and paths: consider whether one of the [built-in `File` properties](http://www.commonwl.org/v1.0/CommandLineTool.html#File) such as `basename`, `nameroot`, or `nameext` could be used instead. |
| 38 | + |
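As an illustration of the JavaScript item above, an output name can often be derived with a plain parameter reference instead of an expression. In this sketch the identifiers `aligned_reads` and `sorted_alignments` and the `.sorted.bam` suffix are hypothetical.

```yaml
inputs:
  aligned_reads:                  # hypothetical identifier
    type: File
outputs:
  sorted_alignments:              # hypothetical identifier
    type: File
    outputBinding:
      # For an input named sample1.bam, nameroot is "sample1",
      # so this matches sample1.sorted.bam.
      glob: $(inputs.aligned_reads.nameroot).sorted.bam
```

Because this is a parameter reference rather than a JavaScript expression, the description does not need `InlineJavascriptRequirement`.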
| 39 | +☐ Give the tool description to a colleague at a different institution to test and provide feedback. |