Description
Spun off from #2723
Params can currently be defined in config files (including profiles), params files, CLI options, and the pipeline code itself. This creates a lot of potential for confusion about how these sources are resolved against each other (see #2662). Additionally, params are not typed. The CLI can cast command-line params based on regular expressions, but this can backfire, e.g. when a string param is given a value that "looks" like a number.
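To make the casting problem concrete, here is a minimal sketch in Python. The patterns and the `guess_type` helper are hypothetical and only approximate the idea of regex-based guessing; they are not Nextflow's actual casting rules.

```python
import re

# Hypothetical patterns for illustration; not Nextflow's actual rules.
_INT = re.compile(r"-?\d+")
_FLOAT = re.compile(r"-?\d+\.\d+")


def guess_type(raw: str):
    """Cast a raw command-line string by guessing its type from its shape."""
    if raw.lower() in ("true", "false"):
        return raw.lower() == "true"
    if _INT.fullmatch(raw):
        return int(raw)
    if _FLOAT.fullmatch(raw):
        return float(raw)
    return raw


# A sample ID made of digits is cast to an integer and loses its leading zeros.
sample_id = guess_type("00123")

# A version string that looks like a decimal is cast to a float.
version = guess_type("1.10")
```

In both cases the value the pipeline receives no longer round-trips to the string the user typed, which is the failure mode a declared type would avoid.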
Instead, params should be defined in a single place with metadata such as type, default value, description, etc. Benefits are:
- single source of truth
- less ambiguity of how params are resolved
- ability to validate params based on type definition instead of regex guessing
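As a rough sketch of what validating against a type definition could look like (written in Python for illustration only; `ParamDef`, `SCHEMA`, `parse_param`, and the param names are all hypothetical and not a proposed Nextflow syntax):

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ParamDef:
    """One param declared in a single place, with its metadata."""
    name: str
    type: type
    default: Any = None
    description: str = ""


# Hypothetical schema: the single source of truth for all params.
SCHEMA = {
    p.name: p
    for p in [
        ParamDef("sample_id", str, description="Sample identifier"),
        ParamDef("max_cpus", int, default=4, description="CPU limit"),
        ParamDef("skip_qc", bool, default=False, description="Skip QC steps"),
    ]
}


def parse_param(name: str, raw: str):
    """Convert a raw string using the declared type instead of guessing."""
    if name not in SCHEMA:
        raise KeyError(f"undeclared param: {name}")
    target = SCHEMA[name].type
    if target is bool:
        if raw.lower() not in ("true", "false"):
            raise ValueError(f"{name}: expected true/false, got {raw!r}")
        return raw.lower() == "true"
    # int("eight") raises ValueError, so bad input fails early.
    return target(raw)
```

With this, `parse_param("sample_id", "00123")` stays a string because the schema says so, while `parse_param("max_cpus", "eight")` fails with a `ValueError` instead of silently passing through as text.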
The nf-core parameter schema (nextflow_schema.json) as well as the nf-validation plugin are excellent steps in this direction, and the solution may be to simply incorporate them into Nextflow.
For backwards compatibility, we may allow params to be set in config files and pipeline code, but this would essentially be overriding the default value rather than "defining" the param, and it should be discouraged in favor of putting everything in the parameter schema. That being said, it can be useful to set params from a profile, such as a test profile that provides some test data, so this use case should be supported.
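A minimal sketch of the "override the default, don't define the param" idea, in Python for illustration. The layer names, the example values, and the precedence order (schema defaults, then config, then profile, then CLI) are assumptions for the sake of the example, not a statement of how Nextflow resolves params.

```python
from collections import ChainMap

# Defaults come from the schema; every other source may only override them.
schema_defaults = {"input": None, "outdir": "results", "max_cpus": 4}

# Hypothetical override layers, listed from lowest to highest priority.
config_overrides = {"max_cpus": 8}
test_profile = {"input": "test_data/samplesheet.csv"}
cli_options = {"outdir": "run1"}


def resolve(defaults, *overrides):
    """Merge override layers onto the schema defaults; later layers win."""
    for layer in overrides:
        unknown = set(layer) - set(defaults)
        if unknown:
            # An override cannot introduce a param the schema does not declare.
            raise KeyError(f"undeclared params: {sorted(unknown)}")
    # ChainMap searches its maps left to right, so put the last layer first.
    return dict(ChainMap(*reversed(overrides), defaults))


params = resolve(schema_defaults, config_overrides, test_profile, cli_options)
```

Here the test profile supplies `input` without redefining the param, and a typo such as `max_cpu` in any layer is rejected rather than silently creating a new param.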
The main question that I see is whether the schema should be in a separate JSON/YAML file (as it currently is in nf-core) or in the pipeline code as part of the top-level workflow definition.
- I would like the latter approach because it makes the top-level workflow more of a "unit" and makes it easier for IDE tooling to validate param references in the pipeline code. It would likely be less verbose than a JSON schema.
- On the other hand, a JSON schema can be parsed by external tools written in other languages, whereas Nextflow scripts can only be parsed by Nextflow (and any IDE tooling).
- For what it's worth, Nextflow could export the workflow inputs definition to a JSON file for use with other tools, but then we would have to keep it in sync with the pipeline code somehow.
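The export option could look roughly like the following Python sketch, where the in-code definition is the source and the JSON file is generated from it. `ParamDef`, `PARAMS`, and `export_schema` are hypothetical names, and the output only loosely follows JSON Schema conventions; it is not the exact `nextflow_schema.json` layout used by nf-core.

```python
import json
from dataclasses import dataclass
from typing import Any

# Map in-code types to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


@dataclass(frozen=True)
class ParamDef:
    name: str
    type: type
    default: Any = None
    description: str = ""


# Hypothetical params as they might be declared in the pipeline code.
PARAMS = [
    ParamDef("input", str, description="Path to the samplesheet"),
    ParamDef("max_cpus", int, default=4, description="CPU limit"),
]


def export_schema(params) -> str:
    """Serialize the in-code definitions to a JSON document for other tools."""
    properties = {}
    for p in params:
        entry = {"type": JSON_TYPES[p.type]}
        if p.description:
            entry["description"] = p.description
        if p.default is not None:
            entry["default"] = p.default
        properties[p.name] = entry
    schema = {"type": "object", "properties": properties}
    return json.dumps(schema, indent=2, sort_keys=True)


schema_json = export_schema(PARAMS)
```

If the JSON file were regenerated by a command or a CI check rather than edited by hand, the sync problem would reduce to making sure that step actually runs.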