Commit bfa67ca

Unify nf-lang config scopes with runtime classes (#6271)
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Co-authored-by: Paolo Di Tommaso <[email protected]>
1 parent a76b760 commit bfa67ca

247 files changed

+5831
-6562
lines changed

docs/developer/config-scopes.md

Lines changed: 176 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,176 @@
1+
(config-scopes-page)=
2+
3+
# Configuration scopes
4+
5+
This page provides guidance on defining configuration scopes in the Nextflow runtime.
6+
7+
## Overview
8+
9+
The Nextflow configuration is defined as a collection of *scope classes*. Each scope class defines the options available in a specific configuration scope, including the name, type, and optional description of each option.
10+
11+
Scope classes are used to generate a configuration schema, which is in turn used for several purposes:
12+
13+
- Validating config options at runtime (`nextflow run` and `nextflow config`)
14+
15+
- Providing code intelligence in the language server (validation, hover hints, code completion)
16+
17+
- Generating reference documentation (in progress)
18+
19+
Scope classes are also used by the runtime itself as type-safe domain objects. This way, the construction of domain objects from the configuration map is isolated from the rest of the runtime.
20+
21+
## Definition
22+
23+
### Config scopes
24+
25+
A *config scope* is defined as a class that implements the `ConfigScope` interface. Top-level scope classes must have the `@ScopeName` annotation, which defines the name of the config scope.
26+
27+
For example:
28+
29+
```groovy
30+
package nextflow.hello
31+
32+
import nextflow.config.schema.ConfigScope
33+
import nextflow.config.schema.ScopeName
34+
35+
@ScopeName('hello')
36+
class HelloConfig implements ConfigScope {
37+
}
38+
```
39+
40+
A scope class must provide a default constructor, so that it can be instantiated as an extension point. If no such constructor is defined, the config scope will not be included in the schema. In the above example, this constructor is implicitly defined because no constructors were declared.
41+
42+
The fully-qualified class name (in this case, `nextflow.hello.HelloConfig`) must be included in the list of extension points.
43+
44+
### Config options
45+
46+
A *config option* is defined as a field with the `@ConfigOption` annotation. The field name determines the name of the config option.
47+
48+
For example:
49+
50+
```groovy
51+
@ConfigOption
52+
String createMessage
53+
```
54+
55+
The `@ConfigOption` annotation can specify an optional set of types that are valid in addition to the field type. For example, the `fusion.tags` option, which accepts either a String or Boolean, is declared as follows:
56+
57+
```groovy
58+
@ConfigOption(types=[Boolean])
59+
String tags
60+
```
61+
62+
The field type and any additional types are included in the schema, allowing them to be used for validation.
63+
64+
The field type can be any Java or Groovy class, but in practice it should be a class that can be constructed from primitive values (numbers, booleans, strings). For example, `Duration` and `MemoryUnit` are standard Nextflow types that can each be constructed from an integer or string.
65+
66+
### Nested scopes
67+
68+
A *nested scope* is defined as a field whose type is an implementation of `ConfigScope`. The field name determines the name of the nested scope.
69+
70+
The scope class referenced by the field type defines config options and scopes in the same manner as top-level scope classes. Unlike top-level scopes, nested scope classes do not need to use the `@ScopeName` annotation or provide a default constructor.
71+
72+
See `ExecutorConfig` and `ExecutorRetryConfig` for an example of how a nested scope is defined and constructed.
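
As a rough illustration of the description above, the following stand-alone sketch shows one way a nested scope could be declared and constructed. The class names `HelloRetryConfig` and `HelloConfig`, the `retry` and `maxAttempts` names, and the default value of 3 are hypothetical. The `ConfigScope` and `ConfigOption` types are minimal stand-ins for the real ones in `nextflow.config.schema`, defined here only so that the sketch runs on its own.

```groovy
import java.lang.annotation.Retention
import java.lang.annotation.RetentionPolicy

// Stand-ins for the nextflow.config.schema types (assumption: simplified for this sketch)
interface ConfigScope {}

@Retention(RetentionPolicy.RUNTIME)
@interface ConfigOption {}

// Hypothetical nested scope: no @ScopeName annotation or default constructor needed
class HelloRetryConfig implements ConfigScope {

    @ConfigOption
    final int maxAttempts

    HelloRetryConfig(Map opts) {
        // fall back to an assumed default of 3 when the option is not set
        this.maxAttempts = opts.maxAttempts != null ? opts.maxAttempts as int : 3
    }
}

// Hypothetical top-level scope (its @ScopeName('hello') annotation is omitted here)
class HelloConfig implements ConfigScope {

    // the field name determines the nested scope name: hello.retry
    final HelloRetryConfig retry

    HelloConfig() {}

    HelloConfig(Map opts) {
        // the top-level constructor passes the nested map to the nested scope class
        this.retry = new HelloRetryConfig(opts.retry as Map ?: Collections.emptyMap())
    }
}

def config = new HelloConfig([retry: [maxAttempts: 5]])
```

In this sketch, the option would be addressed as `hello.retry.maxAttempts` in a config file, and client code reads it as `config.retry.maxAttempts`.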
73+
74+
### Placeholder scopes
75+
76+
A *placeholder scope* is a config scope that applies to a collection of user-defined names.
77+
78+
For example, the `azure.batch.pools` scope allows the user to define a set of named pools, where each pool is configured with a standard set of options such as `autoScale`, `lowPriority`, `maxVmCount`, etc. These options are defined in a placeholder scope with a placeholder name of `<name>`. Thus, the generic name for the `autoScale` option is `azure.batch.pools.<name>.autoScale`.
79+
80+
A placeholder scope is defined as a field with type `Map<String, P>`, where `P` is a nested scope class which defines the scope options. The field should have the `@PlaceholderName` annotation which defines the placeholder name (e.g. `<name>`).
81+
82+
See `AzBatchOpts` and `AzPoolOpts` for an example of how placeholder scopes are defined and constructed.
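
The following stand-alone sketch illustrates the pattern described above. It is not taken from the Azure classes: `GreetingOpts`, `HelloConfig`, and the `greetings` and `message` names are hypothetical. The `ConfigScope`, `ConfigOption`, and `PlaceholderName` types are minimal stand-ins for the real ones in `nextflow.config.schema`, and the single `value` element on `PlaceholderName` is an assumption.

```groovy
import java.lang.annotation.Retention
import java.lang.annotation.RetentionPolicy

// Stand-ins for the nextflow.config.schema types (assumption: simplified for this sketch)
interface ConfigScope {}

@Retention(RetentionPolicy.RUNTIME)
@interface ConfigOption {}

@Retention(RetentionPolicy.RUNTIME)
@interface PlaceholderName { String value() }

// Hypothetical nested scope with the options available for each named greeting
class GreetingOpts implements ConfigScope {

    @ConfigOption
    final String message

    GreetingOpts(Map opts) {
        this.message = opts.message
    }
}

// Hypothetical top-level scope (its @ScopeName('hello') annotation is omitted here)
class HelloConfig implements ConfigScope {

    // generic option name: hello.greetings.<name>.message
    @PlaceholderName('<name>')
    final Map<String, GreetingOpts> greetings

    HelloConfig() {}

    HelloConfig(Map opts) {
        final entries = (opts.greetings ?: [:]) as Map<String, Map>
        // build one nested scope object per user-defined name
        this.greetings = entries.collectEntries { name, value -> [(name): new GreetingOpts(value)] }
    }
}

def config = new HelloConfig([greetings: [en: [message: 'Hello'], fr: [message: 'Bonjour']]])
```

Here the user-defined names `en` and `fr` take the place of `<name>`, so `config.greetings.fr.message` corresponds to the option `hello.greetings.fr.message`.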
83+
84+
### Descriptions
85+
86+
Top-level scope classes and config options should use the `@Description` annotation to provide a description of the scope or option. This description is included in the schema, which is in turn used by the language server to provide hover hints.
87+
88+
For example:
89+
90+
```groovy
91+
@ScopeName('hello')
92+
@Description('''
93+
The `hello` scope controls the behavior of the `nf-hello` plugin.
94+
''')
95+
class HelloConfig implements ConfigScope {
96+
97+
@ConfigOption
98+
@Description('''
99+
Message to print to standard output when a run is initialized.
100+
''')
101+
String createMessage
102+
}
103+
```
104+
105+
Nested scopes and placeholder scopes may also use this annotation, but will inherit the description of the top-level scope by default.
106+
107+
### Best practices
108+
109+
The Nextflow runtime adheres to the following best practices where appropriate:
110+
111+
- Config options should be declared as public and final, so that the scope class can be used as an immutable domain object.
112+
113+
- Scope classes should define a constructor that initializes each field from a map, casting each map property to the required type and providing default values as needed.
114+
115+
- In cases where an option defaults to an environment variable, the environment map should be provided as an additional constructor argument rather than accessing the system environment directly.
116+
117+
- In cases where an option with a primitive type (e.g. `int`, `float`, `boolean`) can be unspecified without a default value, it should be declared with the equivalent reference type (e.g. `Integer`, `Float`, `Boolean`). Otherwise, it should use the primitive type.
118+
119+
- In cases where an option represents a path, it should be declared as a `String` and allow clients to construct paths as needed, since path construction may depend on plugins which aren't yet loaded.
120+
121+
For example:
122+
123+
```groovy
124+
import nextflow.config.schema.ConfigOption
125+
import nextflow.config.schema.ConfigScope
126+
import nextflow.config.schema.ScopeName
127+
128+
@ScopeName('hello')
129+
class HelloConfig implements ConfigScope {
130+
131+
@ConfigOption
132+
final String createMessage
133+
134+
@ConfigOption
135+
final boolean verbose
136+
137+
HelloConfig() {}
138+
139+
HelloConfig(Map opts, Map env) {
140+
this.createMessage = opts.createMessage ?: env.get('NXF_HELLO_CREATE_MESSAGE')
141+
this.verbose = opts.verbose as boolean
142+
}
143+
}
144+
```
145+
146+
## Usage
147+
148+
### Runtime
149+
150+
Nextflow validates the config map after it is loaded. Top-level config scopes are loaded by the plugin system as extension points and converted into a schema, which is used to validate the config map.
151+
152+
Plugins are loaded after the config is loaded and before it is validated, since plugins can also define config scopes. If a third-party plugin declares a config scope, it must be explicitly enabled in order to validate config options from the plugin. Otherwise, Nextflow will report these options as unrecognized.
153+
154+
Core plugins are loaded automatically based on other config options. Therefore, Nextflow only validates config from a core plugin when that plugin is loaded. Otherwise, any config options from the plugin are ignored -- they are neither validated nor reported as unrecognized.
155+
156+
For example, when the `process.executor` config option is set to `'awsbatch'`, the `nf-amazon` plugin is automatically loaded. In this case, all options in the `aws` config scope will be validated. If the executor is not set to `'awsbatch'`, all `aws` options will be ignored. This way, config files can be validated appropriately without loading additional core plugins that won't be used by the run.
157+
158+
The scope classes themselves can be used to construct domain objects on-demand from the config map. For example, an `ExecutorConfig` can be constructed from the `executor` config scope as follows:
159+
160+
```groovy
161+
new ExecutorConfig( Global.session.config.executor as Map ?: Collections.emptyMap() )
162+
```
163+
164+
:::{note}
165+
In practice, it is better to avoid the use of `Global` and provide an instance of `Session` to the client class instead.
166+
:::
167+
168+
### JSON Schema
169+
170+
Config scope classes can be converted into a schema with the `SchemaNode` class, which uses reflection to extract metadata such as scope names, option names, types, and descriptions. This schema is rendered to JSON and used by the language server at build-time to provide code intelligence such as code completion and hover hints.
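
To give a sense of what reflection-based extraction can look like, the following stand-alone sketch collects the names and types of annotated fields. It is not the `SchemaNode` implementation: the `describeOptions` helper and the `HelloConfig` class are hypothetical, and `ConfigOption` is a minimal stand-in for the real annotation in `nextflow.config.schema`.

```groovy
import java.lang.annotation.Retention
import java.lang.annotation.RetentionPolicy

// Stand-in for nextflow.config.schema.ConfigOption, retained at runtime for reflection
@Retention(RetentionPolicy.RUNTIME)
@interface ConfigOption {}

// Hypothetical scope class with two options and one field that is not an option
class HelloConfig {

    @ConfigOption
    String createMessage

    @ConfigOption
    boolean verbose

    String internalState
}

// Hypothetical helper: map each @ConfigOption field name to its type name
Map<String, String> describeOptions(Class scope) {
    def annotated = scope.declaredFields.findAll { field -> field.isAnnotationPresent(ConfigOption) }
    return annotated.collectEntries { field -> [(field.name): field.type.simpleName] }
}

def options = describeOptions(HelloConfig)
```

Fields without the annotation, such as `internalState` in this sketch, are left out of the result, which mirrors the rule that only annotated fields are treated as config options.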
171+
172+
### Documentation
173+
174+
The schema described above can also be rendered to Markdown using the `MarkdownRenderer` class. It produces a Markdown document approximating the {ref}`config-options` page.
175+
176+
This approach to docs generation is not yet complete and has not been incorporated into the build process. However, it can be used to check for discrepancies between the source code and docs when making changes. The documentation should match the `@Description` annotations as closely as possible, but may contain additional details such as version notes and extra paragraphs.

docs/developer/plugins.md

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -133,6 +133,9 @@ import nextflow.script.dsl.Description
133133
''')
134134
class MyPluginConfig implements ConfigScope {
135135
136+
/* required by extension point -- do not remove */
137+
MyPluginConfig() {}
138+
136139
MyPluginConfig(Map opts) {
137140
this.createMessage = opts.createMessage
138141
}
@@ -143,7 +146,7 @@ class MyPluginConfig implements ConfigScope {
143146
}
144147
```
145148

146-
While this approach is not required to support plugin config options, it allows Nextflow to recognize plugin definitions when validating the config.
149+
While this approach is not required to support plugin config options, it allows Nextflow to recognize plugin definitions when validating the config. See {ref}`config-scopes-page` for more information.
147150

148151
### Executors
149152

docs/google.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,8 @@
22

33
# Google Cloud
44

5+
(google-credentials)=
6+
57
## Credentials
68

79
Credentials for submitting requests to the Google Cloud Batch API are picked up from your environment using [Application Default Credentials](https://github.com/googleapis/google-auth-library-java#google-auth-library-oauth2-http). Application Default Credentials are designed to use the credentials most natural to the environment in which a tool runs.

docs/index.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -152,6 +152,7 @@ migrations/index
152152
153153
developer/index
154154
developer/diagram
155+
developer/config-scopes
155156
developer/packages
156157
developer/plugins
157158
```

docs/notifications.md

Lines changed: 26 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -220,17 +220,26 @@ mail {
220220
}
221221
```
222222

223+
:::{note}
224+
Some versions of Java (e.g. Java 11 Corretto) do not default to TLS v1.2, and as a result may have issues with third-party integrations that enforce TLS v1.2 (e.g. Azure Active Directory OIDC). This problem can be addressed by setting the following config option:
225+
226+
```groovy
227+
mail {
228+
smtp.ssl.protocols = 'TLSv1.2'
229+
}
230+
```
231+
:::
232+
223233
See the {ref}`mail scope <config-mail>` section to learn more about the mail server configuration options.
224234

225235
### AWS SES configuration
226236

227237
:::{versionadded} 23.06.0-edge
228238
:::
229239

230-
Nextflow supports [AWS SES](https://aws.amazon.com/ses/) native API as an alternative
231-
provider to send emails in place of SMTP server.
240+
Nextflow supports the [AWS Simple Email Service](https://aws.amazon.com/ses/) API as an alternative provider to send emails in place of an SMTP server.
232241

233-
To enable this feature add the following environment variable in the launching environment:
242+
To enable this feature, set the following environment variable in the launch environment:
234243

235244
```bash
236245
export NXF_ENABLE_AWS_SES=true
@@ -242,6 +251,20 @@ Make also sure to add the following AWS IAM permission to the AWS user (or role)
242251
ses:SendRawEmail
243252
```
244253

254+
The following snippet shows how to configure Nextflow to send emails through SES:
255+
256+
```groovy
257+
mail {
258+
smtp.host = 'email-smtp.us-east-1.amazonaws.com'
259+
smtp.port = 587
260+
smtp.user = '<Your AWS SES access key>'
261+
smtp.password = '<Your AWS SES secret key>'
262+
smtp.auth = true
263+
smtp.starttls.enable = true
264+
smtp.starttls.required = true
265+
}
266+
```
267+
245268
## Mail notification
246269

247270
You can use the `sendMail` function with a {ref}`workflow completion handler <metadata-completion-handler>` to send a notification when a workflow completes. For example:
