@@ -63,7 +63,6 @@ Example configuration for SARS-CoV-2:
 ```json
 {
   "qc": {
-    "schemaVersion": "1.2.0",
     "privateMutations": {
       "enabled": true,
       "typical": 8,
@@ -131,8 +130,10 @@ Example:
 
 ```json
 {
-  "cli": "3.0.0",
-  "web": "3.0.0"
+  "compatibility": {
+    "cli": "3.0.0",
+    "web": "3.0.0"
+  }
 }
 ```
 
@@ -142,7 +143,7 @@ Optional `str`. The default gene/CDS to be shown in Nextclade web. If not provid
 
 #### `cdsOrderPreference`
 
-Optional `array[str]`. Order in which genes are shown in Nextclade web dropdown. Example value ["S", "ORF1a", "N", "E"]
+Optional `array[str]`. Order in which genes are shown in Nextclade web dropdown. Example value `["S", "ORF1a", "N", "E"]`
 
 #### `generalParams`
 
@@ -158,25 +159,114 @@ Optional `dict`. Parameters for the alignment algorithm. These are identical to
 
 #### `treeBuilderParams`
 
-Optional `dict`. Parameters for the tree building algorithm. These are identical to the corresponding CLI arguments (though here _camelCase_ needs to be used. If not provided, default values are used.
+Optional `dict`. Parameters for the tree building algorithm. These are identical to the corresponding CLI arguments (though here _camelCase_ needs to be used). If not provided, default values are used.
 
 - `withoutGreedyTreeBuilder`: If you don't want to use the greedy tree builder, set this to `true`. Default: `false`.
 - `maskedMutsWeight`: Parsimony weight for masked mutations. Default: `0.05`.
 
-#### `primers`
 
-TODO
+#### Calculate phenotypic scores from mutations (`phenotypeData`)
 
-#### `phenotypeData`
+Nextclade can calculate numerical scores derived from mutations in a query sequence relative to the reference sequence.
+Such scores could, for example, be used to calculate predicted ACE2 binding for SARS-CoV-2, immune escape estimates, or potential drug resistance. To specify such numerical scores, the field `phenotypeData` needs to be added to the `pathogen.json`.
 
-TODO
+Each such score is based on exactly one CDS, and each amino acid mutation can be assigned a specific contribution to the score.
+In addition, a "default" value can be specified for amino acid mutations that are not explicitly listed.
+```json
+{
+  "phenotypeData": [
+    {
+      "aaRange": {
+        "begin": 330,
+        "end": 531
+      },
+      "description": "Estimated ACE2 binding",
+      "cds": "S",
+      "ignore": {
+        "clades": ["outgroup"]
+      },
+      "name": "ace2_binding",
+      "nameFriendly": "ACE2 binding",
+      "data": [
+        {
+          "name": "binding",
+          "weight": 1.0,
+          "locations": {
+            "330": {
+              "default": 0.1,
+              "A": -0.08339,
+              "C": -0.61624,
+              "D": -0.1467,
+              "E": -0.14146,
+              ...
+            },
+            "331": {}
+            ...
+          }
+        }
+      ]
+    }
+  ]
+}
+```
+If the score is only relevant for specific clades, you can list the clades to be ignored in the `clades` list of the `ignore` field.
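
To make the per-position lookup concrete, the following Python sketch computes a score from a single `phenotypeData` entry. It only illustrates how explicitly listed amino acids and the "default" value could be resolved. The function name `phenotype_score`, the half-open treatment of `aaRange`, the assumption that mutation positions use the same coordinates as the `locations` keys, and the plain weighted sum are all assumptions made for illustration; the formula Nextclade actually applies may differ.

```python
from typing import Mapping, Sequence, Tuple

# Trimmed copy of the example entry above; the elided ("...") values are left out.
example_entry = {
    "aaRange": {"begin": 330, "end": 531},
    "cds": "S",
    "name": "ace2_binding",
    "data": [
        {
            "name": "binding",
            "weight": 1.0,
            "locations": {
                "330": {"default": 0.1, "A": -0.08339, "C": -0.61624},
                "331": {},
            },
        }
    ],
}


def phenotype_score(entry: Mapping, aa_substitutions: Sequence[Tuple[int, str]]) -> float:
    """Hypothetical scoring: weighted sum of per-position coefficients.

    `aa_substitutions` holds (position, new_amino_acid) pairs for the entry's CDS.
    """
    begin = entry["aaRange"]["begin"]
    end = entry["aaRange"]["end"]
    total = 0.0
    for component in entry["data"]:
        subtotal = 0.0
        for pos, aa in aa_substitutions:
            if not (begin <= pos < end):  # assumption: half-open range
                continue
            coeffs = component["locations"].get(str(pos))
            if not coeffs:
                continue  # position not listed, or listed without coefficients
            # Use the explicit coefficient if present, otherwise the "default" value.
            subtotal += coeffs.get(aa, coeffs.get("default", 0.0))
        total += component["weight"] * subtotal
    return total
```

With this sketch, a substitution to `A` at position 330 contributes the listed coefficient, while a substitution to an unlisted amino acid at the same position falls back to the "default" value.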
+
+#### Amino acid motif detection (`aaMotifs`)
+
+Nextclade can detect and report specific motifs in translated amino acid sequences. This feature is currently being used to highlight changes in glycosylation or cleavage sites, but the feature itself is generic.
+To use this feature, you need to add an `aaMotifs` field to the `pathogen.json`.
 
-#### `aaMotifs`
+Amino acid motifs are specified using regular expressions. The parts of the genome in which Nextclade searches for the motifs are specified by listing the CDSs and (optionally) ranges within these CDSs (e.g. to restrict the search to the exposed part of a protein).
+An example of a full configuration (for glycosylation in influenza HA) is shown below.
+```json
+"aaMotifs": [
+  {
+    "name": "glycosylation",
+    "nameShort": "Glyc.",
+    "nameFriendly": "Glycosylation",
+    "description": "N-linked glycosylation motifs (N-X-S/T with X any amino acid other than P)",
+    "includeCdses": [
+      {
+        "cds":"HA1",
+        "ranges":[]
+      },
+      {
+        "cds":"HA2",
+        "ranges":[{"begin":0, "end":186}]
+      }
+    ],
+    "motifs": [
+      "N[^P][ST]"
+    ]
+  }
+]
+```
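
The search described above can be sketched in a few lines of Python using the standard `re` module. This is an illustration only, not Nextclade's implementation: the function name `find_aa_motifs`, the `translations` mapping (CDS name to amino acid sequence), the zero-based half-open reading of `begin`/`end`, and the treatment of an empty `ranges` list as "search the whole CDS" are assumptions. Matches are non-overlapping, as returned by `re.finditer`.

```python
import re
from typing import List, Mapping, Tuple


def find_aa_motifs(motif_cfg: Mapping, translations: Mapping[str, str]) -> List[Tuple[str, int, str]]:
    """Return (cds, zero_based_start, matched_text) for every motif hit."""
    patterns = [re.compile(p) for p in motif_cfg["motifs"]]
    hits = []
    for cds_cfg in motif_cfg["includeCdses"]:
        cds = cds_cfg["cds"]
        seq = translations.get(cds)
        if seq is None:
            continue  # CDS not present in this (hypothetical) translation set
        # Assumption: an empty `ranges` list means the whole CDS is searched.
        ranges = cds_cfg.get("ranges") or [{"begin": 0, "end": len(seq)}]
        for r in ranges:
            window = seq[r["begin"]:r["end"]]  # assumption: half-open range
            for pattern in patterns:
                for m in pattern.finditer(window):
                    # Shift the match position back into CDS coordinates.
                    hits.append((cds, r["begin"] + m.start(), m.group()))
    return hits


# Toy sequences made up for this example; they are not real HA sequences.
glyco_cfg = {
    "name": "glycosylation",
    "includeCdses": [
        {"cds": "HA1", "ranges": []},
        {"cds": "HA2", "ranges": [{"begin": 0, "end": 5}]},
    ],
    "motifs": ["N[^P][ST]"],
}
toy_translations = {"HA1": "MNGSANPTNKT", "HA2": "ANKTANGS"}
toy_hits = find_aa_motifs(glyco_cfg, toy_translations)
```

In the toy data, `NPT` in `HA1` is skipped because the middle residue is a proline, and the `NGS` near the end of `HA2` is skipped because it lies outside the configured range.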
+In the web interface, motifs are reported as shown in the screenshot below:
+[Screenshot: amino acid motifs as reported in the Nextclade web interface]
+
+#### Labelling mutations of interest (`mutLabels`)
+
+Nextclade can highlight specific mutations to the user, for example mutations that are indicative of contamination, drug resistance, or otherwise of particular interest.
+To do so, you can specify mutations as "labeled" using the `mutLabels` field in the `pathogen.json`.
+Labeled mutations are only searched among the "private" mutations, i.e. mutations in query sequences that are not found in the part of the reference tree the query sequence attaches to.
+
+The JSON specification looks as follows:
+```json
+{
+  "mutLabels": {
+    "nucMutLabelMap": {
+      "174T": ["20H", ...],
+      "204T": ["20E"],
+      ...
+    }
+  }
+}
+```
+Labeled "private" mutations are shown in the tooltip of the mutation column when mutations "relative to parent" (private mutations) are displayed, and they are exported to the tabular output.
 
-TODO
+TODO: add amino acid mutations once released.
 
-#### `mutLabels`
+> ⚠️ Note that the specification of these mutations breaks with the convention of zero-indexing. Instead, these labeled mutations are one-indexed and directly correspond to the mutations displayed in the UI or in the tables.
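
The one-indexed key format can be illustrated with a short Python sketch that looks up labels for a list of private nucleotide substitutions. The helper names `to_label_key` and `label_private_mutations`, and the representation of a substitution as a `(zero_based_position, query_nucleotide)` pair, are hypothetical and chosen only for this example; the map below reuses the example keys with the elided entries left out.

```python
from typing import Dict, Iterable, List, Mapping, Tuple


def to_label_key(pos_zero_based: int, nuc: str) -> str:
    """Build a `nucMutLabelMap` key: one-based position followed by the query nucleotide."""
    return f"{pos_zero_based + 1}{nuc}"


def label_private_mutations(
    nuc_mut_label_map: Mapping[str, List[str]],
    private_substitutions: Iterable[Tuple[int, str]],
) -> Dict[str, List[str]]:
    """Return the labels of every private substitution that appears in the map."""
    labeled = {}
    for pos_zero_based, nuc in private_substitutions:
        key = to_label_key(pos_zero_based, nuc)
        if key in nuc_mut_label_map:
            labeled[key] = list(nuc_mut_label_map[key])
    return labeled


example_map = {"174T": ["20H"], "204T": ["20E"]}
# Zero-based position 173 corresponds to the one-based key "174T".
example_labels = label_private_mutations(example_map, [(173, "T"), (500, "A")])
```

Only the first substitution is labeled here; the second has no entry in the map and is left out of the result.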
 
-TODO
 
 > 💡 Nextclade CLI supports file compression and reading from standard input. See section [Compression, stdin](./compression.md) for more details.