|`oma_source`| Selection of OMA data source. Can be either 'FastOMA' or 'Production'. The selection requires setting either the parameters for FastOMA or Production. |`string`| FastOMA ||
|`oma_version`| Version of the OMA Browser instance. It defaults to 'All.<Mon><YEAR>' |`string`||||
| `oma_release_char` | Release-specific character (used in HOG IDs) <details><summary>Help</summary><small>A single capital letter [A-Z] which makes the HOG-IDs unique across different releases.</small></details>| `string` | | | |
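
The constraints on `oma_source` and `oma_release_char` can be checked before a run is launched. The following is a minimal Python sketch, not part of the pipeline: the function name `check_general_params` and the constant `ALLOWED_SOURCES` are hypothetical, and treating `oma_release_char` as optional is an assumption based on the empty default in the table.

```python
import re

# The two values named in the parameter table.
ALLOWED_SOURCES = {"FastOMA", "Production"}


def check_general_params(params: dict) -> list[str]:
    """Return a list of problems found; the list is empty if none were found."""
    problems = []

    # 'FastOMA' is the documented default for oma_source.
    source = params.get("oma_source", "FastOMA")
    if source not in ALLOWED_SOURCES:
        problems.append(
            f"oma_source must be 'FastOMA' or 'Production', got {source!r}"
        )

    # Assumption: oma_release_char may be left unset. When it is set,
    # it must be exactly one capital letter in the range A-Z.
    release_char = params.get("oma_release_char")
    if release_char is not None:
        if not isinstance(release_char, str) or not re.fullmatch(r"[A-Z]", release_char):
            problems.append(
                f"oma_release_char must be a single capital letter, got {release_char!r}"
            )

    return problems
```

Calling `check_general_params({"oma_source": "Production", "oma_release_char": "B"})` returns an empty list, while a lowercase or multi-letter release character yields one entry describing the problem.
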
### FastOMA Input data
@@ -109,43 +112,95 @@ Input files generated from an OMA Production run
|`matrix_file`| OMA Groups file |`string`|||
|`hog_orthoxml`| Hierarchical orthologous groups (HOGs) in orthoxml format |`string`|| True |
| `infer_domains` | Flag indicating whether domains are inferred using the CATH/Gene3D pipeline. <details><summary>Help</summary><small>If set to true, the pipeline will run the CATH/Gene3D pipeline to infer domain assignments. This will require a substantial amount of compute time. The set of already known domains (see parameter 'known_domains') will be used to skip the inference of domains that are already known. If set to false, the pipeline will use the known domain assignments provided in the 'known_domains' parameter.</small></details>| `boolean` | | | |
| `known_domains` | Folder containing known domain assignment files. <details><summary>Help</summary><small>The folder must contain CSV/TSV files with three columns (md5hash of sequence, CATH-domain-id, region on sequence). The output of a previous run of this pipeline can thus be used as
|`xref_uniprot_swissprot`| UniProtKB/SwissProt annotation in text format |`string`|https://ftp.ebi.ac.uk/pub/databases/uniprot/knowledgebase/uniprot_sprot.dat.gz||
-
|`xref_uniprot_trembl`| UniProtKB/TrEMBL annotations in text format |`string`| /dev/null ||
| `xref_uniprot_trembl` | UniProtKB/TrEMBL annotations in text format. <details><summary>Help</summary><small>If not provided, no TrEMBL cross-references will be included. The generic FTP URL for TrEMBL is
| `taxonomy_sqlite_path` | Path to a SQLite database containing the combined NCBI/GTDB taxonomy data. <details><summary>Help</summary><small>If not provided, it will be generated automatically and cached.</small></details>| `string` | | | |
| `xref_refseq` | 'download' or folder containing RefSeq gbff files. <details><summary>Help</summary><small>If not specified, no RefSeq cross-references will be downloaded (default). If set to 'download', the latest RefSeq gbff files will be downloaded from the NCBI FTP server. Alternatively, a folder containing local *.gbff.gz files can be provided.</small></details>| `string` | | | |
|`go_gaf`| Gene Ontology annotations (GAF format). This can be the GOA database or a glob pattern matching local files in GAF format. |`string`|https://ftp.ebi.ac.uk/pub/databases/GO/goa/UNIPROT/goa_uniprot_all.gaf.gz||
| `omamer_levels` | Comma-separated list of taxonomic levels for which OMAmer databases should be built. <details><summary>Help</summary><small>The input string is parsed as a comma-separated list, e.g. the parameter value 'Mammalia,Primates' would build two OMAmer databases, one for Mammalia and one for Primates. Note that the taxonomic levels must exist in the input species tree.</small></details>| `string` | | | |
| `rdf_export` | Flag to activate export as RDF triples <details><summary>Help</summary><small>Activating rdf_export will enable the dump of RDF ttl files which can be imported into a SPARQL endpoint.</small></details>| `boolean` | | | |
|`rdf_orthOntology`| User-provided orthOntology file. If not provided, the default ontology will be used. |`string`||||
|`rdf_prefixes`| User-provided RDF prefix mapping. If not provided, the default prefixes will be used. |`string`||||
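
The parsing of `omamer_levels` described in the table above can be illustrated with a short Python sketch. It is not the pipeline's own code: the function name `parse_omamer_levels` and the `tree_levels` argument are hypothetical, and stripping whitespace around each name is an assumption the parameter description does not state.

```python
def parse_omamer_levels(value: str, tree_levels: set[str]) -> list[str]:
    """Split a comma-separated omamer_levels string into taxonomic level names.

    tree_levels stands in for the set of level names found in the input
    species tree (hypothetical; obtaining it is outside this sketch).
    """
    # Assumption: surrounding whitespace and empty entries are ignored.
    levels = [name.strip() for name in value.split(",") if name.strip()]

    # Every requested level must exist in the input species tree.
    missing = [name for name in levels if name not in tree_levels]
    if missing:
        raise ValueError(f"levels not found in species tree: {', '.join(missing)}")

    return levels
```

With a species tree that contains both levels, `parse_omamer_levels("Mammalia,Primates", {"Mammalia", "Primates"})` returns two names, one per OMAmer database to build.
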
### Production OMA output settings
Parameters concerning additional output files usually needed for the production OMA Browser instance
| `oma_dumps` | Flag to activate dumping various files for the download section <details><summary>Help</summary><small>Activating oma_dumps will enable dumping species, sequences and GO annotation files as text files for the download section.</small></details>| `boolean` | | | |
### Generic options
Less common options for the pipeline, typically set in a config file.