-
Notifications
You must be signed in to change notification settings - Fork 95
Expand file tree
/
Copy pathgi2taxonomy.xml
More file actions
105 lines (80 loc) · 4.04 KB
/
gi2taxonomy.xml
File metadata and controls
105 lines (80 loc) · 4.04 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
<tool id="Fetch Taxonomic Ranks" name="Fetch taxonomic representation" version="1.1.1">
<description></description>
<requirements>
<requirement type="package" version="1.0.0">taxonomy</requirement>
</requirements>
<xrefs>
<xref type="bio.tools">gi2taxonomy</xref>
</xrefs>
<command interpreter="python">gi2taxonomy.py $input $giField $idField $out_file1 ${GALAXY_DATA_INDEX_DIR}</command>
<inputs>
<param format="tabular" name="input" type="data" label="Show taxonomic representation for"></param>
<param name="giField" label="GIs column" type="data_column" data_ref="input" numerical="True" help="select column containing GI numbers"/>
<param name="idField" label="Name column" type="data_column" data_ref="input" help="select column containing identifiers you want to include into output"/>
</inputs>
<outputs>
<data format="taxonomy" name="out_file1" />
</outputs>
<requirements>
<requirement type="binary">taxBuilder</requirement>
</requirements>
<tests>
<test>
<param name="input" ftype="tabular" value="taxonomy2gi-input.tabular"/>
<param name="giField" value="1"/>
<param name="idField" value="2"/>
<output name="out_file1" file="taxonomy2gi-output.tabular"/>
</test>
</tests>
<help>
.. class:: infomark
Use *Filter and Sort->Filter* to restrict output of this tool to desired taxonomic ranks. You can also use *Text Manipulation->Cut* to remove unwanted columns from the output.
------
**What it does**
Fetches taxonomic information for a list of GI numbers (sequences identifiers used by the National Center for Biotechnology Information http://www.ncbi.nlm.nih.gov).
-------
**Example**
Suppose you have BLAST output that looks like this::
+-----------------------+----------+----------+-----------------+------------+------+--------+
| queryId | targetGI | identity | alignmentLength | mismatches | gaps | score |
+-----------------------+----------+----------+-----------------+------------+------+--------+
| 1L_EYKX4VC01BXWX1_265 | 1430919 | 90.09 | 212 | 15 | 6 | 252.00 |
+-----------------------+----------+----------+-----------------+------------+------+--------+
and you want to obtain full taxonomic representation for GIs listed in *targetGI* column. If you set parameters as shown here:
.. image:: ${static_path}/images/fetchTax.png
the tool will generate the following output (you may need to scroll sideways to see the entire line)::
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
1L_EYKX4VC01BXWX1_265 9606 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Primates Haplorrhini Hominoidea Hominidae n n n Homo n Homo sapiens n 1430919
In other words the tool printed *Name column*, *taxonomy Id*, appended 22 columns containing taxonomic ranks from Superkingdom to Subspecies and added *GI* as the last column. Below is a formal definition of the output columns::
Column Definition
------- ------------------------------------------
1 Name (specified by 'Name column' dropdown)
2 GI (specified by 'GI column' dropdown)
3 root
4 superkingdom
5 kingdom
6 subkingdom
7 superphylum
8 phylum
9 subphylum
10 superclass
11 class
12 subclass
13 superorder
14 order
15 suborder
16 superfamily
17 family
18 subfamily
19 tribe
20 subtribe
21 genus
22 subgenus
23 species
24 subspecies
------
.. class:: warningmark
**Why do I have these "n" things?**
Be aware that the NCBI taxonomy (ftp://ftp.ncbi.nih.gov/pub/taxonomy/) this tool relies upon is incomplete. This means that for many species one or more ranks are absent and represented as "**n**". In the above example *subkingdom*, *superphylum* etc. are missing.
</help>
</tool>