Commit 9275472

add Psortb (#1735)
* add Psortb
* Update tools/psortb/psortb.xml
  Co-authored-by: Björn Grüning <bjoern@gruenings.eu>
* Update tools/psortb/.shed.yml
  Co-authored-by: Björn Grüning <bjoern@gruenings.eu>
* Update remote repository URL for psortb tool
* Update remote repository URL to use 'master' branch

---------

Co-authored-by: Björn Grüning <bjoern@gruenings.eu>
1 parent 749acab commit 9275472

6 files changed: +158 -0 lines changed

tools/psortb/.shed.yml

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
name: psortb
owner: bgruening
description: PSORTb — protein subcellular localization prediction for prokaryotes
long_description: |
  Galaxy wrapper for PSORTb, a tool that predicts the subcellular
  localization of bacterial and archaeal proteins. The tool supports
  Gram-positive, Gram-negative, and Archaea models.

homepage_url: https://psort.org/
remote_repository_url: https://github.com/bgruening/galaxytools/tree/master/tools/psortb
type: unrestricted
categories:
  - Proteomics
  - Sequence Analysis

tools/psortb/psortb.xml

Lines changed: 115 additions & 0 deletions
@@ -0,0 +1,115 @@
<tool id="psortb" name="PSORTb" version="3.0.6" profile="24.0">
    <description>Protein subcellular localization prediction for prokaryotes</description>
    <xrefs>
        <xref type="bio.tools">psortb</xref>
    </xrefs>
    <requirements>
        <container type="docker">quay.io/galaxy/psortb-cli:3.0.6</container>
    </requirements>
    <command detect_errors="exit_code">
        <![CDATA[
        mkdir -p "\$TMPDIR/results" &&
        /usr/local/psortb/bin/psort
        $gram.gram_choice
        #if $cutoff
            -c $cutoff
        #end if
        #if $divergent
            -d $divergent
        #end if
        -f fasta
        $exact
        -o $output_format
        -i '$input_fasta'
        &&
        mv "\$TMPDIR"/results/*_psortb_*.txt '$output'
        ]]>
    </command>

    <inputs>
        <param name="input_fasta" type="data" format="fasta" label="Protein sequences (FASTA)" help="Submit protein sequences in FASTA format."/>

        <section name="gram" title="Organism classification" expanded="true">
            <param name="gram_choice" type="select" label="Organism type">
                <option value="--positive" selected="true">Gram-positive (Bacteria)</option>
                <option value="--negative">Gram-negative (Bacteria)</option>
                <option value="--archaea">Archaea</option>
            </param>
        </section>

        <param name="output_format" type="select" label="Output format" help="Choose PSORTb output format.">
            <option value="normal">Normal (human-readable)</option>
            <option value="terse">3-column (terse)</option>
            <option value="long" selected="true">30-column (long)</option>
        </param>
        <param argument="--exact" type="boolean" truevalue="--exact" falsevalue="" checked="false" label="Skip SCL-BLASTe" help="Useful for batch runs of data against itself in SCL-BLAST."/>
        <param argument="--cutoff" type="float" optional="true" label="Prediction cutoff" help="Sets a cutoff value for reported results (default: 7.5 used internally)."/>
        <param argument="--divergent" type="float" optional="true" label="Multiple localization cutoff" help="Sets a cutoff for flagging potential multiple localization sites."/>
    </inputs>

    <outputs>
        <data name="output" format="txt" label="PSORTb results on ${on_string}"/>
    </outputs>

    <tests>
        <test expect_num_outputs="1">
            <param name="input_fasta" value="psortb_pos.fa"/>
            <param name="gram|gram_choice" value="--positive"/>
            <param name="output_format" value="normal"/>
            <output name="output" value="psortb_pos_output.txt"/>
        </test>
        <test expect_num_outputs="1">
            <param name="input_fasta" value="psortb_neg.fa"/>
            <param name="gram|gram_choice" value="--negative"/>
            <param name="output_format" value="terse"/>
            <output name="output">
                <assert_contents>
                    <has_text_matching expression="SeqID\tLocalization\tScore" />
                    <has_text_matching expression="NP_949347\.1 \tUnknown\t7\.0" />
                </assert_contents>
            </output>
        </test>
        <test expect_num_outputs="1">
            <param name="input_fasta" value="psortb_arch.fa"/>
            <param name="gram|gram_choice" value="--archaea"/>
            <param name="output_format" value="long"/>
            <output name="output">
                <assert_contents>
                    <has_text_matching expression="SeqID\s+CMSVM_a_Localization\s+CMSVM_a_Details\s+CWSVM_a_Localization\s+CWSVM_a_Details\s+CytoSVM_a_Localization\s+CytoSVM_a_Details\s+ECSVM_a_Localization\s+ECSVM_a_Details\s+ModHMM_a_Localization\s+ModHMM_a_Details\s+Motif_a_Localization\s+Motif_a_Details\s+Profile_a_Localization\s+Profile_a_Details\s+SCL-BLAST_a_Localization\s+SCL-BLAST_a_Details\s+SCL-BLASTe_a_Localization\s+SCL-BLASTe_a_Details\s+Signal_a_Localization\s+Signal_a_Details\s+Cytoplasmic_Score\s+CytoplasmicMembrane_Score\s+Cellwall_Score\s+Extracellular_Score\s+Final_Localization\s+Final_Localization_Details\s+Final_Score\s+Secondary_Localization\s+PSortb_Version" />
                    <has_text_matching expression="YP_001689002\.1\s+Unknown\s+Unknown\s+Unknown\s+Extracellular\s+Unknown\s+1 internal helix found\s+Unknown\s+No motifs found\s+Unknown\s+No matches to profiles found\s+Extracellular\s+matched 47117675: Flagellin B1 precursor\s+Unknown\s+No matches against database\s+Unknown\s+No signal peptide detected\s+0\.01\s+0\.00\s+0\.02\s+9\.97\s+Extracellular\s+9\.97\s+Flagellar\s+PSORTb version" />
                </assert_contents>
            </output>
        </test>
    </tests>

    <help>
<![CDATA[
PSORTb predicts the subcellular localization of bacterial and archaeal proteins.

Input requirements

- Protein sequences in FASTA format. All sequences in one run should belong to the same organism class.

Options

- Organism type: select Gram-positive (`--positive`), Gram-negative (`--negative`), or Archaea (`--archaea`).
- Output format: `normal` (human-readable), `terse` (tab-delimited), or `long` (tab-delimited with module details).
- Cutoff (`-c`): threshold for final localization assignment (documentation suggests ~7.5).
- Multiple localization cutoff (`-d`): threshold to flag possible multiple localization sites.
- Exact (`--exact`): skip SCL-BLASTe step.

Notes

- PSORTb emphasizes precision; proteins with ambiguous signals may be reported as Unknown.
- Long format includes module outputs and localization scores; terse/long are suitable for bulk processing.

Reference

https://psort.org/documentation/index.html
]]>
    </help>

    <citations>
        <citation type="doi">10.1093/bioinformatics/btq249</citation>
    </citations>
</tool>
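
For readers less familiar with Cheetah templating, the argument assembly in the `<command>` block above can be sketched in plain Python. This is only an illustration under assumptions: `build_psort_args` and its parameter names are hypothetical, Galaxy performs the real substitution itself, and only flags that appear in the wrapper are used. Note that the template tests the cutoffs for truthiness, whereas this sketch tests them against `None`.

```python
from typing import List, Optional


def build_psort_args(
    input_fasta: str,
    gram_choice: str = "--positive",
    output_format: str = "long",
    exact: bool = False,
    cutoff: Optional[float] = None,
    divergent: Optional[float] = None,
) -> List[str]:
    """Roughly mirror the wrapper's <command> template (hypothetical helper)."""
    if gram_choice not in ("--positive", "--negative", "--archaea"):
        raise ValueError(f"unknown organism flag: {gram_choice}")
    if output_format not in ("normal", "terse", "long"):
        raise ValueError(f"unknown output format: {output_format}")

    args = ["/usr/local/psortb/bin/psort", gram_choice]
    # Optional cutoffs are emitted only when a value was supplied.
    if cutoff is not None:
        args += ["-c", str(cutoff)]
    if divergent is not None:
        args += ["-d", str(divergent)]
    args += ["-f", "fasta"]
    # The boolean parameter expands to "--exact" or to nothing.
    if exact:
        args.append("--exact")
    args += ["-o", output_format, "-i", input_fasta]
    return args
```

Passing the result as a list (for example to `subprocess.run`) avoids shell quoting issues with unusual file names, which is the same concern the quoting in a shell command line addresses.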
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
>YP_001689002.1
MFEFITDEDERGQVGIGTLIVFIAMVLVAAIAAGVLINTAGYLQSKGSATGEEASAQVSNRINIVSAYGNVNNEKVDYVNLTVRQAAGADNINLTKSTIQWIGPDRATTLTYSSNSPSSLGENFTTESIKGSSADVLVDQSDRIKVIMYASGVSSNLGAGDEVQLTVTTQYGSKTTYWAQVPESLKDKNA
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
>NP_949347.1
MQGHHFGGDMSNSEAIDNTTAKLRLAQSSSLLALALLIGSAPAQAADTDWGWLAIGAPAATAQGWTGKGVVIGVVDTGIDFSHPALSGRAFDYNYGSFVAGSNHPHATHVAGIIGATDINRGMEGVAPDVRFSSMKIFTGAGGSYLGDAAVADAYDGAIGSGVRIFNNSWGSSDSIANFTSREELLAHEPLLVGAFTRAVNADAVLVWSTGNDGRSQPSWQAAAPYYIQELKANWIAVTSVGENGTIASYANACGVAKAWCLAAPGGDFNPGIYSTIPGKDYGYMSGTSMAAPYVTGATAIARQMFPKASGAQLAQIVLQTSRDIGAPGIDDVYGWGLLAVDNIVDTINPRGAALFASAAWGRFTTLSAIGNTVLDRISDLRNGRGDVVTAPLAFAG
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
>SAK_BPP42
MLKRSLLFLTVLLLLFSFSSITNEVSASSSFDKGKYKKGDDASYFEPTGPYLMVNVTGVDGKRNELLSPRYVEFPIKPGTTLTKEKIEYYVEWALDATAYKEFRVVELDPSAKIEVTYYDKNKKKEETKSFPITEKGFVVPDLSEHIKNPGFNLITKVVIEKK
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
SeqID: SAK_BPP42
Analysis Report:
CMSVM+ Unknown [No details]
CWSVM+ Unknown [No details]
CytoSVM+ Unknown [No details]
ECSVM+ Extracellular [No details]
ModHMM+ Unknown [1 internal helix found]
Motif+ Unknown [No motifs found]
Profile+ Unknown [No matches to profiles found]
SCL-BLAST+ Extracellular [matched 134189: Staphylokinase precursor (Neutral proteinase) (Protease III)]
SCL-BLASTe+ Unknown [No matches against database]
Signal+ Non-Cytoplasmic [Signal peptide detected]
Localization Scores:
Extracellular 9.98
Cellwall 0.02
CytoplasmicMembrane 0.00
Cytoplasmic 0.00
Final Prediction:
Extracellular 9.98

-------------------------------------------------------------------------------
