georgiibondarev/batch_reannotation_with_prokka

🧬 A simple Prokka batch annotation wrapper. It takes paired .fna and .gbk files as input, extracts taxonomy information from each .gbk file with Biopython, and passes it to Prokka so that each genome is annotated with the correct taxonomy. After annotation, the script reports summary statistics.
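The taxonomy hand-off described above could look roughly like the sketch below. It is an illustration, not the code in prokka_batch_annotator.py: the function names `read_taxonomy` and `build_prokka_command` are hypothetical, and it assumes the organism name sits in the GenBank record annotations and the strain in the `source` feature. The Prokka options used (`--outdir`, `--prefix`, `--kingdom`, `--genus`, `--species`, `--strain`, `--cpus`) are standard Prokka flags.

```python
from pathlib import Path


def read_taxonomy(gbk_path):
    """Return (genus, species, strain) from the first record of a GenBank file.

    Hypothetical helper: assumes the organism is given as "Genus species ...".
    """
    from Bio import SeqIO  # Biopython; imported lazily so the rest runs without it

    record = next(SeqIO.parse(str(gbk_path), "genbank"))
    parts = record.annotations.get("organism", "").split()
    genus = parts[0] if parts else None
    species = parts[1] if len(parts) > 1 else None

    strain = None
    for feature in record.features:
        if feature.type == "source":
            strain = feature.qualifiers.get("strain", [None])[0]
            break
    return genus, species, strain


def build_prokka_command(fna_path, out_dir, threads, kingdom,
                         genus=None, species=None, strain=None):
    """Assemble a Prokka command line as a list suitable for subprocess.run."""
    prefix = Path(fna_path).stem  # e.g. "genome1" for genome1.fna
    cmd = [
        "prokka",
        "--outdir", str(Path(out_dir) / prefix),
        "--prefix", prefix,
        "--kingdom", kingdom.capitalize(),
        "--cpus", str(threads),
    ]
    # Only pass taxonomy fields that were actually found in the .gbk file.
    if genus:
        cmd += ["--genus", genus]
    if species:
        cmd += ["--species", species]
    if strain:
        cmd += ["--strain", strain]
    cmd.append(str(fna_path))
    return cmd
```

Building the command as a list rather than a single string avoids shell quoting problems when strain names contain spaces.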

It is useful for batch re-annotation of microbial genomes retrieved from a database, so that all genomes share a uniform annotation.

Minimal dependencies

  • Python 3.6+ (with Biopython)
  • Prokka (with all its sub-dependencies)

Citations

  • Cock, P. J., Antao, T., Chang, J. T., Chapman, B. A., Cox, C. J., Dalke, A., … others. (2009). Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics, 25(11), 1422–1423.
  • Seemann, T. (2014). Prokka: rapid prokaryotic genome annotation. Bioinformatics, 30(14), 2068–2069. PMID: 24642063. DOI: 10.1093/bioinformatics/btu153.

Usage

First, prepare your data: place the files for every genome in one directory, with matching base names for each pair (genome1.fna + genome1.gbk, genome2.fna + genome2.gbk, etc.).
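Before a long run it can help to check that every .fna file has its .gbk partner. The following is a minimal sketch of such a check, assuming pairs are matched by base name as in the example above; `find_genome_pairs` is a hypothetical helper, not part of the script.

```python
from pathlib import Path


def find_genome_pairs(input_dir):
    """Return (pairs, unpaired) for the .fna files found in input_dir.

    pairs    -- list of (fna_path, gbk_path) tuples with a shared base name
    unpaired -- list of .fna paths that have no matching .gbk file
    """
    input_dir = Path(input_dir)
    pairs, unpaired = [], []
    for fna in sorted(input_dir.glob("*.fna")):
        gbk = fna.with_suffix(".gbk")
        if gbk.is_file():
            pairs.append((fna, gbk))
        else:
            unpaired.append(fna)
    return pairs, unpaired


if __name__ == "__main__":
    pairs, unpaired = find_genome_pairs("./genomes")
    print(f"{len(pairs)} complete pairs, {len(unpaired)} .fna files without a .gbk")
```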

To run the script:

python prokka_batch_annotator.py <input_dir> <output_dir> <threads> <kingdom>

Example:

python prokka_batch_annotator.py ./genomes ./prokka_results 8 bacteria

You can override taxonomy, for example:

python prokka_batch_annotator.py ./genomes ./output 4 bacteria \
    --genus Arenibacter \
    --species latericius
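One way such a command line could be parsed is sketched below with `argparse`. This shows the override idea only, under the assumption that a value given on the command line takes precedence over the value read from the .gbk file; the script's actual parsing may differ, and `build_parser` and `resolve_taxonomy` are hypothetical names.

```python
import argparse


def build_parser():
    """Parser mirroring the usage line: four positionals plus optional overrides."""
    parser = argparse.ArgumentParser(prog="prokka_batch_annotator.py")
    parser.add_argument("input_dir", help="directory with .fna/.gbk pairs")
    parser.add_argument("output_dir", help="directory for Prokka results")
    parser.add_argument("threads", type=int, help="CPU threads for Prokka")
    parser.add_argument("kingdom", help="annotation mode, e.g. bacteria")
    parser.add_argument("--genus", default=None, help="override genus")
    parser.add_argument("--species", default=None, help="override species")
    return parser


def resolve_taxonomy(args, gbk_genus, gbk_species):
    """Prefer command-line overrides; fall back to values from the .gbk file."""
    return args.genus or gbk_genus, args.species or gbk_species


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(resolve_taxonomy(args, gbk_genus=None, gbk_species=None))
```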

  • Original code: @georgiibondarev
  • Code comments formatted with assistance from Claude Sonnet 4.5 (Anthropic).
  • Code style: black
