A deep learning approach that identifies the inducible activity of prophages from their DNA sequences.
System and software requirements:
- Linux
- Python 3.x (any Python version compatible with PyTorch; tested with Python 3.12)
- biopython
- numpy
- pytorch (If you want to enable GPU acceleration, please install the appropriate GPU-enabled PyTorch version from the official PyTorch website.)
1. Download iProphIT-classifier.py and iProphIT_model-v1.pth into your working directory. No other files are needed.
(iProphIT_model-v1.pth is available at https://doi.org/10.5281/zenodo.17605580)
2. Create a conda environment and install required packages:
conda create -n iprophit python=3.12
conda activate iprophit
conda install -c conda-forge biopython numpy
conda install pytorch
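To confirm that the environment is complete before running the classifier, a short check along these lines may help. It is a minimal sketch and not part of iProphIT; the function name `missing_packages` is hypothetical, and the only assumption is that biopython and pytorch are imported under their usual names `Bio` and `torch`.

```python
import importlib.util

# Import names of the packages listed in the requirements.
# Note: biopython is imported as "Bio" and pytorch as "torch".
REQUIRED = ["Bio", "numpy", "torch"]


def missing_packages(names=REQUIRED):
    """Return the import names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run it inside the activated `iprophit` environment; any name it reports still needs to be installed.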
1. Download iProphIT-classifier.py and iProphIT_model-v1.pth and place them in your working directory.
2. Run iProphIT-classifier.py
python iProphIT-classifier.py -i test_iProphIT.fasta -m iProphIT_model-v1.pth -o ./Result.tsv -t 16
usage: iProphIT-classifier.py [-h] -i INPUT [-m MODEL] [-o OUTPUT] [-t THREADS] [-b BATCH_SIZE]
options:
-h, --help show this help message and exit
-i INPUT, --input INPUT
Path to the input FASTA file (required)
-m MODEL, --model MODEL
Path to the trained model file (default: ./iProphIT_model-v1.pth)
-o OUTPUT, --output OUTPUT
Output TSV file path (default: ./Result.tsv)
-t THREADS, --threads THREADS
Number of CPU threads for DataLoader (default: 4)
-b BATCH_SIZE, --batch_size BATCH_SIZE
Batch size for prediction (default: 4). Larger values accelerate inference.
GPU: Increase for speedup until CUDA OOM, then reduce.
CPU: Can use larger values due to more RAM.
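If you want to call the classifier from another Python script (for example, to process several FASTA files in a loop), the options listed above can be assembled programmatically. The following is a sketch only: `build_command` and `run_classifier` are hypothetical helper names, and the defaults simply mirror the ones shown in the usage text.

```python
import subprocess
import sys


def build_command(fasta, model="./iProphIT_model-v1.pth", output="./Result.tsv",
                  threads=4, batch_size=4, script="iProphIT-classifier.py"):
    """Build the argument list for one classifier run using the documented flags."""
    return [
        sys.executable, script,
        "-i", str(fasta),        # input FASTA file (required)
        "-m", str(model),        # trained model file
        "-o", str(output),       # output TSV path
        "-t", str(threads),      # CPU threads for the DataLoader
        "-b", str(batch_size),   # batch size for prediction
    ]


def run_classifier(fasta, **options):
    """Run the classifier and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_command(fasta, **options), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_command("test_iProphIT.fasta", threads=16)))
```

Passing the arguments as a list avoids shell quoting problems when file paths contain spaces.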
The results are written to Result.tsv:
ID Predict Confidence
prophage1 active 0.9981
prophage2 dormant 0.9221
Explanation
1. ID is the content of the description line in the genome file.
2. Predict is the result of identification (active -> inducible prophage, dormant -> non-inducible prophage).
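For downstream analysis, the result file can be read with Python's standard csv module. This is a minimal sketch that assumes the file is tab-separated with a header row named exactly ID, Predict and Confidence, as in the example above; the function names `read_predictions` and `select_active` and the 0.95 cutoff are illustrative choices, not part of iProphIT.

```python
import csv
import io


def read_predictions(handle):
    """Parse a result table (assumed tab-separated with an ID/Predict/Confidence header)."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        {"ID": row["ID"], "Predict": row["Predict"], "Confidence": float(row["Confidence"])}
        for row in reader
    ]


def select_active(rows, min_confidence=0.0):
    """Return IDs predicted as active with confidence at or above the cutoff."""
    return [
        row["ID"] for row in rows
        if row["Predict"] == "active" and row["Confidence"] >= min_confidence
    ]


if __name__ == "__main__":
    # In-memory stand-in for Result.tsv, using the example rows shown above.
    example = io.StringIO(
        "ID\tPredict\tConfidence\n"
        "prophage1\tactive\t0.9981\n"
        "prophage2\tdormant\t0.9221\n"
    )
    print(select_active(read_predictions(example), min_confidence=0.95))
```

To use it on a real run, open the output file instead: `with open("Result.tsv", newline="") as handle: rows = read_predictions(handle)`.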
Genome files: OY731326.1 and OY731419.1
Source: Dahlman S. et al., Nature (2025), https://doi.org/10.1038/s41586-025-09614-7
- Run iProphIT-classifier.py
python iProphIT-classifier.py -i test_iProphIT.fasta -m iProphIT_model-v1.pth -o ./Result.tsv -t 16
- Output of the test
ID Predict Confidence
OY731326.1 active 0.9989
OY731419.1 active 0.9927
- Input can be a genome file in formats such as .fasta, .fa, .fna, etc.
- iProphIT will automatically use the GPU if available, as long as you have installed a PyTorch version with CUDA support.
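To see in advance whether a GPU would be visible to PyTorch in your environment, you can query PyTorch directly. The sketch below is not taken from iProphIT and makes no claim about how the classifier selects its device internally; `describe_device` is a hypothetical helper that only reports what `torch.cuda.is_available()` returns.

```python
def describe_device():
    """Report which device PyTorch could use in the current environment."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return "unavailable"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("PyTorch device:", describe_device())
```

If this prints `cpu` on a machine that has a GPU, the installed PyTorch build most likely lacks CUDA support.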
Hongbo Zhang, Chen Liu, Hanpeng Liao, Fujian Provincial Key Laboratory of Soil Environmental Health and Regulation, College of Resources and Environment, Fujian Agriculture and Forestry University, Fuzhou, 350002, China.
