Open
Labels
Status: In Progress. Has been assigned and is being worked on.
Description
Dear all,
I ran:
chewBBACA.py AlleleCall -i ./data/raw/fasta -g ./gene_schema -o ./data/processed/gene --cpu 14
on 6285 Klebsiella genome assemblies,
and I got this:
CDS prediction
================
Predicting CDSs for 6285 inputs...
[===== ] 28% 27%
Error on predict_genome_genes:
Traceback (most recent call last):
File "/home/user/anaconda3/envs/chewie/lib/python3.11/site-packages/CHEWBBACA/utils/multiprocessing_operations.py", line 42, in function_helper
results = input_args[-1](*input_args[0:-1])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/anaconda3/envs/chewie/lib/python3.11/site-packages/CHEWBBACA/utils/gene_prediction.py", line 217, in predict_genome_genes
current_gene_finder = train_gene_finder(current_gene_finder,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/anaconda3/envs/chewie/lib/python3.11/site-packages/CHEWBBACA/utils/gene_prediction.py", line 75, in train_gene_finder
gene_finder.train(*sequences, translation_table=translation_table)
File "pyrodigal/lib.pyx", line 5528, in pyrodigal.lib.GeneFinder.train
ValueError: sequence must be at least 20000 characters (17046 found)
Should I pre-process the FASTA files and filter out contigs shorter than 20000 characters?
Best,
Alex
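Before filtering anything, it may help to find out which input files are actually below the 20000-character threshold named in the ValueError. The sketch below is only a diagnostic aid and is not part of chewBBACA or pyrodigal. It assumes uncompressed FASTA files with the extensions listed in `FASTA_SUFFIXES`; the helper names `contig_lengths` and `find_short_inputs` are hypothetical. It reports both the total length and the longest contig per file, because the message alone does not say whether the limit applies to individual contigs or to the combined sequence of an input. The total computed here may also differ slightly from the number in the error if the gene finder adds anything when combining sequences.

```python
from pathlib import Path

# Threshold copied from the ValueError message in the traceback above.
MIN_TRAINING_LENGTH = 20000
# Assumption: adjust this set to match the extensions of your input files.
FASTA_SUFFIXES = {".fasta", ".fa", ".fna"}


def contig_lengths(fasta_path):
    """Return the sequence length of each record in an uncompressed FASTA file."""
    lengths = []
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                lengths.append(0)  # a header line starts a new record
            elif line and lengths:
                lengths[-1] += len(line)  # sequence line of the current record
    return lengths


def find_short_inputs(input_dir, min_length=MIN_TRAINING_LENGTH):
    """Return (file name, total length, longest contig) for each FASTA file
    whose total sequence length is below min_length."""
    short = []
    for path in sorted(Path(input_dir).iterdir()):
        if path.suffix.lower() not in FASTA_SUFFIXES:
            continue
        lengths = contig_lengths(path)
        total = sum(lengths)
        if total < min_length:
            short.append((path.name, total, max(lengths, default=0)))
    return short
```

Calling `find_short_inputs("./data/raw/fasta")` on the input directory from the command above would list the candidate files. An assembly totalling only about 17 kb is far smaller than a typical Klebsiella genome, so any file flagged here is probably worth inspecting before deciding whether to exclude it.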