All fast5 files failed when running call_mods #35

@yxlong-science

Hi, very good software! multi_to_single_fast5, guppy_basecaller, and tombo all run normally, but call_mods reports that every file failed. This happens with both the test data you provided and my own data. Below are the commands I ran on my own data and the corresponding logs. I also set HDF5_PLUGIN_PATH as you suggested, but it made no difference. For the test data, I also tried unzipping the file as described in issue #8, but the same problem still occurs. Thank you for your response.
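Since HDF5_PLUGIN_PATH is mentioned above, a quick way to confirm that the variable is actually visible to the process and points at a directory containing a VBZ plugin might look like the sketch below. It assumes the plugin file name contains "vbz" (for example `libvbz_hdf_plugin.so`); the helper name `find_vbz_plugins` is hypothetical and not part of deepsignal_plant. Running it inside the same environment as call_mods would show what that environment sees.

```python
import os
from pathlib import Path


def find_vbz_plugins(plugin_path):
    """Return names of files containing 'vbz' in each directory of plugin_path.

    plugin_path is the raw HDF5_PLUGIN_PATH value (may be None or empty).
    Assumption: the VBZ plugin file name contains the substring 'vbz'.
    """
    if not plugin_path:
        return []
    found = []
    for entry in plugin_path.split(os.pathsep):
        directory = Path(entry)
        if directory.is_dir():
            found.extend(
                sorted(p.name for p in directory.iterdir() if "vbz" in p.name.lower())
            )
    return found


if __name__ == "__main__":
    value = os.environ.get("HDF5_PLUGIN_PATH")
    print("HDF5_PLUGIN_PATH =", value)
    print("VBZ plugin candidates:", find_vbz_plugins(value))
```

An empty candidate list would suggest that the variable is unset in that environment or points at a directory without the plugin; it does not by itself prove that the plugin is the cause of the failure.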

data splitting (multi_to_single_fast5)

multi_to_single_fast5 -i multi_read_fast5_dir -s fast5s/ -t 10 --recursive

guppy_basecaller

singularity exec /public/home/yxlong/Singularity/guppy-gpu.sif guppy_basecaller -i /public/home/yxlong/Modifications/HW04/fast5s/ -r -s fast5s_guppy --config dna_r9.4.1_450bps_hac_prom.cfg --device CUDA:0

config file: /opt/ont/guppy/data/dna_r9.4.1_450bps_hac_prom.cfg
model file: /opt/ont/guppy/data/template_r9.4.1_450bps_hac_prom.jsn
input path: /public/home/yxlong/Modifications/HW04/fast5s/
save path: fast5s_guppy
chunk size: 2000
chunks per runner: 1024
minimum qscore: 9
records per file: 4000
num basecallers: 4
gpu device: CUDA:0
kernel path:
runners per device: 20

Use of this software is permitted solely under the terms of the end user license agreement (EULA).
By running, copying or accessing this software, you are demonstrating your acceptance of the EULA.
The EULA may be found in /opt/ont/guppy/bin
Found 4000 input read files to process.
Init time: 1776 ms

0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|


Caller time: 20067 ms, Samples called: 87639365, samples/s: 4.36734e+06
Finishing up any open output files.
Basecalling completed successfully.

tombo

cat fast5s_guppy/*/*fastq > fast5s_guppy.fastq

micromamba run -n ont-tombo tombo preprocess annotate_raw_with_fastqs --fast5-basedir /public/home/yxlong/Modifications/HW04/fast5s/ --fastq-filenames fast5s_guppy.fastq --sequencing-summary-filenames /public/home/yxlong/Modifications/HW04/fast5s_guppy/sequencing_summary.txt --basecall-group Basecall_1D_000 --basecall-subgroup BaseCalled_template --overwrite --processes 10

[10:19:15] Getting read filenames.
[10:19:15] Parsing sequencing summary files.
[10:19:15] Annotating FAST5s with sequence from FASTQs.
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4000/4000 [00:00<00:00, 7867.41it/s]
[10:19:15] Added sequences to a total of 4000 reads.

micromamba run -n ont-tombo tombo resquiggle /public/home/yxlong/Modifications/HW04/fast5s/ /public/home/jyli/HiFi_Genomes/03.AD1_Updated/HC04_V2/HC04_chr_adjust.fa --processes 10 --corrected-group RawGenomeCorrected_000 --basecall-group Basecall_1D_000 --overwrite

[10:21:21] Loading minimap2 reference.
[10:22:10] Getting file list.
[10:22:10] Loading default canonical ***** DNA ***** model.
[10:22:13] Re-squiggling reads (raw signal to genomic sequence alignment).
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4000/4000 [01:49<00:00, 36.43it/s]
[10:24:03] Final unsuccessful reads summary (2.6% reads unsuccessfully processed; 104 total reads):
1.4% ( 56 reads) : Poor raw to expected signal matching (revert with tombo filter clear_filters)
0.8% ( 31 reads) : Alignment not produced
0.4% ( 16 reads) : Read event to sequence alignment extends beyond bandwidth
0.0% ( 1 reads) : Fewer changepoints found than requested
[10:24:03] Saving Tombo reads index to file.

deepsignal_plant

micromamba run -n deepsignal deepsignal_plant call_mods --input_path /public/home/yxlong/Modifications/HW04/fast5s/ --model_path /public/home/yxlong/Modifications/example/model.dp2.CNN.arabnrice2-1_120m_R9.4plus_tem.bn13_sn16.both_bilstm.epoch6.ckpt --result_file fast5s.C.call_mods.tsv --corrected_group RawGenomeCorrected_000 --motifs C --nproc 10 --nproc_gpu 10

===============================================

parameters:

input_path:
/public/home/yxlong/Modifications/HW04/fast5s/
f5_batch_size:
30
model_path:
/public/home/yxlong/Modifications/example/model.dp2.CNN.arabnrice2-1_120m_R9.4plus_tem.bn13_sn16.both_bilstm.epoch6.ckpt
model_type:
both_bilstm
seq_len:
13
signal_len:
16
layernum1:
3
layernum2:
1
class_num:
2
dropout_rate:
0
n_vocab:
16
n_embed:
4
is_base:
yes
is_signallen:
yes
batch_size:
512
hid_rnn:
256
result_file:
fast5s.C.call_mods.tsv
gzip:
False
recursively:
yes
corrected_group:
RawGenomeCorrected_000
basecall_subgroup:
BaseCalled_template
is_dna:
yes
normalize_method:
mad
motifs:
C
mod_loc:
0
region:
None
positions:
None
reference_path:
None
nproc:
10
nproc_gpu:
10

===============================================

[main] call_mods starts..
cuda availability: False
4000 fast5 files in total..
parse the motifs string..
read_fast5 process-185516 starts
read_fast5 process-185518 starts
read_fast5 process-185517 starts
read_fast5 process-185519 starts
read_fast5 process-185520 starts
read_fast5 process-185521 starts
read_fast5 process-185522 starts
call_mods process-185523 starts
call_mods process-185524 starts
write_process-185525 starts
read_fast5 process-185516 ending, proceed 600 fast5s
read_fast5 process-185518 ending, proceed 570 fast5s
read_fast5 process-185522 ending, proceed 540 fast5s
read_fast5 process-185519 ending, proceed 510 fast5s
read_fast5 process-185521 ending, proceed 580 fast5s
read_fast5 process-185517 ending, proceed 600 fast5s
read_fast5 process-185520 ending, proceed 600 fast5s
call_mods process-185524 ending, proceed 0 feature-batches(512)
call_mods process-185523 ending, proceed 0 feature-batches(512)
write_process-185525 finished
4000 of 4000 fast5 files failed..
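The log above can be tallied to see where the reads are lost. The sketch below is not part of deepsignal_plant; it only parses the "ending, proceed N" lines copied from this log and sums the counts. The function name `summarize_log` is hypothetical.

```python
import re

# Patterns match the worker summary lines exactly as they appear in the log above.
READ_RE = re.compile(r"read_fast5 process-\d+ ending, proceed (\d+) fast5s")
CALL_RE = re.compile(r"call_mods process-\d+ ending, proceed (\d+) feature-batches")


def summarize_log(text):
    """Return (fast5 files visited by readers, feature batches seen by callers)."""
    fast5s_read = sum(int(n) for n in READ_RE.findall(text))
    batches = sum(int(n) for n in CALL_RE.findall(text))
    return fast5s_read, batches


SAMPLE_LOG = """\
read_fast5 process-185516 ending, proceed 600 fast5s
read_fast5 process-185518 ending, proceed 570 fast5s
read_fast5 process-185522 ending, proceed 540 fast5s
read_fast5 process-185519 ending, proceed 510 fast5s
read_fast5 process-185521 ending, proceed 580 fast5s
read_fast5 process-185517 ending, proceed 600 fast5s
read_fast5 process-185520 ending, proceed 600 fast5s
call_mods process-185524 ending, proceed 0 feature-batches(512)
call_mods process-185523 ending, proceed 0 feature-batches(512)
"""

if __name__ == "__main__":
    print(summarize_log(SAMPLE_LOG))  # (4000, 0)
```

The readers visited all 4000 files, yet no feature batch reached the call_mods workers. That pattern suggests the failure happens while reading or extracting features from the fast5 files rather than during model inference, although the log alone does not say why.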
