Commit 88bee42

bug fix
- I no longer remember exactly which version, but in some dorado 0.9.x release the shift and scale values were computed from the raw DACs values instead of the pA signal.
- At first I thought this was a change between the old r9 pores and the new r10 and rna004 flow cells, but then I noticed that the sm and sd values in the basecalled BAM had changed back to normalizing the pA signal.
- Now I simply check whether the shift is above 400: if it is, the raw DACs signal is used for normalization; otherwise the pA signal is used.
1 parent 6862019 commit 88bee42

1 file changed: +10 -3 lines changed
src/python/segmentation/segment.py

Lines changed: 10 additions & 3 deletions
@@ -120,15 +120,22 @@ def asyncSegmentation(q : mp.Queue, script : str, modelpath : str, pore : str, r
     None
     """
     r5 = read5_ont.read(rawFile)
-    if pore in ["dna_r9", "rna_r9"]:
+    if "r9" in pore: # in ["dna_r9", "rna_r9"]:
         # for r9 pores, shift and scale are stored for pA signal in bam
-        signal = r5.getpASignal(signalid)[start:end]
+        # signal = r5.getpASignal(signalid)[start:end]
         kmerSize = 5
     else:
         # for new pores, shift and scale is directly applied to stored integer signal (DACs)
         # this way the conversion from DACs to pA is skipped
-        signal = r5.getSignal(signalid)[start:end]
+        # signal = r5.getSignal(signalid)[start:end]
         kmerSize = 9
+
+    #! I do not know anymore in which version, but in some 0.9.x dorado version, the shift and scale values were taken from the raw DACS values instead of the pA signal
+    if shift > 400:
+        signal = r5.getSignal(signalid)[start:end]
+    else:
+        signal = r5.getpASignal(signalid)[start:end]
+
     r5.close()
 
     #! normalize poly A region to median 0.9 (as in init models from ONT r9 and rp4) and scale to 0.15 (from training on r9 and rp4)
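
The selection rule added in this commit can be read in isolation as the small sketch below. It is only an illustration under stated assumptions: `select_signal` and `DAC_SHIFT_THRESHOLD` are hypothetical names that do not exist in `segment.py`, and the two input sequences stand in for whatever `r5.getSignal(...)` and `r5.getpASignal(...)` return. The value 400 and the strict `>` comparison come from the diff; the sample numbers are made up.

```python
from typing import Sequence

# Threshold from the commit; exposing it as a named constant is an assumption.
DAC_SHIFT_THRESHOLD = 400.0


def select_signal(
    raw_dacs: Sequence[float],
    pa_signal: Sequence[float],
    shift: float,
    threshold: float = DAC_SHIFT_THRESHOLD,
) -> Sequence[float]:
    """Pick the signal that the basecaller's shift value most likely refers to.

    Hypothetical helper: a shift strictly above the threshold is treated as
    having been computed on raw DACs integers, anything else as pA.
    """
    if shift > threshold:
        return raw_dacs
    return pa_signal


if __name__ == "__main__":
    # Made-up example values, not taken from a real read.
    raw = [510, 498, 505]
    pa = [82.1, 79.6, 81.0]
    print(select_signal(raw, pa, shift=504.0))  # shift above 400: raw DACs
    print(select_signal(raw, pa, shift=80.5))   # shift at or below 400: pA
```

Keeping the threshold as a parameter makes the heuristic easy to adjust if a future basecaller version reports shift values on a different scale; a shift of exactly 400 falls on the pA side, matching the comparison in the diff.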
