Dear Bambu Developers,
I am currently using bambu to analyze ONT direct RNA sequencing data from a leafy vegetable species. My dataset comprises 9 samples, each with 3 biological replicates. Unlike Arabidopsis, this species is not well-annotated, though both its genome and GTF annotation have been published.
Using bambu, I have generated an extended annotation (GTF). While the tool performs well in identifying novel genes and transcripts, I noticed that many existing gene models are not extended into their UTRs, despite strong read support (hundreds of reads across multiple libraries). I have attached IGV screenshots illustrating such cases.
In an attempt to improve this, I experimented with different parameter settings in the bambu() function, but the 3' UTRs of many annotated genes remain truncated. Below is my latest command (NDR = 0.378):
se <- bambu(
  reads = bam_files,
  annotations = bambuAnnotations,
  genome = fa.file,
  ncore = 10,
  discovery = TRUE,
  quant = TRUE,
  opt.discovery = list(min.sampleNumber = 2, min.readCount = 5)
)
Is there a way to allow bambu to extend known transcripts—particularly in the UTR regions—when there is strong read support? I am especially interested in improving annotation of known genes rather than just discovering novel ones.
Any suggestions or recommendations would be greatly appreciated.

