Possibility to extend known gene models by adding UTRs #500

@Annie-GW

Dear Bambu Developers,
I am currently using bambu to analyze ONT direct RNA sequencing data from a leafy vegetable species. My dataset comprises 9 samples, each with 3 biological replicates. Unlike Arabidopsis, this species is not well-annotated, though both its genome and GTF annotation have been published.

Using bambu, I have generated an extended annotation (GTF). While the tool performs well in identifying novel genes and transcripts, I noticed that many existing gene models are not extended at their UTR regions, despite strong read support (hundreds of reads across multiple libraries). I have attached IGV screenshots illustrating such cases.

In an attempt to improve this, I experimented with different parameter settings in the bambu() function, but the 3' UTRs of many annotated genes remain truncated. Below is my latest command (NDR = 0.378):

[Two images: IGV screenshots]

se <- bambu(
    reads = bam_files,
    annotations = bambuAnnotations,
    genome = fa.file,
    ncore = 10,
    discovery = TRUE,
    quant = TRUE,
    opt.discovery = list(min.sampleNumber = 2, min.readCount = 5)
)

Is there a way to allow bambu to extend known transcripts, particularly in the UTR regions, when there is strong read support? I am especially interested in improving the annotation of known genes rather than just discovering novel ones.

Any suggestions or recommendations would be greatly appreciated.
