Duplicate Transcript IDs #161

@jchariker

Hi,
I ran the Isoseq workflow through the collapse step to identify individual transcripts in multiple samples. Because Isoseq assigns different PacBio IDs to the same transcript in different samples, I created MD-tagged SAM files from the collapsed FASTA files, loaded them into a TALON database, and annotated the transcripts with TALON.

In the output I am seeing the same TALON transcript ID assigned to several different PacBio IDs. Isoseq appears to distinguish transcripts by both internal exon boundaries and differing UTRs, whereas TALON seems to annotate transcripts by internal exon boundaries only. I attached a picture illustrating several examples within the same gene. Is there a way to get TALON to take different UTR lengths into account? Perhaps this has changed in a recent version, but I cannot determine which version we are running: we installed it in March 2022, and it has no option to report the version (--version).
Thanks for any information you can provide.

[Attached image: several examples of duplicate transcript IDs within the same gene]
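
To make the suspected behavior concrete, here is a minimal Python sketch of the difference I think I am seeing. It is not code from TALON or Isoseq. The PacBio IDs, exon coordinates, and function names are all made up for illustration, and the "junctions only" rule is just my assumption about how the transcripts are being matched.

```python
from collections import defaultdict

# Hypothetical collapsed transcripts: PacBio ID -> list of (exon_start, exon_end).
# Coordinates are invented for illustration only.
transcripts = {
    "PB.1.1": [(100, 200), (300, 400), (500, 900)],
    "PB.1.2": [(150, 200), (300, 400), (500, 700)],  # same junctions, shorter ends
    "PB.1.3": [(100, 200), (350, 400), (500, 900)],  # different internal boundary
}


def junction_key(exons):
    """Key built from internal exon boundaries only (ends are ignored)."""
    return tuple((exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1))


def full_key(exons):
    """Key built from every exon coordinate, including the transcript ends."""
    return tuple(exons)


def group_by(key_fn):
    """Group PacBio IDs that share the same key under the given rule."""
    groups = defaultdict(list)
    for pb_id, exons in transcripts.items():
        groups[key_fn(exons)].append(pb_id)
    return sorted(groups.values())


if __name__ == "__main__":
    print("junctions only:", group_by(junction_key))
    print("junctions + ends:", group_by(full_key))
```

Under the junctions-only rule, PB.1.1 and PB.1.2 fall into one group and would share a single ID, while the rule that also considers the ends keeps all three apart. That is the pattern in the attached picture.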
