README.md
+17 -14 (17 additions & 14 deletions)
@@ -365,12 +365,14 @@ rowData(se[[1]])
365
365
|chr.rc|The chromosome name the read class is found on|
366
366
|strand.rc|The strand of the read class|
367
367
|startSD|The standard deviation of the aligned genomic start positions of all reads assigned to the read class|
368
+
|endSD|The standard deviation of the aligned genomic end positions of all reads assigned to the read class|
368
369
|readCount.posStrand|The number of reads assigned to this read class that aligned to the positive strand|
369
370
|intronStarts|A comma separated character vector of intron start coordinates|
370
371
|intronEnds|A comma separated character vector of intron end coordinates|
371
372
|confidenceType|Category of confidence: <br/> **highConfidenceJunctionReads** - the read class contains no low confidence junctions <br/> **lowConfidenceJunctionReads** - the read class contains low confidence junctions <br/> **unsplicedWithin** - single exon read class that is within the exon boundaries of an annotation <br/> **unsplicedNew** - single exon read class that does not fully overlap with annotated exons|
372
373
|readCount|The number of reads assigned to this read class|
373
-
|readId *only present when trackReads = TRUE|An integer list of bambu internal read ids that belong to the read class. (See the metadata of the object for full read names)|
374
+
|readIds|An integer list of bambu internal read ids that belong to the read class. (See the metadata of the object for full read names)|
375
+
|sampleIds|An integer list of bambu internal sample ids based on barcodes.|
374
376
|GENEID|The gene ID the transcript is associated with|
375
377
|novelGene|A logical that is true if the read class belongs to a novel gene (does not overlap with an annotated gene locus)|
376
378
|numExons|The number of exons the read class has|
@@ -382,8 +384,8 @@ rowData(se[[1]])
382
384
|numAend|An integer counting the number of A nucleotides found within a 20bp window centered on the read class genomic end position|
383
385
|numTstart|An integer counting the number of T nucleotides found within a 20bp window centered on the read class genomic start position|
384
386
|numTend|An integer counting the number of T nucleotides found within a 20bp window centered on the read class genomic end position|
385
-
|txScore|This is the TPS generated by the sample trained model|
386
387
|txScore.noFit|This is the TPS generated by the pretrained model|
388
+
|txScore|This is the TPS generated by the sample trained model|
387
389
388
390
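The `intronStarts` and `intronEnds` columns described in the table above are stored as comma separated strings, so they need to be split before any numeric work. The following is a minimal base R sketch on made-up values, not bambu's own code. The data frame `rc`, the helper `parseCoords`, and the use of `NA` for single exon read classes are assumptions for illustration only.

```r
# Hypothetical values shaped like the intronStarts / intronEnds columns
rc <- data.frame(
  intronStarts = c("1200,1800", "5300", NA),
  intronEnds   = c("1499,2099", "5899", NA),
  stringsAsFactors = FALSE
)

# Split one comma separated coordinate string into an integer vector.
# NA (assumed here to mark a single exon read class) gives an empty vector.
parseCoords <- function(x) {
  if (is.na(x)) return(integer(0))
  as.integer(strsplit(x, ",", fixed = TRUE)[[1]])
}

starts <- lapply(rc$intronStarts, parseCoords)
ends   <- lapply(rc$intronEnds, parseCoords)

# A read class with n introns has n + 1 exons
numExons <- lengths(starts) + 1L

# Intron widths, assuming inclusive coordinates
intronWidths <- Map(function(s, e) e - s + 1L, starts, ends)
```

On real output, the same helper could be applied to the corresponding columns of `rowData(se[[1]])`, after checking how empty intron sets are actually encoded there.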
389
391
### Tracking read-to-transcript assignment
@@ -476,30 +478,30 @@ If you want to run Bambu-Clump for single-cell or spatial analysis stand alone a
476
478
477
479
#### Read Class Construction:
478
480
479
-
**reads**: provided bam files must have barcodes in the read name or in the BC tag. Alternatively a csv file can be provided to demultiplexed mapping the read names to barcodes. For exact requirements see https://github.com/GoekeLab/bambu-singlecell-spatial.<br/>
481
+
**reads**: provided bam files should have barcodes in the read name or in the BC tag (and the UG tag for UMI identifiers). If both the tags and the read names contain barcode information, the tags take priority. Otherwise, a regular delimited, headerless file that contains the demultiplexing information for each read should be provided to the demultiplexed argument below. For exact requirements see https://github.com/GoekeLab/bambu-singlecell-spatial.<br/>
480
482
481
-
**demultiplexed**: must be set to TRUE (or be a barcode map). This will cause bambu to look for barcodes and seperate reads by barcode rather than sample. <br/>
483
+
**demultiplexed**: should be set to either TRUE or the path to a barcode mapping file. Otherwise, bambu will not look for barcodes, and reads will be separated by sample rather than by barcode. <br/>
482
484
483
485
Optional:
484
486
485
487
**cleanReads**: A logical TRUE/FALSE. Chimeric reads in samples can cause issues with barcode assignments. Setting this to TRUE will ensure only the first alignment per barcode is used (We recommend using this). <br/>
486
488
487
489
**sampleNames**: A vector of characters assigning names to each sample in the reads argument. By default the sample names are taken from the file names and appended to the barcodes in order to differentiate them. If your sample names are the same across multiple files, but matching barcodes between the samples should be counted separately, provide them with different sample names using this argument. Similarly, if your samples have different names, but overlapping barcodes should be counted together, give them the same sample name with this argument. <br/>
488
490
489
-
**dedupUMI**: A logical TRUE/FALSE. <br/>
491
+
**dedupUMI**: A logical TRUE/FALSE. <br/>
490
492
491
493
**barcodesToFilter**: A string vector indicating barcodes to be filtered out. <br/>
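
To make the shape of a headerless, delimited barcode map more concrete, here is a small base R sketch. It does not call bambu. The comma delimiter, the column order (read name, then barcode), and all read names and barcodes are assumptions for illustration; the exact format requirements are described in the linked bambu-singlecell-spatial repository.

```r
# Hypothetical headerless, comma-delimited barcode map.
# Column order (read name, barcode) is an assumption for illustration.
mapText <- c(
  "read_001,AAACCCAAGT",
  "read_002,AAACCCAAGT",
  "read_003,TTTGGGCCAA",
  "read_004,GGGTTTAACC"
)
barcodeMap <- read.csv(text = mapText, header = FALSE,
                       col.names = c("readName", "barcode"),
                       stringsAsFactors = FALSE)

# Barcodes to drop, analogous in spirit to the barcodesToFilter argument
barcodesToFilter <- c("GGGTTTAACC")
kept <- barcodeMap[!barcodeMap$barcode %in% barcodesToFilter, ]

# Group read names by barcode rather than by sample
readsByBarcode <- split(kept$readName, kept$barcode)
```

In a real run the file would be written to disk and its path passed to the demultiplexed argument, rather than being processed by hand as above.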
Transcript discovery can be run as usual, as bulk-level discovery is typically suitable. However, cluster-level transcript discovery can be performed using the clusters argument, and can be redone after clustering.
|TXNAME|The transcript name for the transcript. Will use either the transcript name from the provided annotations or tx.X if it is a novel transcript where X is a unique integer.|
643
645
|GENEID|The gene name for the transcript. Will use either the gene name from the provided annotations or gene.X if it is a novel transcript where X is a unique integer.|
644
-
|eqClass|A character vector with the transcript names of all the equivalent transcripts (those which have this transcripts contiguous exon junctions)|
645
-
|txId|A bambu specific transcript id used for indexing purposes
646
-
|eqClassById|A integer list with the transcript ids of all equivalent transcripts
646
+
|NDR|The NDR score calculated for the transcript|
647
+
|novelGene|A logical variable that is true if the transcript model is from a novel gene (does not overlap with an annotated gene locus)|
648
+
|novelTranscript|A logical variable that is true if transcript model is novel (passing NDR threshold)|
647
649
|txClassDescription|A concatenated string containing the classes the transcript falls under: <br/> **annotation** - Transcript matches an annotation transcript <br/> **allNew** - All the intron-junctions are novel <br/> **newFirstJunction** - the first junction is novel and at least one other junction matches an annotated transcript <br/> **newLastJunction** - the last junction is novel and at least one other junction matches an annotated transcript <br/> **newJunction** - an internal junction is novel and at least one other internal junction matches an annotated transcript <br/> **newWithin** - A novel transcript with matching junctions but is not a subset of an annotation <br/> **unsplicedNew** - A single exon transcript that doesn’t completely overlap with annotations <br/> **compatible** - Is a subset of an annotated transcript <br/> **newFirstExon** - The first exon is novel <br/> **newLastExon** - The last exon is novel|
648
650
|readCount|The number of full length reads associated with this transcript (filtered by min.readCount)|
649
-
|NDR|The NDR score calculated for the transcript|
650
651
|relReadCount|The proportion of reads this transcript has relative to all reads assigned to its gene|
651
652
|relSubsetCount|The proportion of reads this transcript has relative to all reads that either fully or partially match this transcript|
653
+
|txId|A bambu specific transcript id used for indexing purposes|
654
+
|eqClassById|An integer list with the transcript ids of all equivalent transcripts|
652
655
|maxTxScore|The maximum model score across samples from the sample-trained model. Used internally by Bambu to calculate NDR scores|
653
656
|maxTxScore.noFit|The maximum model score across samples from the pretrained model. Used internally by Bambu to recommend NDR thresholds|
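
As a rough illustration of the `relReadCount` column described above, the proportion can be sketched in base R as each transcript's read count divided by the total for its gene. This is not bambu's implementation. The toy data frame `tx` is hypothetical, and the sketch assumes the gene total equals the sum of the per-transcript `readCount` values, which may differ from how bambu counts reads assigned to a gene.

```r
# Hypothetical transcript-level table with the column names from the table above
tx <- data.frame(
  TXNAME    = c("tx.1", "tx.2", "tx.3"),
  GENEID    = c("gene.1", "gene.1", "gene.2"),
  readCount = c(30, 10, 5),
  stringsAsFactors = FALSE
)

# Total reads per gene, repeated for each transcript row of that gene
geneTotal <- ave(tx$readCount, tx$GENEID, FUN = sum)

# Proportion of the gene's reads carried by each transcript
tx$relReadCount <- tx$readCount / geneTotal
```

With these toy numbers, tx.1 and tx.2 share the reads of gene.1, while tx.3 accounts for all reads of gene.2.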
- Subset transcripts and those above the NDR threshold are placed into the metadata of the annotations in $subsetTranscripts and $lowConfidenceTranscripts respectively (when filtered out by default).