Commit ba216f4

Consistent naming for summarizedexperiment outputs
1 parent 0a3f20b commit ba216f4

16 files changed (+89, -51 lines changed)

docs/output.md

Lines changed: 35 additions & 3 deletions
@@ -215,6 +215,21 @@ When `--remove_ribo_rna` is specified, the pipeline uses [SortMeRNA](https://git
 - `star_salmon/`
 - `*.Aligned.out.bam`: If `--save_align_intermeds` is specified the original BAM file containing read alignments to the reference genome will be placed in this directory.
 - `*.Aligned.toTranscriptome.out.bam`: If `--save_align_intermeds` is specified the original BAM file containing read alignments to the transcriptome will be placed in this directory.
+- `salmon.merged.gene_counts.tsv`: Matrix of gene-level raw counts across all samples.
+- `salmon.merged.gene_tpm.tsv`: Matrix of gene-level TPM values across all samples.
+- `salmon.merged.gene.SummarizedExperiment.rds`: RDS object that can be loaded in R that contains a [SummarizedExperiment](https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html) container with the abundance TPM (`tpm`), estimated counts (`counts`) and gene length (`length`), estimated library size-scaled counts (`counts_scaled`), estimated length-scaled counts (`counts_length_scaled`) in the assays slot for genes.
+- `salmon.merged.gene_lengths.tsv`: Matrix of average within-sample transcript lengths for each gene across all samples.
+- `salmon.merged.gene_counts_scaled.tsv`: Matrix of gene-level library size-scaled estimated counts across all samples.
+- `salmon.merged.gene_counts_length_scaled.tsv`: Matrix of gene-level length-scaled estimated counts across all samples.
+- `salmon.merged.transcript_counts.tsv`: Matrix of isoform-level raw counts across all samples.
+- `salmon.merged.transcript_tpm.tsv`: Matrix of isoform-level TPM values across all samples.
+- `tx2gene.tsv`: Tab-delimited file containing gene to transcripts ids mappings.
+- `salmon.merged.transcript.SummarizedExperiment.rds`: RDS object that can be loaded in R that contains a [SummarizedExperiment](https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html) container with the abundance TPM (`tpm`), estimated isoform-level raw counts (`counts`) and transcript length (`length`) in the assays slot for transcripts.
+- `star_salmon/<SAMPLE>/`
+- `quant.sf`: Salmon transcript-level quantification results.
+- `quant.genes.sf`: Salmon gene-level quantification results.
+- `star_salmon/<SAMPLE>/logs/`
+- `salmon_quant.log`: Salmon quantification log file.
 - `star_salmon/log/`
 - `*.SJ.out.tab`: File containing filtered splice junctions detected after mapping the reads.
 - `*.Log.final.out`: STAR alignment report containing the mapping results summary.
@@ -224,6 +239,23 @@ When `--remove_ribo_rna` is specified, the pipeline uses [SortMeRNA](https://git

 </details>

+:::tip
+You can access specific assay matrices from the `SummarizedExperiment` RDS object with the following R code:
+:::
+
+```r
+library(SummarizedExperiment)
+
+# Load the RDS object
+se <- readRDS("salmon.merged.gene.SummarizedExperiment.rds")
+
+# View available assays
+assayNames(se)
+
+# Access a specific assay, e.g., length-scaled counts
+assay(se, "counts_length_scaled")
+```
+
 [STAR](https://github.com/alexdobin/STAR) is a read aligner designed for splice aware mapping typical of RNA sequencing data. STAR stands for *S*pliced *T*ranscripts *A*lignment to a *R*eference, and has been shown to have high accuracy and outperforms other aligners by more than a factor of 50 in mapping speed, but it is memory intensive. Using `--aligner star_salmon` is the default alignment and quantification option.

 The STAR section of the MultiQC report shows a bar plot with alignment rates: good samples should have most reads as _Uniquely mapped_ and few _Unmapped_ reads.
@@ -728,14 +760,14 @@ The principal output files are the same between Salmon and Kallisto:
 - `<pseudo_aligner>/`
 - `<pseudo_aligner>.merged.gene_counts.tsv`: Matrix of gene-level raw counts across all samples.
 - `<pseudo_aligner>.gene_tpm.tsv`: Matrix of gene-level TPM values across all samples.
-- `all_samples_gene.SummarizedExperiment.rds`: RDS object that can be loaded in R that contains a [SummarizedExperiment](https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html) container with the abundance TPM (`tpm`), estimated counts (`counts`) and gene length (`length`), estimated library size-scaled counts (`counts_scaled`), estimated length-scaled counts (`counts_length_scaled`) in the assays slot for genes.
+- `<pseudo_aligner>.merged.gene.SummarizedExperiment.rds`: RDS object that can be loaded in R that contains a [SummarizedExperiment](https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html) container with the abundance TPM (`tpm`), estimated counts (`counts`) and gene length (`length`), estimated library size-scaled counts (`counts_scaled`), estimated length-scaled counts (`counts_length_scaled`) in the assays slot for genes.
 - `<pseudo_aligner>.merged.gene_lengths.tsv`: Matrix of average within-sample transcript lengths for each gene across all samples.
 - `<pseudo_aligner>.merged.gene_counts_scaled.tsv`: Matrix of gene-level library size-scaled estimated counts across all samples.
 - `<pseudo_aligner>.merged.gene_counts_length_scaled.tsv`: Matrix of gene-level length-scaled estimated counts across all samples.
 - `<pseudo_aligner>.merged.transcript_counts.tsv`: Matrix of isoform-level raw counts across all samples.
 - `<pseudo_aligner>.merged.transcript_tpm.tsv`: Matrix of isoform-level TPM values across all samples.
 - `tx2gene.tsv`: Tab-delimited file containing gene to transcripts ids mappings.
-- `all_samples_transcript.SummarizedExperiment.rds`: RDS object that can be loaded in R that contains a [SummarizedExperiment](https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html) container with the abundance TPM (`tpm`), estimated isoform-level raw counts (`counts`) and transcript length (`length`) in the assays slot for transcripts.
+- `<pseudo_aligner>.merged.transcript.SummarizedExperiment.rds`: RDS object that can be loaded in R that contains a [SummarizedExperiment](https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html) container with the abundance TPM (`tpm`), estimated isoform-level raw counts (`counts`) and transcript length (`length`) in the assays slot for transcripts.

 :::tip
 You can access specific assay matrices from the `SummarizedExperiment` RDS object with the following R code:
@@ -745,7 +777,7 @@ You can access specific assay matrices from the `SummarizedExperiment` RDS objec
 library(SummarizedExperiment)

 # Load the RDS object
-se <- readRDS("all_samples_gene.SummarizedExperiment.rds")
+se <- readRDS("salmon.merged.gene.SummarizedExperiment.rds") # or kallisto.merged.gene.SummarizedExperiment.rds

 # View available assays
 assayNames(se)

tests/default.nf.test.snap

Lines changed: 4 additions & 4 deletions
@@ -699,8 +699,8 @@
 "salmon/WT_REP2/logs/salmon_quant.log",
 "salmon/WT_REP2/quant.genes.sf",
 "salmon/WT_REP2/quant.sf",
-"salmon/all_samples_gene.SummarizedExperiment.rds",
-"salmon/all_samples_transcript.SummarizedExperiment.rds",
+"salmon/salmon.merged.gene.SummarizedExperiment.rds",
+"salmon/salmon.merged.transcript.SummarizedExperiment.rds",
 "salmon/deseq2_qc",
 "salmon/deseq2_qc/R_sessionInfo.log",
 "salmon/deseq2_qc/deseq2.dds.RData",
@@ -1227,8 +1227,8 @@
 "star_salmon/salmon.merged.transcript_counts.tsv",
 "star_salmon/salmon.merged.transcript_lengths.tsv",
 "star_salmon/salmon.merged.transcript_tpm.tsv",
-"star_salmon/salmon.merged_gene.SummarizedExperiment.rds",
-"star_salmon/salmon.merged_transcript.SummarizedExperiment.rds",
+"star_salmon/salmon.merged.gene.SummarizedExperiment.rds",
+"star_salmon/salmon.merged.transcript.SummarizedExperiment.rds",
 "star_salmon/samtools_stats",
 "star_salmon/samtools_stats/RAP1_IAA_30M_REP1.markdup.sorted.bam.flagstat",
 "star_salmon/samtools_stats/RAP1_IAA_30M_REP1.markdup.sorted.bam.idxstats",

tests/featurecounts_group_type.nf.test.snap

Lines changed: 4 additions & 4 deletions
@@ -686,8 +686,8 @@
 "salmon/WT_REP2/logs/salmon_quant.log",
 "salmon/WT_REP2/quant.genes.sf",
 "salmon/WT_REP2/quant.sf",
-"salmon/all_samples_gene.SummarizedExperiment.rds",
-"salmon/all_samples_transcript.SummarizedExperiment.rds",
+"salmon/salmon.merged.gene.SummarizedExperiment.rds",
+"salmon/salmon.merged.transcript.SummarizedExperiment.rds",
 "salmon/deseq2_qc",
 "salmon/deseq2_qc/R_sessionInfo.log",
 "salmon/deseq2_qc/deseq2.dds.RData",
@@ -1193,8 +1193,8 @@
 "star_salmon/salmon.merged.transcript_counts.tsv",
 "star_salmon/salmon.merged.transcript_lengths.tsv",
 "star_salmon/salmon.merged.transcript_tpm.tsv",
-"star_salmon/salmon.merged_gene.SummarizedExperiment.rds",
-"star_salmon/salmon.merged_transcript.SummarizedExperiment.rds",
+"star_salmon/salmon.merged.gene.SummarizedExperiment.rds",
+"star_salmon/salmon.merged.transcript.SummarizedExperiment.rds",
 "star_salmon/samtools_stats",
 "star_salmon/samtools_stats/RAP1_IAA_30M_REP1.markdup.sorted.bam.flagstat",
 "star_salmon/samtools_stats/RAP1_IAA_30M_REP1.markdup.sorted.bam.idxstats",

tests/hisat2.nf.test.snap

Lines changed: 2 additions & 2 deletions
@@ -1161,8 +1161,8 @@
 "salmon/WT_REP2/logs/salmon_quant.log",
 "salmon/WT_REP2/quant.genes.sf",
 "salmon/WT_REP2/quant.sf",
-"salmon/all_samples_gene.SummarizedExperiment.rds",
-"salmon/all_samples_transcript.SummarizedExperiment.rds",
+"salmon/salmon.merged.gene.SummarizedExperiment.rds",
+"salmon/salmon.merged.transcript.SummarizedExperiment.rds",
 "salmon/deseq2_qc",
 "salmon/deseq2_qc/R_sessionInfo.log",
 "salmon/deseq2_qc/deseq2.dds.RData",

tests/kallisto.nf.test.snap

Lines changed: 2 additions & 2 deletions
@@ -137,8 +137,8 @@
 "kallisto/WT_REP2/abundance.tsv",
 "kallisto/WT_REP2/kallisto_quant.log",
 "kallisto/WT_REP2/run_info.json",
-"kallisto/all_samples_gene.SummarizedExperiment.rds",
-"kallisto/all_samples_transcript.SummarizedExperiment.rds",
+"kallisto/kallisto.merged.gene.SummarizedExperiment.rds",
+"kallisto/kallisto.merged.transcript.SummarizedExperiment.rds",
 "kallisto/kallisto.merged.gene_counts.tsv",
 "kallisto/kallisto.merged.gene_counts_length_scaled.tsv",
 "kallisto/kallisto.merged.gene_counts_scaled.tsv",

tests/min_mapped_reads.nf.test.snap

Lines changed: 4 additions & 4 deletions
@@ -602,8 +602,8 @@
 "salmon/WT_REP2/logs/salmon_quant.log",
 "salmon/WT_REP2/quant.genes.sf",
 "salmon/WT_REP2/quant.sf",
-"salmon/all_samples_gene.SummarizedExperiment.rds",
-"salmon/all_samples_transcript.SummarizedExperiment.rds",
+"salmon/salmon.merged.gene.SummarizedExperiment.rds",
+"salmon/salmon.merged.transcript.SummarizedExperiment.rds",
 "salmon/deseq2_qc",
 "salmon/deseq2_qc/R_sessionInfo.log",
 "salmon/deseq2_qc/deseq2.dds.RData",
@@ -984,8 +984,8 @@
 "star_salmon/salmon.merged.transcript_counts.tsv",
 "star_salmon/salmon.merged.transcript_lengths.tsv",
 "star_salmon/salmon.merged.transcript_tpm.tsv",
-"star_salmon/salmon.merged_gene.SummarizedExperiment.rds",
-"star_salmon/salmon.merged_transcript.SummarizedExperiment.rds",
+"star_salmon/salmon.merged.gene.SummarizedExperiment.rds",
+"star_salmon/salmon.merged.transcript.SummarizedExperiment.rds",
 "star_salmon/samtools_stats",
 "star_salmon/samtools_stats/RAP1_IAA_30M_REP1.markdup.sorted.bam.flagstat",
 "star_salmon/samtools_stats/RAP1_IAA_30M_REP1.markdup.sorted.bam.idxstats",

tests/nofasta.nf.test.snap

Lines changed: 2 additions & 2 deletions
@@ -324,8 +324,8 @@
 "salmon/WT_REP2/logs/salmon_quant.log",
 "salmon/WT_REP2/quant.genes.sf",
 "salmon/WT_REP2/quant.sf",
-"salmon/all_samples_gene.SummarizedExperiment.rds",
-"salmon/all_samples_transcript.SummarizedExperiment.rds",
+"salmon/salmon.merged.gene.SummarizedExperiment.rds",
+"salmon/salmon.merged.transcript.SummarizedExperiment.rds",
 "salmon/deseq2_qc",
 "salmon/deseq2_qc/R_sessionInfo.log",
 "salmon/deseq2_qc/deseq2.dds.RData",

tests/remove_ribo_rna.nf.test.snap

Lines changed: 4 additions & 4 deletions
@@ -612,8 +612,8 @@
 "salmon/WT_REP2/logs/salmon_quant.log",
 "salmon/WT_REP2/quant.genes.sf",
 "salmon/WT_REP2/quant.sf",
-"salmon/all_samples_gene.SummarizedExperiment.rds",
-"salmon/all_samples_transcript.SummarizedExperiment.rds",
+"salmon/salmon.merged.gene.SummarizedExperiment.rds",
+"salmon/salmon.merged.transcript.SummarizedExperiment.rds",
 "salmon/deseq2_qc",
 "salmon/deseq2_qc/R_sessionInfo.log",
 "salmon/deseq2_qc/deseq2.dds.RData",
@@ -1146,8 +1146,8 @@
 "star_salmon/salmon.merged.transcript_counts.tsv",
 "star_salmon/salmon.merged.transcript_lengths.tsv",
 "star_salmon/salmon.merged.transcript_tpm.tsv",
-"star_salmon/salmon.merged_gene.SummarizedExperiment.rds",
-"star_salmon/salmon.merged_transcript.SummarizedExperiment.rds",
+"star_salmon/salmon.merged.gene.SummarizedExperiment.rds",
+"star_salmon/salmon.merged.transcript.SummarizedExperiment.rds",
 "star_salmon/samtools_stats",
 "star_salmon/samtools_stats/RAP1_IAA_30M_REP1.markdup.sorted.bam.flagstat",
 "star_salmon/samtools_stats/RAP1_IAA_30M_REP1.markdup.sorted.bam.idxstats",

tests/salmon.nf.test.snap

Lines changed: 2 additions & 2 deletions
@@ -272,8 +272,8 @@
 "salmon/WT_REP2/logs/salmon_quant.log",
 "salmon/WT_REP2/quant.genes.sf",
 "salmon/WT_REP2/quant.sf",
-"salmon/all_samples_gene.SummarizedExperiment.rds",
-"salmon/all_samples_transcript.SummarizedExperiment.rds",
+"salmon/salmon.merged.gene.SummarizedExperiment.rds",
+"salmon/salmon.merged.transcript.SummarizedExperiment.rds",
 "salmon/salmon.merged.gene_counts.tsv",
 "salmon/salmon.merged.gene_counts_length_scaled.tsv",
 "salmon/salmon.merged.gene_counts_scaled.tsv",

tests/sentieon_default.nf.test.snap

Lines changed: 4 additions & 4 deletions
@@ -697,8 +697,8 @@
 "salmon/WT_REP2/logs/salmon_quant.log",
 "salmon/WT_REP2/quant.genes.sf",
 "salmon/WT_REP2/quant.sf",
-"salmon/all_samples_gene.SummarizedExperiment.rds",
-"salmon/all_samples_transcript.SummarizedExperiment.rds",
+"salmon/salmon.merged.gene.SummarizedExperiment.rds",
+"salmon/salmon.merged.transcript.SummarizedExperiment.rds",
 "salmon/deseq2_qc",
 "salmon/deseq2_qc/R_sessionInfo.log",
 "salmon/deseq2_qc/deseq2.dds.RData",
@@ -1225,8 +1225,8 @@
 "star_salmon/salmon.merged.transcript_counts.tsv",
 "star_salmon/salmon.merged.transcript_lengths.tsv",
 "star_salmon/salmon.merged.transcript_tpm.tsv",
-"star_salmon/salmon.merged_gene.SummarizedExperiment.rds",
-"star_salmon/salmon.merged_transcript.SummarizedExperiment.rds",
+"star_salmon/salmon.merged.gene.SummarizedExperiment.rds",
+"star_salmon/salmon.merged.transcript.SummarizedExperiment.rds",
 "star_salmon/samtools_stats",
 "star_salmon/samtools_stats/RAP1_IAA_30M_REP1.markdup.sorted.bam.flagstat",
 "star_salmon/samtools_stats/RAP1_IAA_30M_REP1.markdup.sorted.bam.idxstats",

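The snapshot changes above follow two rename patterns: `all_samples_<level>.SummarizedExperiment.rds` under `salmon/` or `kallisto/` becomes `<dir>.merged.<level>.SummarizedExperiment.rds`, and `salmon.merged_<level>.SummarizedExperiment.rds` becomes `salmon.merged.<level>.SummarizedExperiment.rds`. For downstream scripts that still reference the old paths, the mapping might be sketched as below. This is an illustration only and not part of the commit: the function name `renamed_output` is hypothetical, Python is an arbitrary choice, and the patterns are inferred solely from the paths visible in this diff.

```python
import re
from pathlib import PurePosixPath

# Old basenames seen in the diff (assumption: no other variants exist).
_ALL_SAMPLES = re.compile(
    r"^all_samples_(gene|transcript)\.SummarizedExperiment\.rds$"
)
_MERGED_UNDERSCORE = re.compile(
    r"^salmon\.merged_(gene|transcript)\.SummarizedExperiment\.rds$"
)


def renamed_output(path: str) -> str:
    """Map an old SummarizedExperiment output path to its new name.

    Hypothetical helper: paths that match neither old pattern are
    returned unchanged.
    """
    p = PurePosixPath(path)

    # salmon/all_samples_gene... -> salmon/salmon.merged.gene...
    # The new prefix is taken from the parent directory name.
    m = _ALL_SAMPLES.match(p.name)
    if m and p.parent.name in ("salmon", "kallisto"):
        new_name = f"{p.parent.name}.merged.{m.group(1)}.SummarizedExperiment.rds"
        return str(p.with_name(new_name))

    # star_salmon/salmon.merged_gene... -> star_salmon/salmon.merged.gene...
    m = _MERGED_UNDERSCORE.match(p.name)
    if m:
        new_name = f"salmon.merged.{m.group(1)}.SummarizedExperiment.rds"
        return str(p.with_name(new_name))

    return path


if __name__ == "__main__":
    for old in (
        "salmon/all_samples_gene.SummarizedExperiment.rds",
        "kallisto/all_samples_transcript.SummarizedExperiment.rds",
        "star_salmon/salmon.merged_gene.SummarizedExperiment.rds",
    ):
        print(old, "->", renamed_output(old))
```

Deriving the prefix from the parent directory keeps the sketch limited to the two directories where the diff shows the `all_samples_` names; any other location is left untouched rather than guessed.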