Each row represents a fastq file (single-end) or a pair of fastq files (paired end). Rows with the same sample identifier are considered technical replicates and merged automatically. The strandedness refers to the library preparation and will be automatically inferred if set to `auto`.
-The pipeline also supports providing pre-aligned BAM files from previous runs as input by using the optional `genome_bam` and `transcriptome_bam` columns in the samplesheet. This is particularly useful for reprocessing data or running downstream analysis steps without repeating the computationally expensive alignment step. When using `--save_align_intermeds`, the pipeline generates a complete samplesheet with BAM paths for convenient future reanalysis.
+The pipeline supports a two-step reprocessing workflow using BAM files from previous runs. Run initially with `--save_align_intermeds` to generate a samplesheet with BAM paths, then reprocess using `--skip_alignment` for efficient downstream analysis without repeating expensive alignment steps. This feature is designed specifically for pipeline-generated BAMs.
> [!WARNING]
> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_; see [docs](https://nf-co.re/docs/usage/getting_started/configuration#custom-configuration-files).
docs/usage.md (+43 -31: 43 additions & 31 deletions)
@@ -106,48 +106,60 @@ An [example samplesheet](../assets/samplesheet.csv) has been provided with the p
> **NB:** The `group` and `replicate` columns were replaced with a single `sample` column as of v3.1 of the pipeline. The `sample` column is essentially a concatenation of the `group` and `replicate` columns, however it now also offers more flexibility in instances where replicate information is not required e.g. when sequencing clinical samples. If all values of `sample` have the same number of underscores, fields defined by these underscore-separated names may be used in the PCA plots produced by the pipeline, to regain the ability to represent different groupings.
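
The underscore rule in the note above can be checked before a run. The following is a minimal sketch, assuming that `sample` is the first column of the samplesheet and that its values contain no quoted commas; `count_underscore_patterns` is a hypothetical helper and not part of the pipeline.

```shell
# Hypothetical helper (not part of nf-core/rnaseq): print how many distinct
# underscore counts occur among the values of the first samplesheet column.
# Assumption: the first column is `sample` and the first line is a header.
count_underscore_patterns() {
    local samplesheet="$1"
    # Skip the header, keep the first field, and let awk's gsub() report
    # the number of underscores on each line; then count distinct values.
    tail -n +2 "$samplesheet" \
        | cut -d, -f1 \
        | awk '{ print gsub(/_/, "_") }' \
        | sort -u \
        | wc -l \
        | tr -d ' '
}
```

A result of `1` means every sample name has the same number of underscores, which is the condition the note describes; a larger number means the names are split inconsistently.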
-### Using BAM files as input
+### BAM input for reprocessing workflow
-The pipeline supports providing pre-aligned BAM files as input instead of, or in addition to, FASTQ files. This functionality is primarily designed for reusing BAM files generated by previous runs of this pipeline, allowing you to:
+The pipeline supports a **two-step workflow** for efficient reprocessing without expensive alignment steps. This feature is designed specifically for re-running with BAM files generated by previous runs of this same pipeline.
-- Skip computationally expensive alignment steps when reprocessing data
-- Run downstream analysis and QC steps on existing alignments
-- Process a mix of newly sequenced samples (FASTQ) and previously processed samples (BAM)
+#### Step 1: Initial run with BAM generation
-To use BAM files as input, add the optional `genome_bam` and/or `transcriptome_bam` columns to your samplesheet:
+Run the pipeline normally, adding `--save_align_intermeds` to publish BAM files and generate a reusable samplesheet:
This creates `samplesheets/samplesheet_with_bams.csv` containing paths to the generated BAM files.
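
Before starting the second step, it can help to confirm that the generated samplesheet actually carries BAM columns. The following is a minimal sketch, assuming a plain comma-separated header line; `has_bam_columns` is a hypothetical helper and not part of the pipeline, while the column names `genome_bam` and `transcriptome_bam` are the ones named in this document.

```shell
# Hypothetical pre-flight check (not part of nf-core/rnaseq): succeed only if
# the samplesheet header contains at least one of the optional BAM columns.
has_bam_columns() {
    local samplesheet="$1"
    # Split the header into one column name per line, drop carriage returns
    # and quotes, then require an exact match on either column name.
    head -n 1 "$samplesheet" \
        | tr ',' '\n' \
        | tr -d '\r"' \
        | grep -qxE 'genome_bam|transcriptome_bam'
}
```

For example, `has_bam_columns samplesheets/samplesheet_with_bams.csv && echo "ready for --skip_alignment"` prints the message only when a BAM column is present. The check inspects the header only; it does not verify that the listed BAM paths exist.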
-- BAM files should preferably come from previous runs of this pipeline to ensure compatibility
-- The pipeline will automatically index provided BAM files
-- You can provide just `genome_bam`, just `transcriptome_bam`, or both
-- When using BAM input, you can leave the FASTQ columns empty or omit them
-- Mixed samplesheets (some samples with FASTQ, others with BAM) are supported
-- For BAM file locations from pipeline outputs, see the [output documentation](https://nf-co.re/rnaseq/output)
-- **Automated samplesheet generation**: When using `--save_align_intermeds`, the pipeline automatically generates a `samplesheet_with_bams.csv` file in the `samplesheets/` directory containing all samples with their BAM file paths. For FASTQ-derived samples, this includes paths to newly generated BAMs; for BAM input samples, it preserves the original input paths. This complete samplesheet can be used directly for future pipeline runs
+#### Step 2: Reprocessing run with BAM input
+
+Use the auto-generated samplesheet to reprocess data, skipping alignment:
+
+```bash
+nextflow run nf-core/rnaseq \
+    --input samplesheets/samplesheet_with_bams.csv \
+    --skip_alignment \
+    --outdir results_reprocessed \
+    -profile docker
+```
+
+The pipeline will skip alignment and indexing steps, putting the BAM files through post-processing and quantification only.
-### Reprocessing workflow with BAM input
+#### Example of generated samplesheet
-When reprocessing data using the auto-generated `samplesheet_with_bams.csv` from a previous run:
+The `samplesheet_with_bams.csv` will look like:
-1. **Use the generated samplesheet**: The `samplesheet_with_bams.csv` contains all necessary BAM file paths
-2. **Skip alignment steps**: Add `--skip_alignment` to prevent unnecessary index generation and alignment processing
> **⚠️ Warning**: This feature is designed specifically for BAM files generated by this pipeline. Using arbitrary BAM files from other sources is **not officially supported** and will likely only work via the two-step workflow described above. Users attempting to use other BAMs do so at their own risk.
+
+**Key technical details:**
+
+- The pipeline automatically indexes provided BAM files
+- You can provide just `genome_bam`, just `transcriptome_bam`, or both
+- Mixed samplesheets (some samples with FASTQ, others with BAM) are supported
+- For BAM file locations from pipeline outputs, see the [output documentation](https://nf-co.re/rnaseq/output)
-This approach allows you to efficiently reprocess data for downstream analysis (quantification, differential expression, QC) without repeating the time-consuming alignment steps.
+This workflow is ideal for tweaking downstream processing steps (quantification methods, QC parameters, differential expression analysis) without repeating time-consuming alignment.