A sequencing machine generates a readout of the nucleotides that make up the original RNA molecule, usually by using uniquely labelled fluorescent nucleotides. Every time one of these nucleotides is added to a strand of DNA, it emits a different colored light, which is captured by the sequencing machine.
(ref:foo2) Illumina Sequencing by Synthesis. Every time a new fluorescently labelled nucleotide is added to a DNA strand, the fluorescence emitted by the nucleotide is captured. The image is then later converted to nucleotide sequence. Image source: [Microbenotes: Illumina Sequencing](https://microbenotes.com/illumina-sequencing/)
(ref:foo2) Illumina Sequencing by Synthesis. Every time a new fluorescently labelled nucleotide is added to a DNA strand, the fluorescence emitted by the nucleotide is captured. Image source: [Microbenotes: Illumina Sequencing](https://microbenotes.com/illumina-sequencing/)
A diverse ecosystem of protocols and technologies exists for generating RNAseq data, which can be used in a wide variety of applications.
@@ -178,7 +180,7 @@ Long read sequencing typically refers to technology that produce reads with leng
RNAseq data can be used for a variety of purposes. Broadly speaking, it can be used in 2 different ways:
- qualitatively: *what* is expressed? (e.g. genes, isoforms, specific exons, intron retention, gene fusions, etc). RNAseq provides *annotation* information.
-
- quantitatively: how *much* is expressed? Usually we want to know if the abundance of a gene has changed in response to a variable. RNAseq provides *expression* information. This is the most common use of RNAseq data.
+
- quantitatively: how *much* is expressed? Usually we want to know if the *abundance* of a gene has changed in response to a variable. This is the most common use of RNAseq data.
This capability for simultaneous discovery and quantification at the whole transcriptome level is a key reason RNAseq became the technology of choice for studying RNA. Previous technologies such as microarrays used pre-defined probes based on known genes, which limited their ability to discover new genes.
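A minimal Python sketch of the quantitative use, assuming made-up gene names and read counts (none of these values come from the source). It scales raw counts to counts per million so that libraries of different sizes are comparable, then takes a log2 ratio between conditions. The `pseudocount` parameter is an illustrative choice to avoid dividing by zero.

```python
import math

# Hypothetical raw read counts per gene (made-up numbers for illustration only).
control = {"geneA": 100, "geneB": 400, "geneC": 500}
treated = {"geneA": 400, "geneB": 400, "geneC": 1200}


def counts_per_million(counts):
    """Scale raw counts so each library sums to one million."""
    total = sum(counts.values())
    return {gene: count / total * 1_000_000 for gene, count in counts.items()}


def log2_fold_change(ctrl, trt, pseudocount=1.0):
    """Per-gene log2 ratio of treated over control, computed on CPM values."""
    ctrl_cpm = counts_per_million(ctrl)
    trt_cpm = counts_per_million(trt)
    return {
        gene: math.log2((trt_cpm[gene] + pseudocount) / (ctrl_cpm[gene] + pseudocount))
        for gene in ctrl
    }


lfc = log2_fold_change(control, treated)
for gene, value in lfc.items():
    print(gene, round(value, 2))
```

In this toy data the raw count of `geneB` is identical in both conditions, yet its relative abundance halves because the treated library is twice as deep. This is why counts are normalized before comparison. A real analysis would also need replicates and a statistical model; this sketch only shows the arithmetic.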
@@ -258,7 +260,7 @@ If interested in transcript changes, the complexity of the analyses can be deter
Long read sequencing can be better suited to isoform/transcript level analyses than short read sequencing, as there is little or no ambiguity as to which isoform a read comes from if the entire transcript has been sequenced.
-
(ref:foo16) Shrt reads can be amigious as to which isoform they belong to. Long reads have less uncertainty. Image source: [Improving gene isoform quantification with miniQuant, 2025](https://www.nature.com/articles/s41587-025-02633-9)
+
(ref:foo16) Short reads can be ambiguous as to which isoform they belong to. Long reads have less uncertainty. Image source: [Improving gene isoform quantification with miniQuant, 2025](https://www.nature.com/articles/s41587-025-02633-9)
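The isoform ambiguity described here can be illustrated with a toy Python sketch. The exon sequences, isoform names, and the exact-substring matching are all assumptions made for illustration; real aligners are far more sophisticated. A read is treated as compatible with an isoform if the isoform contains it.

```python
# Hypothetical toy gene with two isoforms built from made-up exon sequences.
EXONS = {"E1": "ATGGCCAAGT", "E2": "TTCGGATCCA", "E3": "GGCTTAACGT"}
ISOFORMS = {
    "isoform_A": EXONS["E1"] + EXONS["E2"] + EXONS["E3"],  # includes exon 2
    "isoform_B": EXONS["E1"] + EXONS["E3"],                # skips exon 2
}


def compatible_isoforms(read):
    """Return the names of isoforms that contain the read as an exact substring."""
    return sorted(name for name, seq in ISOFORMS.items() if read in seq)


short_read = EXONS["E1"][:8]       # lies entirely inside the shared exon 1
long_read = ISOFORMS["isoform_B"]  # covers the whole exon-skipping transcript

print(compatible_isoforms(short_read))  # compatible with both isoforms
print(compatible_isoforms(long_read))   # compatible with isoform_B only
```

The short read falls in an exon shared by both isoforms, so it cannot be assigned to either one with certainty. The long read spans the exon 1 to exon 3 junction and is only consistent with the isoform that skips exon 2.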
Most RNAseq datasets are used for quantitative analysis - the data is aligned to a pre-existing reference genome and pre-existing annotations. However, these resources do not exist for every organism. When a reference genome is either unavailable or the reference available is not adequate, it is possible to take RNAseq reads and assemble them into a transcriptome of the assayed organism. In such situations, RNAseq data has a dual purpose - the reference is built from the sequences of the reads and then the reads are counted against the transcriptome for differential analysis.
+
Most RNAseq datasets are used for quantitative analysis - the data is aligned to a pre-existing reference genome and pre-existing annotations. However, these resources do not exist for every organism. When a reference genome is either unavailable or the reference available is not adequate, it is possible to take RNAseq reads and assemble them into a transcriptome of the assayed organism. In such situations, RNAseq data has a dual purpose - the reference transcriptome is built from the sequences of the reads and then the reads are counted against the transcriptome for differential analysis.
Assembling a transcriptome can be done in 2 ways: reference based methods and de novo methods. Reference based methods use the reference of either the organism or a closely related species.
De novo assembly methods are reference free - this is useful when studying non-model organisms as often they lack well annotated reference genomes.
-
Long read sequencing is better suited to building high quality transcriptomes than short read sequencing as the longer read length retains more information and context about the original RNA molecule. A commonly used analogy when describing assembly is to imagine the genome as a puzzle that needs to be put back together. The smaller the pieces, the more difficult the puzzle. A puzzle made of larger pieces is much easier to put back together and reconstruct the original RNA sequence.
+
Long read sequencing is better suited to building high quality transcriptomes than short read sequencing as the longer read length retains more information and context about the original RNA molecule. A commonly used analogy when describing assembly is to imagine the genome/transcriptome as a puzzle that needs to be put back together. The smaller the pieces, the more difficult the puzzle. A puzzle made of larger pieces is much easier to put back together and reconstruct the original RNA sequence.
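The puzzle analogy can be made concrete with a toy Python sketch of greedy overlap merging. This is a simplified illustration, not how any particular assembler works. The transcript, the reads, and the `min_len` overlap threshold are all hypothetical. Reads are repeatedly joined at their longest suffix/prefix overlap until no overlaps remain.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            break  # nothing overlaps any more; leave the pieces separate
        merged = reads[i] + reads[j][k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (i, j)] + [merged]
    return reads


# Hypothetical transcript and three overlapping reads sampled from it.
transcript = "ATGGCGTACGTTAGC"
short_reads = ["ATGGCGTA", "GCGTACGT", "ACGTTAGC"]

print(greedy_assemble(short_reads))
```

With short pieces, the reconstruction depends on finding the right overlaps between reads, and repeated sequence can make those overlaps ambiguous. A single read covering the full transcript needs no merging at all, which is the intuition behind the larger puzzle pieces.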
(ref:foo17) Strategies for reconstructing transcripts from RNA-Seq reads. Image source: [Advancing RNAseq Analysis (2010)](https://www.nature.com/articles/nbt0510-421)
@@ -291,7 +293,7 @@ Transcriptome assembly is usually performed in situations when working with orga
This also has clinical applications. One such application is the detection of fusion genes - these can arise due to chromosomal rearrangements combining the coding regions of two genes. These genes can produce aberrant proteins and lead to cancer development if the fused genes are oncogenes or tumor suppressor genes. Therefore, detection of fusion genes can be an important diagnostic tool in clinical settings as well as for cancer research.
-
### Optional Discussion: Design A Bulk RNAseq Experiment {- .challenge}
+
#### Optional Discussion: Design A Bulk RNAseq Experiment {- .challenge}
You want to examine the impact of several different growth conditions in a specific bacterial strain and you are interested mostly in changes to gene expression. A side goal of your project is to look at small RNA changes. Assuming you have no limitations on your budget, what are some considerations you'd have in designing a potential RNAseq experiment for this project?