Commit 16f6c74

Add catch-up script at beginning, add script at end that now includes the LRT command
1 parent e5215e4 commit 16f6c74

1 file changed (+55, -2 lines changed)

lessons/wk7_lesson01_hypothesis_testing.md

Lines changed: 55 additions & 2 deletions
@@ -1,7 +1,7 @@
11
---
22
title: "Hypothesis testing and multiple testing"
33
author: "Harvard HPC Staff, Adapted by Sally Chang @ NICHD"
4-
date: "Last Modified February 2025"
4+
date: "Last Modified May 2025"
55
---
66

77
Approximate time: 60 minutes
@@ -13,13 +13,38 @@ Approximate time: 60 minutes
1313
- Recognize the importance of multiple test correction
1414
- Identify different methods for multiple test correction
1515

16+
## Catch-Up Script
17+
18+
If you need to catch up, copy and paste the following into an R script and run it. If you don't already have the input files in your `data/` directory, see [Wk 5 Lesson 01](../wk5_lesson01_introR_Rstudio.md) for instructions on where to obtain them.
19+
20+
``` r
# Setup
# Bioconductor and CRAN libraries used - already installed on Biowulf
library(tidyverse)
library(RColorBrewer)
library(DESeq2)
library(pheatmap)
library(BiocManager)

# Load in data
data <- read.table("data/mov10_AllSamples_featurecounts.Rmatrix.txt", header = TRUE, row.names = 1)

meta <- read.table("data/mov10_AllSamples_metadata.txt", header = TRUE, row.names = 1)

# Create DESeqDataSet object
dds <- DESeqDataSetFromMatrix(countData = data, colData = meta, design = ~ sampletype)

# Run DESeq2 on DESeqDataSet object
dds <- DESeq(dds)
```
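A quick pre-flight check can confirm that the input files are where the script expects them before running it. This is a minimal sketch and not part of the original lesson: it assumes your working directory is the project folder that contains `data/`, and the variable names `input_files` and `missing_files` are illustrative only.

``` r
# Paths taken from the read.table() calls in the catch-up script
input_files <- c(
  "data/mov10_AllSamples_featurecounts.Rmatrix.txt",
  "data/mov10_AllSamples_metadata.txt"
)

# file.exists() returns one TRUE/FALSE value per path
missing_files <- input_files[!file.exists(input_files)]

if (length(missing_files) > 0) {
  message("Missing input files:\n", paste(missing_files, collapse = "\n"))
} else {
  message("All input files found.")
}
```

If any files are reported as missing, check the working directory with `getwd()` before re-downloading them.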
40+
1641
# DESeq2: Model fitting and Hypothesis testing
1742

1843
The final step in the DESeq2 workflow is fitting the counts for each gene to the model and testing for differential expression.
1944

2045
<p align="center">
2146

22-
<img src="../img/deseq_workflow_full_2018.png" width="385" alt="deseq full workflow"/>
47+
<img src="../img/deseq_workflow_full_2018.png" alt="deseq full workflow" width="385"/>
2348

2449
</p>
2550

@@ -151,6 +176,34 @@ DESeq2 helps reduce the number of genes tested by removing those genes unlikely
151176

152177
By setting the FDR cutoff to \< 0.05, we're saying that we expect 5% of the genes we call differentially expressed to be false positives. For example, if you call 500 genes as differentially expressed with an FDR cutoff of 0.05, you expect about 25 of them (500 × 0.05) to be false positives.
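The arithmetic above, and the difference between two common correction methods, can be sketched in base R with `p.adjust()`. The ten p-values below are made up for illustration and do not come from the Mov10 dataset; the variable names are likewise assumptions. Bonferroni is shown only for contrast with the Benjamini-Hochberg (BH) adjustment, which is the DESeq2 default.

``` r
# Expected false positives among 500 calls at an FDR cutoff of 0.05
n_called <- 500
fdr_cutoff <- 0.05
expected_fp <- n_called * fdr_cutoff
expected_fp

# Hypothetical raw p-values for ten genes (illustrative only)
pvals <- c(0.0001, 0.0004, 0.0019, 0.0095, 0.0201,
           0.0278, 0.0298, 0.0344, 0.0459, 0.3240)

# Bonferroni controls the family-wise error rate; BH controls the FDR
padj_bonf <- p.adjust(pvals, method = "bonferroni")
padj_bh   <- p.adjust(pvals, method = "BH")

# Number of genes passing a 0.05 cutoff under each approach
n_passing <- c(raw        = sum(pvals < 0.05),
               bonferroni = sum(padj_bonf < 0.05),
               BH         = sum(padj_bh < 0.05))
n_passing
```

With these made-up values, Bonferroni keeps far fewer genes than BH, which illustrates why the less conservative FDR approach is usually preferred when testing thousands of genes.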
153178

179+
## Your DE script
180+
181+
In this lesson, we also ran a second DESeq2 analysis using a Likelihood Ratio Test to create the `dds_lrt` object. Your `de_script.R` should now contain the following commands to re-create the necessary data objects:
182+
183+
``` r
# Setup
# Bioconductor and CRAN libraries used - already installed on Biowulf
library(tidyverse)
library(RColorBrewer)
library(DESeq2)
library(pheatmap)
library(BiocManager)

# Load in data
data <- read.table("data/mov10_AllSamples_featurecounts.Rmatrix.txt", header = TRUE, row.names = 1)

meta <- read.table("data/mov10_AllSamples_metadata.txt", header = TRUE, row.names = 1)

# Create DESeqDataSet object
dds <- DESeqDataSetFromMatrix(countData = data, colData = meta, design = ~ sampletype)

# Run DESeq2 on DESeqDataSet object
dds <- DESeq(dds)

# Likelihood ratio test
dds_lrt <- DESeq(dds, test = "LRT", reduced = ~ 1)
```
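To see what an LRT object gives you without needing the Mov10 files, the same pattern can be tried on simulated counts. This is a hedged sketch rather than part of the lesson script: it uses DESeq2's `makeExampleDESeqDataSet()`, whose built-in design is `~ condition` instead of `~ sampletype`, and the object names (`dds_example`, `dds_example_lrt`, `res_example_lrt`) are illustrative.

``` r
library(DESeq2)

set.seed(1)  # make the simulated counts reproducible

# Simulated dataset (not the Mov10 data); betaSD = 1 adds some true differences
dds_example <- makeExampleDESeqDataSet(n = 1000, m = 6, betaSD = 1)

# Same pattern as the lesson: full design vs. an intercept-only reduced model
dds_example_lrt <- DESeq(dds_example, test = "LRT", reduced = ~ 1)

# Results table; for the LRT, p-values come from comparing the two models
res_example_lrt <- results(dds_example_lrt)

# Genes with an adjusted p-value below 0.05 (NA values are excluded)
sum(res_example_lrt$padj < 0.05, na.rm = TRUE)
```

The same `results()` call should work on the lesson's `dds_lrt` object once the script above has been run.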
206+
154207
------------------------------------------------------------------------
155208

156209
*This lesson has been developed by members of the teaching team at the [Harvard Chan Bioinformatics Core (HBC)](http://bioinformatics.sph.harvard.edu/). These are open access materials distributed under the terms of the [Creative Commons Attribution license](https://creativecommons.org/licenses/by/4.0/) (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.*
