Skip to content

Commit ab1d82c

Browse files
committed
updates to Deseq and A+V
1 parent a270ab6 commit ab1d82c

13 files changed

+2433
-974
lines changed
File renamed without changes.
File renamed without changes.

Markdowns/10_DE_analysis_with_DESeq2.Solutions.Rmd

Lines changed: 38 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -3,8 +3,8 @@ title: "Introduction to Bulk RNAseq data analysis"
33
subtitle: Differential Expression of RNA-seq data
44
date: '`r format(Sys.time(), "Last modified: %d %b %Y")`'
55
output:
6-
pdf_document: default
76
html_document: default
7+
pdf_document: default
88
---
99

1010
```{r setup, echo=FALSE, message=FALSE}
@@ -13,9 +13,42 @@ options(tibble.print_max = 4, tibble.print_min = 4, max.print=40,
1313
library(tidyverse)
1414
library(DESeq2)
1515
```
16-
1716
### Exercise 1
1817

18+
> Now we have made our results table using our simple model, let have a look at which
19+
> genes are changing and how many pass our 0.05 threshold. Why might this not be
20+
> straigtforward?
21+
>
22+
>
23+
24+
```{r eval=FALSE}
25+
sum(results.simple$padj < 0.05)
26+
```
27+
28+
```{r eval=FALSE}
29+
sum(is.na(results.simple$padj))
30+
```
31+
>
32+
> a) how many genes are significantly (with an FDR < 0.05) up-regulated?
33+
>
34+
>
35+
>
36+
37+
```{r eval=FALSE}
38+
sum(results.simple$padj < 0.05 & results.simple$log2FoldChange > 0, na.rm = TRUE)
39+
```
40+
>
41+
> b) how many genes are significantly (with an FDR < 0.05) down-regulated?
42+
>
43+
>
44+
45+
```{r eval=FALSE}
46+
sum(results.simple$padj < 0.05 & results.simple$log2FoldChange < 0, na.rm = TRUE)
47+
48+
```
49+
50+
### Exercise 2
51+
1952
> So far we have fitted a simple model considering just "Status", but in reality
2053
> we want to model the effects of both "Status" and "Time Point".
2154
>
@@ -84,7 +117,7 @@ sum(results.additive$padj < 0.05, na.rm = TRUE)
84117
```
85118

86119

87-
### Exercise 2
120+
### Exercise 3
88121

89122
> If we want a different contrast we can just pass the `results` function the
90123
> **name** of the contrast, as given by `resultsNames(ddsObj)`.
@@ -100,7 +133,7 @@ results.d33vd11 <- results(ddsObj, name= "TimePoint_d33_vs_d11", alpha=0.05)
100133
sum(results.d33vd11$padj < 0.05, na.rm = TRUE)
101134
```
102135

103-
### Exercise 3
136+
### Exercise 4
104137
>
105138
> When we looked at the PCA it did seem that an interaction model might be
106139
> warranted. Let's test that.
@@ -143,7 +176,7 @@ results.Interaction_v_Additive <- results(ddsObj.LRT)
143176
table(results.Interaction_v_Additive$padj < 0.05)
144177
```
145178

146-
### Exercise 4
179+
### Exercise 5
147180
>
148181
> Let's investigate the uninfected mice
149182
>

Markdowns/10_DE_analysis_with_DESeq2.Solutions.html

Lines changed: 27 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -165,14 +165,34 @@
165165

166166
<h1 class="title toc-ignore">Introduction to Bulk RNAseq data analysis</h1>
167167
<h3 class="subtitle">Differential Expression of RNA-seq data</h3>
168-
<h4 class="date">Last modified: 19 Apr 2021</h4>
168+
<h4 class="date">Last modified: 28 Nov 2021</h4>
169169

170170
</div>
171171

172172

173173
<div id="exercise-1" class="section level3">
174174
<h3>Exercise 1</h3>
175175
<blockquote>
176+
<p>Now we have made our results table using our simple model, let have a look at which genes are changing and how many pass our 0.05 threshold. Why might this not be straigtforward?</p>
177+
</blockquote>
178+
<pre class="r"><code>sum(results.simple$padj &lt; 0.05)</code></pre>
179+
<pre class="r"><code>sum(is.na(results.simple$padj))</code></pre>
180+
<blockquote>
181+
<ol style="list-style-type: lower-alpha">
182+
<li>how many genes are significantly (with an FDR &lt; 0.05) up-regulated?</li>
183+
</ol>
184+
</blockquote>
185+
<pre class="r"><code>sum(results.simple$padj &lt; 0.05 &amp; results.simple$log2FoldChange &gt; 0, na.rm = TRUE)</code></pre>
186+
<blockquote>
187+
<ol start="2" style="list-style-type: lower-alpha">
188+
<li>how many genes are significantly (with an FDR &lt; 0.05) down-regulated?</li>
189+
</ol>
190+
</blockquote>
191+
<pre class="r"><code>sum(results.simple$padj &lt; 0.05 &amp; results.simple$log2FoldChange &lt; 0, na.rm = TRUE)</code></pre>
192+
</div>
193+
<div id="exercise-2" class="section level3">
194+
<h3>Exercise 2</h3>
195+
<blockquote>
176196
<p>So far we have fitted a simple model considering just “Status”, but in reality we want to model the effects of both “Status” and “Time Point”.</p>
177197
<p>Let’s start with the model with only main effects - an additive model with no interaction. The main assumption here is that the effects of Status and the effects of Time Point are indepedent.</p>
178198
<p>Recapitulate the above steps to generate a new DESeq2 object with the additive model. Then we will extract the results table as above.</p>
@@ -248,8 +268,8 @@ <h4 id="filter-the-data-set">Filter the data set:</h4>
248268
<pre class="r"><code>sum(results.additive$padj &lt; 0.05, na.rm = TRUE)</code></pre>
249269
<pre><code>## [1] 2766</code></pre>
250270
</div>
251-
<div id="exercise-2" class="section level3">
252-
<h3>Exercise 2</h3>
271+
<div id="exercise-3" class="section level3">
272+
<h3>Exercise 3</h3>
253273
<blockquote>
254274
<p>If we want a different contrast we can just pass the <code>results</code> function the <strong>name</strong> of the contrast, as given by <code>resultsNames(ddsObj)</code>. Look at the help page for the <code>results</code> command to see how to do this.</p>
255275
<ol style="list-style-type: decimal">
@@ -265,8 +285,8 @@ <h3>Exercise 2</h3>
265285
<pre class="r"><code>sum(results.d33vd11$padj &lt; 0.05, na.rm = TRUE)</code></pre>
266286
<pre><code>## [1] 109</code></pre>
267287
</div>
268-
<div id="exercise-3" class="section level3">
269-
<h3>Exercise 3</h3>
288+
<div id="exercise-4" class="section level3">
289+
<h3>Exercise 4</h3>
270290
<blockquote>
271291
<p>When we looked at the PCA it did seem that an interaction model might be warranted. Let’s test that.</p>
272292
<ol style="list-style-type: decimal">
@@ -323,8 +343,8 @@ <h3>Exercise 3</h3>
323343
## FALSE TRUE
324344
## 16474 455</code></pre>
325345
</div>
326-
<div id="exercise-4" class="section level3">
327-
<h3>Exercise 4</h3>
346+
<div id="exercise-5" class="section level3">
347+
<h3>Exercise 5</h3>
328348
<blockquote>
329349
<p>Let’s investigate the uninfected mice</p>
330350
<ol style="list-style-type: decimal">

Markdowns/10_DE_analysis_with_DESeq2_part2.Rmd

Lines changed: 32 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -3,10 +3,10 @@ title: "Introduction to Bulk RNAseq data analysis"
33
subtitle: Differential Expression of RNA-seq data - Part 2
44
date: '`r format(Sys.time(), "Last modified: %d %b %Y")`'
55
output:
6-
html_document:
7-
toc: yes
86
pdf_document:
97
toc: yes
8+
html_document:
9+
toc: yes
1010
bibliography: ref.bib
1111
---
1212

@@ -15,7 +15,33 @@ options(tibble.print_max = 4, tibble.print_min = 4, max.print=40,
1515
tibble.max_extra_cols=2)
1616
```
1717

18-
### Exercise 1
18+
# Recap from last week
19+
20+
```{r message = FALSE}
21+
library(DESeq2)
22+
library(tidyverse)
23+
```
24+
25+
```{r loadData}
26+
txi <- readRDS("RObjects/txi.rds")
27+
sampleinfo <- read_tsv("data/samplesheet_corrected.tsv", col_types="cccc") %>%
28+
mutate(Status = fct_relevel(Status, "Uninfected"))
29+
30+
simple.model <- as.formula(~ Status)
31+
32+
ddsObj.raw <- DESeqDataSetFromTximport(txi = txi,
33+
colData = sampleinfo,
34+
design = simple.model)
35+
36+
keep <- rowSums(counts(ddsObj.raw)) > 5
37+
ddsObj.filt <- ddsObj.raw[keep,]
38+
39+
ddsObj <- DESeq(ddsObj.filt)
40+
41+
results.simple <- results(ddsObj, alpha=0.05)
42+
```
43+
44+
### Exercise 2
1945

2046
> So far we have fitted a simple model considering just "Status", but in reality
2147
> we want to model the effects of both "Status" and "Time Point".
@@ -115,7 +141,7 @@ topGenesIvU <- as.data.frame(results.InfectedvUninfected) %>%
115141
topGenesIvU
116142
```
117143

118-
### Exercise 2
144+
### Exercise 3
119145

120146
> If we want a different contrast we can just pass the `results` function the
121147
> **name** of the contrast, as given by `resultsNames(ddsObj)`.
@@ -191,7 +217,7 @@ to all genes. Curiously then, this suggests that overall the simple model
191217
is more appropriate than the additive model. Let's look into the interaction
192218
model.
193219

194-
### Exercise 3
220+
### Exercise 4
195221
>
196222
> When we looked at the PCA it did seem that an interaction model might be
197223
> warranted. Let's test that.
@@ -283,7 +309,7 @@ sum(results.interaction.33$padj < 0.05, na.rm = TRUE)
283309
We can see that there is a strong difference in the effects of infection on
284310
gene expression between days 11 and 33.
285311

286-
### Exercise 4
312+
### Exercise 5
287313
>
288314
> Let's investigate the uninfected mice
289315
>

Markdowns/10_DE_analysis_with_DESeq2_part2.html

Lines changed: 469 additions & 0 deletions
Large diffs are not rendered by default.
270 KB
Binary file not shown.

0 commit comments

Comments
 (0)