bioinformatics-core-shared-training
diff --git a/Markdowns/06_Introduction_to_RNAseq_Analysis_in_R.Rmd b/Markdowns/06_Introduction_to_RNAseq_Analysis_in_R.Rmd
Lines changed: 3 additions & 3 deletions
diff --git a/Markdowns/06_Introduction_to_RNAseq_Analysis_in_R.html b/Markdowns/06_Introduction_to_RNAseq_Analysis_in_R.html
Lines changed: 11 additions & 7 deletions
diff --git a/Markdowns/07_Data_Exploration.Rmd b/Markdowns/07_Data_Exploration.Rmd
Lines changed: 3 additions & 6 deletions
diff --git a/Markdowns/07_Data_Exploration.Solutions.Rmd b/Markdowns/07_Data_Exploration.Solutions.Rmd
Lines changed: 1 addition & 1 deletion
diff --git a/Markdowns/07_Data_Exploration.html b/Markdowns/07_Data_Exploration.html
Lines changed: 19 additions & 16 deletions
diff --git a/Markdowns/07_Data_Exploration.pdf b/Markdowns/07_Data_Exploration.pdf
14.3 KB
@@ -1,13 +1,12 @@
 ---
 title: "Introduction to RNAseq analysis in R"
-date: "April 2021"
+date: "May 2022"
 output:
   ioslides_presentation:
     css: css/stylesheet.css
     logo: images/CRUK_Cambridge_Institute.png
     smaller: yes
     widescreen: yes
-  beamer_presentation: default
   slidy_presentation: default
 ---
 <!--
@@ -131,7 +130,8 @@ Length, GC content, sequence
     margin-left: 27%">
 <span style="color: #2e3192;">**Library composition**</span>
 
-Highly expressed genes overrepresented at the cost of lowly expressed genes
+Quantification is relative - changes in
+relative abundance for one gene will affect the relative abundances of other genes
 
 "Composition Bias"
 
 
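Note on the reworded "Library composition" slide: the point that quantification is relative can be illustrated with a small base R sketch. The abundances below are invented for illustration and are not taken from the course data. Four genes have the same true abundance in both samples except `geneA`, yet the share of every other gene changes once the values are expressed as a proportion of the library.

```r
# Hypothetical absolute abundances for four genes in two samples.
# Only geneA differs between the two samples.
abundance <- cbind(
  sample1 = c(geneA = 100, geneB = 100, geneC = 100, geneD = 100),
  sample2 = c(geneA = 700, geneB = 100, geneC = 100, geneD = 100)
)

# Sequencing measures relative abundance: each gene's share of the library.
# sweep() divides every column by its own column total.
relative <- sweep(abundance, 2, colSums(abundance), "/")

# geneB, geneC and geneD are unchanged in absolute terms, but their
# share of the library is smaller in sample2 because geneA takes up more of it.
print(relative)
```

In this toy case the proportion for `geneB` falls between the two samples although its absolute abundance is identical, which is the composition bias the slide describes.
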
@@ -82,8 +82,7 @@ need for the analysis today: name, cell type, status.
 ```{r loadSampleInfo, message = FALSE}
 # Read the sample information into a data frame
 sampleinfo <- read_tsv("data/samplesheet.tsv", col_types = c("cccc"))
-sampleinfo %>% 
-  arrange(Status, TimePoint, Replicate)
+arrange(sampleinfo, Status, TimePoint, Replicate)
 ```
 
 ## Reading in the count data
@@ -102,9 +101,8 @@ The Salmon quantification results are per transcript, we'll want to summarise
 to gene level. To this we need a table that relates transcript IDs to gene IDs.
 
 ```{r readSalmon}
-files <- str_c("salmon/", sampleinfo$SampleName, "/quant.sf")
+files <- file.path("salmon", sampleinfo$SampleName, "quant.sf")
 files <- set_names(files, sampleinfo$SampleName)
-
 tx2gene <- read_tsv("references/tx2gene.tsv")
 
 txi <- tximport(files, type = "salmon", tx2gene = tx2gene)
@@ -138,8 +136,7 @@ saveRDS(txi, file = "salmon_outputs/txi.rds")
 One of the most complex aspects of learning to work with data in `R` is 
 getting to grips with subsetting and manipulating data tables. The package 
 `dplyr` [@Wickham2018] was developed to make this process more intuitive than it
-is using standard base `R` processes. It also makes use of a new symbol `%>%`,
-called the "pipe", which makes the code a bit tidier. 
+is using standard base `R` processes. 
 
 In particular we will use the commands:
 
 
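Note on dropping the pipe in `07_Data_Exploration.Rmd`: the piped call and the direct call to `arrange()` should be interchangeable, since `%>%` passes its left-hand side as the first argument. The sketch below checks this on a small made-up sample sheet. The column names follow the diff, but the rows are hypothetical and stand in for `data/samplesheet.tsv`.

```r
library(dplyr)

# Hypothetical stand-in for the sample sheet; the values are invented.
sampleinfo <- tibble(
  SampleName = c("S1", "S2", "S3", "S4"),
  Status     = c("Uninfected", "Infected", "Uninfected", "Infected"),
  TimePoint  = c("d11", "d33", "d33", "d11"),
  Replicate  = c("1", "1", "1", "1")
)

# Old form from the removed lines: data frame piped into arrange().
piped <- sampleinfo %>%
  arrange(Status, TimePoint, Replicate)

# New form from the added line: data frame passed as the first argument.
direct <- arrange(sampleinfo, Status, TimePoint, Replicate)

# Compare the two results.
print(identical(piped, direct))
```

If the comparison holds, the change is purely stylistic and the rendered output of the chunk is unaffected.
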
@@ -25,7 +25,7 @@ library(ggfortify)
 # Read the sample information into R
 sampleinfo <- read_tsv("data/samplesheet.tsv", col_types = c("cccc"))
 # Read the data into R
-files <- str_c("salmon/", sampleinfo$SampleName, "/quant.sf")
+files <- file.path("salmon", sampleinfo$SampleName, "quant.sf")
 files <- set_names(files, sampleinfo$SampleName)
 
 tx2gene <- read_tsv("references/tx2gene.tsv")
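
Note on the switch from `str_c()` to `file.path()`: both build the same character vector here, because `file.path()` joins its arguments with `/` and is vectorised over the sample names. A minimal sketch, using invented sample names and base R `paste0()` as a stand-in for `stringr::str_c()`:

```r
# Hypothetical sample names; the real ones come from sampleinfo$SampleName.
sample_names <- c("SampleA", "SampleB")

# Old style: paste the separators in by hand
# (paste0() is used here as a base R stand-in for str_c()).
old_style <- paste0("salmon/", sample_names, "/quant.sf")

# New style: let file.path() insert the separators.
new_style <- file.path("salmon", sample_names, "quant.sf")

# Compare the two vectors of paths.
print(identical(old_style, new_style))
```

The `file.path()` form avoids hand-written slashes, so a stray or missing `/` in one of the pieces is less likely to slip in.
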