Commit 6cc9546

updated schedule, google link and added extra content re: file formats
1 parent cc8f394 commit 6cc9546

6 files changed

+59
-22
lines changed

07-01-FileFormats.Rmd

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
1+
# Supplementary Information
2+
3+
4+
## File Formats
5+
6+
Reference genomes and annotation files can be sourced from:
7+
* Ensembl database: https://asia.ensembl.org/info/data/ftp/index.html
8+
* UCSC database: https://hgdownload.soe.ucsc.edu/downloads.html
9+
* NCBI database: https://www.ncbi.nlm.nih.gov/guide/howto/dwn-genome/
10+
11+
The top of an Ensembl *Homo sapiens* FASTA file:
12+
13+
```{r, echo=FALSE, out.width="100%"}
14+
knitr::include_graphics("images/supplementary/chr_fasta_full_name.png")
15+
```
16+
17+
A FASTA file has a header line for each sequence, indicated by a leading `>`. In a genome FASTA file, the header gives the chromosome name and may contain some extra information. A minimal header contains only the chromosome name.
18+
19+
```{r, echo=FALSE, out.width="100%"}
20+
knitr::include_graphics("images/supplementary/chr_fasta.png")
21+
```
22+
23+
The lines following the header contain that chromosome’s sequence.
24+
25+
```{r, echo=FALSE, out.width="100%"}
26+
knitr::include_graphics("images/supplementary/fasta_seq.png")
27+
```
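
The header and sequence layout described above can be sketched in base R. This is a minimal illustration, not part of the workshop pipeline: the toy header text and sequences are invented, and the commented-out file path is hypothetical. For a real genome, a dedicated reader such as `Biostrings::readDNAStringSet()` is likely a better fit than `readLines()`.

```r
# Toy FASTA content; header text and sequences are made up for illustration.
fasta_lines <- c(
  ">1 dna:chromosome chromosome:GRCh38:1:1:20:1 REF",
  "ACGTACGTAC",
  "GTACGTACGT",
  ">2 dna:chromosome chromosome:GRCh38:2:1:10:1 REF",
  "TTGACCAGTA"
)
# With a real file (hypothetical path): fasta_lines <- readLines("genome.fa")

# Header lines are the ones starting with '>'
is_header <- startsWith(fasta_lines, ">")

# Chromosome name = text after '>' up to the first whitespace character
seq_names <- sub("^>([^[:space:]]+).*$", "\\1", fasta_lines[is_header])

# Assign every sequence line to the most recent header, then join the lines.
# This assumes each header is followed by at least one sequence line.
group <- cumsum(is_header)[!is_header]
sequences <- vapply(
  split(fasta_lines[!is_header], group),
  paste, character(1), collapse = ""
)
names(sequences) <- seq_names

# Length of each chromosome sequence, named by chromosome
seq_lengths <- nchar(sequences)
print(seq_lengths)
```

The names in `seq_names` are the values that the first column of the annotation file needs to match.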
28+
29+
Annotation files are usually in GTF or GFF3 format. Below is a GTF file:
30+
31+
```{r, echo=FALSE, out.width="100%"}
32+
knitr::include_graphics("images/supplementary/gtf_file.png")
33+
```
34+
35+
A GTF file is a tab-separated file, meaning its columns are delimited by tab characters. A GTF file always has 9 columns containing the following information (adapted from the Ensembl GFF/GTF file format documentation):
36+
37+
1. seqname - name of the chromosome or scaffold; chromosome names can be given with or without the 'chr' prefix. Note: the chromosome name format must match the FASTA file, e.g. if the FASTA file has `chr1` then the GTF file should also have `chr1` in this column. If the FASTA file has `1` then the GTF file should have `1` in this column.
38+
2. source - name of the program that generated this feature, or the data source (database or project name)
39+
3. feature - feature type name, e.g. Gene, Variation, Similarity
40+
4. start - Start position of the feature (inclusive), with sequence numbering starting at 1.
41+
5. end - End position of the feature (inclusive), with sequence numbering starting at 1.
42+
6. score - A floating point value.
43+
7. strand - defined as + (forward) or - (reverse).
44+
8. frame - One of '0', '1' or '2'. '0' indicates that the first base of the feature is the first base of a codon, '1' that the second base is the first base of a codon, and so on.
45+
9. attribute - A semicolon-separated list of tag-value pairs, providing additional information about each feature.
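
As a rough sketch of how the nine columns listed above can be used, the base R example below reads a small GTF into a data frame, computes feature lengths from the 1-based inclusive coordinates, pulls `gene_id` out of the attribute column, and checks the chromosome names against a FASTA file. The GTF records, the `fasta_names` vector and the commented-out file path are all invented for illustration.

```r
# Toy GTF content; every record here is made up for illustration.
gtf_text <- paste(
  "#!genome-build GRCh38",
  "1\texample_source\tgene\t1000\t1999\t.\t+\t.\tgene_id \"GENE0001\"; gene_name \"toyA\";",
  "1\texample_source\texon\t1000\t1200\t.\t+\t.\tgene_id \"GENE0001\"; gene_name \"toyA\";",
  "chr2\texample_source\tgene\t500\t899\t.\t-\t.\tgene_id \"GENE0002\"; gene_name \"toyB\";",
  sep = "\n"
)

# The nine GTF columns, in order
gtf_cols <- c("seqname", "source", "feature", "start", "end",
              "score", "strand", "frame", "attribute")

gtf <- read.table(
  text = gtf_text,          # for a real file (hypothetical path): file = "annotation.gtf"
  header = FALSE,
  sep = "\t",
  comment.char = "#",       # skip the '#' metadata lines at the top of the file
  quote = "",               # keep the double quotes inside the attribute column
  col.names = gtf_cols,
  colClasses = c("character", "character", "character", "integer", "integer",
                 "character", "character", "character", "character")
)

# start and end are both 1-based and inclusive, so length = end - start + 1
gtf$width <- gtf$end - gtf$start + 1L

# Extract the gene_id tag from the semicolon-separated attribute column
gtf$gene_id <- sub('.*gene_id "([^"]+)".*', "\\1", gtf$attribute)

# Chromosome names taken from the FASTA headers (assumed values)
fasta_names <- c("1", "2")

# Any seqname missing from the FASTA file signals a naming mismatch
missing_names <- setdiff(unique(gtf$seqname), fasta_names)
print(missing_names)
```

Here the third record uses `chr2` while the FASTA names have no `chr` prefix, so `missing_names` flags it as the kind of mismatch described under seqname.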

images/supplementary/chr_fasta.png

354 KB
708 KB

images/supplementary/fasta_seq.png

1.52 MB

images/supplementary/gtf_file.png

500 KB

index.Rmd

Lines changed: 14 additions & 22 deletions
@@ -17,40 +17,32 @@ github-repo: https://github.com/MonashBioinformaticsPlatform/RNAseq_workshop_202
1717
# Getting started
1818

1919

20-
- **[All communications and important links will be in this Drive Document](https://docs.google.com/document/d/1jykhTx23IHoTJvu6rCcs6_cuPznlMLYQZqih_C2-kfs/edit)**
20+
- **[All communications and important links will be in this Drive Document](https://docs.google.com/document/d/1qNgOxzwidhIzHRZMx_iJVGjL0kVqW9ru4wAlBJjhViY/edit?usp=sharing)**
2121

2222

23-
- **Instructors:** Adele Barugahare, Nitika Kandhari, Scott Coutts, Andrew Perry, Paul Harrison, Laura Perlaza-Jimenez
23+
- **Instructors:** Adele Barugahare, Nitika Kandhari, Natasha Ng, Andrew Perry, Paul Harrison, Laura Perlaza-Jimenez, Giulia Iacono
2424

2525

2626
## Schedule
2727

28-
This workshop is 2 sessions long, each 4 hours
28+
This workshop runs for one full day.
2929

3030
**First Day**
3131

3232
| Time | Content |
3333
|:---:|:---:|
34-
| 10:00 | Getting started and introduction |
35-
| 10:10 | Planning an RNAseq Experiment |
36-
| 11:00 | Experimental Design |
37-
| 11:20 | 00:10 Break |
38-
| 11:30 | Library Preparation |
39-
| 12:40 | 00:20 Lunch Break |
34+
| 09:00 | Getting started and introduction |
35+
| 09:10 | Planning an RNAseq Experiment |
36+
| 09:30 | Experimental Design |
37+
| 10:00 | Break (10 mins) |
38+
| 10:10 | Library Preparation |
39+
| 11:10 | Pipeline Overview |
40+
| 12:30 | Lunch Break (30 mins) |
4041
| 13:00 | Pipeline Overview |
41-
| 14:00 | End of first session |
42-
43-
**Second Day**
44-
45-
| Time | Content |
46-
|:---:|:---:|
47-
| 10:00 | Pipeline Overview |
48-
| 11:20 | 00:10 Break |
49-
| 11:30 | Pipeline Overview |
50-
| 12:00 | Differential Expression |
51-
| 12:40 | 00:20 Lunch Break |
52-
| 13:00 | Differential Expression |
53-
| 14:00 | End of second session |
42+
| 13:30 | Differential Expression |
43+
| 14:30 | Break (10 mins) |
44+
| 14:40 | Differential Expression |
45+
| 15:50 | End remarks |
5446

5547

5648
## Summary
