-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy path00.sample_sequencing_info.Rmd
More file actions
75 lines (48 loc) · 2.57 KB
/
00.sample_sequencing_info.Rmd
File metadata and controls
75 lines (48 loc) · 2.57 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
---
title: "Samples and sequencing information"
output: github_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE, fig.retina = 2)
library(tidyverse)
library(ggpubr)
#library(ggbreak)
library(knitr)
source("scripts/my_color.R")
```
```{r map}
include_graphics("figures/Atenius-sampling.png")
```
*Acropora kenti* corals were collected from both 5 inshore and 4 offshore locations on the Great Barrier Reef (GBR).
- Inshore reefs
- Magnetic Island (MI, n=28)
- Pandora Reef (PR, n=30)
- Pelorus Island (PI, n=30)
- Dunk Island (DI, n=30)
- Fitzroy Island (FI, n=30)
- Offshore reefs
- Arlington Reef (ARL, n=20)
- Taylor Reef (TAY, n=20)
- Rib Reef (RIB, n=20)
- John Brewer Reef (JB, n=20)
The extracted DNA was sequenced at shallow depth **(~2-5x)** with 100bp paired sequences. Some samples were sequenced in multiple lines or flowcells, thus, one sample has several fastq files.
> Two samples (FI-1-3, MI-1-4) were sequenced at much higher depth ~20x. Aside from some specific analyses (eg structural variant dection) these files were down-sampled to average depth based on random selection of mapped reads from their bam files.
Data were sequenced in two batches. The first batch is from a previously [published study](https://www.science.org/doi/full/10.1126/sciadv.abc6318). The second was specifically sequenced for the present study.
- **Data of inshore reefs from Cooke et al 2020**
Each sample was sequenced in 2 lanes with paired-end reads, thus, 592 fastq files were generated from 148 samples.
- **Data of offshore reefs. This study**
80 samples, each with 2 lanes and 2 batches, from 640 fastq files.
List of filenames and sample info table can be found [here](https://docs.google.com/spreadsheets/d/1sArk4d6xUXZzDHxBPvQEfcudYWERo7W7Xp-Va3FbQeE/edit?usp=sharing)
**Sequencing data summary**
Data yield per sample ranged from 0.423-2.49Gbp.Based on a genome size of 487Mb (486,812,518), samples were sequenced at 0.87X-5.12X depths (except for two high sequencing depth samples: 23.84X,11.6Gbp; 25.90X,12.6Gbp).
```{r seq-data}
df <- read_csv("data/summary_data.csv") %>% mutate(pop_order=site_order()[pop])
df %>%
filter(!(sample_id %in% c("MI-1-4_S10","FI-1-3_S9"))) %>%
ggplot(aes(x=reorder(pop,pop_order),y=total_bases/1e+9,color=pop)) +
geom_boxplot() +
geom_point(color="darkgrey",size=.5) +
scale_color_manual(values = site_colors(),guide=F) +
theme_pubclean()+labs(x="",y="Total base (Gb)")
```
**Figure 1:** Total sequencing data available for all shallow coverage samples by reef.