@@ -43,6 +43,7 @@ This release includes 86 samples from 11 different cell lines.
 You can access the following data through the [AWS Open Data Registry](https://registry.opendata.aws/sgnex/):
 
 - raw files (fast5)
+- raw files (blow5)
 - basecalled files (fastq)
 - aligned reads (genome and transcriptome) (bam)
 - tracks for visualisation (bigwig and bigbed)
@@ -51,14 +52,24 @@ You can access the following data through the [AWS Open Data Registry](https://r
 - annotation files
 - detailed sample and experiment information
 
-You can browse the S3 data [here](http://sg-nex-data.s3-website-ap-southeast-1.amazonaws.com/).
+You can browse the S3 data here: 1) [fast5, fastq, and bam](http://sg-nex-data.s3-website-ap-southeast-1.amazonaws.com/) and 2) [blow5](http://sg-nex-data-blow5.s3-website-ap-southeast-1.amazonaws.com/).
 
 Please refer to the [data access tutorial](docs/AWS_data_access_tutorial.md) which describes the S3 data structure and how to access files with [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/s3/). The direct links to the data are listed in the [sample spreadsheet](docs/samples.tsv).
 
 _**Citation**_: Please cite the pre-print describing the SG-NEx data resource when using these data, and add the following details: "The SG-NEx data was accessed on [DATE] at registry.opendata.aws/sg-nex-data".
 
 Chen, Y. _et al._ "A systematic benchmark of Nanopore long read RNA sequencing for transcript level analysis in human cell lines." _bioRxiv_ (2021). doi: https://doi.org/10.1101/2021.04.21.440736
 
+**Release Note**
+
+Version Number: V0.4.0
+Date: 2023-03-06
+Update of the SG-NEx data on AWS. Includes raw signal data in blow5 format.
+
+Version Number: V0.3.0
+Date: 2022-07-28
+Initial release of the SG-NEx data on AWS. Includes Nanopore direct RNA, cDNA, direct cDNA-Seq, short read RNA-Seq and m6ACE-Seq.
+
 **Release History**
 
 You can find previous releases here in the [release history](https://github.com/GoekeLab/sg-nex-data/releases)
@@ -89,6 +100,8 @@ The following short tutorials are available that demonstrate how to analyse the
 
 -[Identification of m6A with the SG-NEx samples (using m6Anet)](./docs/SG-NEx_m6Anet_tutorial.md)
 
+-[Basecalling and analysing SG-NEx samples in S/BLOW5 format](./docs/SG-NEx_blow5_tutorial.md)
+
 Additional, more detailed workflows can be found here:
 
 -[Transcript discovery, quantification, and differential transcript expression from long read RNA-Seq data (using Bambu)](https://github.com/GoekeLab/bambu)
@@ -99,16 +112,14 @@ Additional, more detailed workflows can be found here:
 
 
 ## Contributors
-
 **GIS Sequencing Platform and Data Generation**
 Hwee Meng Low, Yao Fei, Sarah Ng, Wendy Soon, CC Khor
-Ying Chen, Nadia M. Davidson, Harshil Patel, Yuk Kei Wan, Min Hao Ling, Yu Song Chuah, Naruemon Pratanwanich, Christopher Hendra, Laura Watten, Chelsea Sawyer, Dominik Stanojevic, Philip Andrew Ewels, Andreas Wilm, Mile Sikic, Alexandre Thiery, Michael I. Love, Alicia Oshlak, Jonathan Göke
+Ying Chen, Hasindu Gamaarachchi, Nadia M. Davidson, Harshil Patel, Yuk Kei Wan, Min Hao Ling, Yu Song Chuah, Naruemon Pratanwanich, Christopher Hendra, Laura Watten, Chelsea Sawyer, Dominik Stanojevic, Philip Andrew Ewels, Andreas Wilm, Mile Sikic, Alexandre Thiery, Michael I. Love, Alicia Oshlak, Jonathan Göke
docs/AWS_README
Lines changed: 1 addition & 1 deletion
@@ -10,5 +10,5 @@ The SG-NEx data set is documented here: https://github.com/GoekeLab/sg-nex-data
 
 The folder structure and data access tutorial is described here: https://github.com/GoekeLab/sg-nex-data/blob/master/docs/AWS_data_access_tutorial.md
 
-The data browser link is here: http://sg-nex-data.s3-website-ap-southeast-1.amazonaws.com/
+The data browser link is here: http://sg-nex-data.s3-website-ap-southeast-1.amazonaws.com/ and http://sg-nex-data-blow5.s3-website-ap-southeast-1.amazonaws.com/
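Reviewer note (a sketch, not part of this commit): the two browser links above are S3 static-website endpoints, whose hostnames begin with the bucket name. The helper below is hypothetical and only illustrates how the corresponding `s3://` URI could be derived from such a link for use with AWS CLI. It assumes the blow5 bucket allows the same anonymous access (`--no-sign-request`) that the existing tutorial uses for `sg-nex-data`; the commit does not confirm this.

```shell
# Hypothetical helper (not in the repository): turn an S3 static-website URL
# into an s3:// bucket URI using plain parameter expansion.
website_to_s3_uri() {
  host="${1#*://}"                        # drop the scheme (http://)
  host="${host%%/*}"                      # drop any trailing path
  printf 's3://%s/\n' "${host%%.s3-website*}"   # keep only the bucket name
}

website_to_s3_uri "http://sg-nex-data-blow5.s3-website-ap-southeast-1.amazonaws.com/"

# Assumed usage, requires AWS CLI and network access (left commented out):
# aws s3 ls --no-sign-request "$(website_to_s3_uri "http://sg-nex-data-blow5.s3-website-ap-southeast-1.amazonaws.com/")"
```

The listing command is left commented out so the snippet runs without AWS CLI installed; the data access tutorial remains the authoritative description of the folder structure.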
docs/AWS_RELEASE_NOTE
Lines changed: 9 additions & 0 deletions
@@ -1,5 +1,14 @@
 The Singapore Nanopore Expression (SG-NEx) Project: Release updates on AWS S3 open data
 
+Version Number: V0.4.0
+Date: 2023-03-06
+By: Ying Chen, Genome Insitute of Singapore
+Update of the SG-NEx data on AWS. Includes raw signal data in blow5 format. Please refer to https://github.com/GoekeLab/sg-nex-data for a detailed documentation.
+
+Data access tutorial: https://github.com/GoekeLab/sg-nex-data/blob/master/docs/AWS_data_access_tutorial.md
+Data browser link: http://sg-nex-data-blow5.s3-website-ap-southeast-1.amazonaws.com/
+Contact and questions: https://github.com/GoekeLab/sg-nex-data/discussions
 Long read direct RNA sequencing has allows the detection of RNA modification with RNA modification tools, such as [xPore](https://github.com/GoekeLab/xpore) and [m6Anet](https://github.com/GoekeLab/m6anet). To simplify the analysis of RNA modifications using the SG-Nex datasets, you can download the processed files to use with xPore and m6Anet.
-
 To download the processed data for differential RNA modification analysis with xPore:
 ```bash
 aws s3 ls --no-sign-request s3://sg-nex-data/data/processed_data/xpore/ # list all samples that have processed data for RNA modification detection using xPore
@@ -106,7 +116,7 @@ These files are provided for a subset of samples, please see [here](/docs/sample
 
 # Sample and experimental data
 
-Detailed information for each sequencing sample is provided [here](/docs/samples.tsv). The data also includes multiplexed samples which share the same fast5 files. The information about the multiplexed samples can be found [here](/docs/multiplexed_samples.tsv). The files can also be accessed directly on S3:
+Detailed information for each sequencing sample is provided [here](/docs/samples.tsv). The data also includes multiplexed samples which share the same fast5/blow5 files. The information about the multiplexed samples can be found [here](/docs/multiplexed_samples.tsv). The files can also be accessed directly on S3:
docs/SG-NEx_Bambu_tutorial.md
Lines changed: 2 additions & 2 deletions
@@ -46,7 +46,7 @@ Next, we will need to download the required data to run Bambu. The required data
 
 Generally, you may want to learn how to get access to these data using the [data
 access
-tutorial](https://github.com/GoekeLab/sg-nex-data/blob/updated-documentation/docs/AWS_data_access_tutorial.md). Below we only show the necessary steps to download the required data. The following command requires you to have [AWS CLI](https://aws.amazon.com/cli/) installed.
+tutorial](AWS_data_access_tutorial.md). Below we only show the necessary steps to download the required data. The following command requires you to have [AWS CLI](https://aws.amazon.com/cli/) installed.
 You may also download the required data directly from the [SG-NEx AWS S3
 bucket](http://sg-nex-data.s3-website-ap-southeast-1.amazonaws.com/) if you are unfamiliar with AWS CLI command. They are stored in the `data/data_tutorial/bam` folder.
 
-**NOTE: We have downsampled the Hg38 genome, A549 and HepG2 samples to ensure this tutorial can be completed in 10 minutes. If you want to run Bambu on the original samples, you can find the sample name [here](https://github.com/GoekeLab/sg-nex-data/blob/updated-documentation/docs/samples.tsv) and amend it into the following code chunk:**
+**NOTE: We have downsampled the Hg38 genome, A549 and HepG2 samples to ensure this tutorial can be completed in 10 minutes. If you want to run Bambu on the original samples, you can find the sample name [here](samples.tsv) and amend it into the following code chunk:**
 
 ```bash
 # Note: Please make sure to replace the "sample_alias" with your sample name