GoekeLab
diff --git a/README.md b/README.md
Lines changed: 16 additions & 5 deletions
diff --git a/docs/AWS_README b/docs/AWS_README
Lines changed: 1 addition & 1 deletion
diff --git a/docs/AWS_RELEASE_NOTE b/docs/AWS_RELEASE_NOTE
Lines changed: 9 additions & 0 deletions
diff --git a/docs/AWS_data_access_tutorial.md b/docs/AWS_data_access_tutorial.md
old mode 100644
new mode 100755
Lines changed: 20 additions & 10 deletions
diff --git a/docs/SG-NEx_Bambu_tutorial.md b/docs/SG-NEx_Bambu_tutorial.md
Lines changed: 2 additions & 2 deletions
@@ -32,7 +32,7 @@ https://groups.google.com/forum/#!forum/sg-nex-updates/join
 
 ## Data Release and Access
 
-**Latest Release (v0.3)**
+**Latest Release (v0.4)**
 
 [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.5574654.svg)](https://doi.org/10.5281/zenodo.5574654)
 
@@ -43,6 +43,7 @@ This release includes 86 samples from 11 different cell lines.
 You can access the following data through the [AWS Open Data Registry](https://registry.opendata.aws/sgnex/):
 
 - raw files (fast5)
+- raw files (blow5)
 - basecalled files (fastq)
 - aligned reads (genome and transcriptome) (bam)
 - tracks for visualisation (bigwig and bigbed)
@@ -51,14 +52,24 @@ You can access the following data through the [AWS Open Data Registry](https://r
 - annotation files
 - detailed sample and experiment information
 
-You can browse the S3 data [here](http://sg-nex-data.s3-website-ap-southeast-1.amazonaws.com/). 
+You can browse the S3 data here: 1) [fast5, fastq, and bam](http://sg-nex-data.s3-website-ap-southeast-1.amazonaws.com/) and 2) [blow5](http://sg-nex-data-blow5.s3-website-ap-southeast-1.amazonaws.com/). 
 
 Please refer to the [data access tutorial](docs/AWS_data_access_tutorial.md) which describes the S3 data structure and how to access files with [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/s3/). The direct links to the data are listed in the [sample spreadsheet](docs/samples.tsv).
 
 _**Citation**_: Please cite the pre-print describing the SG-NEx data resource when using these data, and add the following details: "The SG-NEx data was accessed on [DATE] at registry.opendata.aws/sg-nex-data".
 
 Chen, Y. _et al._ "A systematic benchmark of Nanopore long read RNA sequencing for transcript level analysis in human cell lines." _bioRxiv_ (2021). doi: https://doi.org/10.1101/2021.04.21.440736
 
+**Release Note**
+
+Version Number: V0.4.0                
+Date: 2023-03-06                          
+Update of the SG-NEx data on AWS. Includes raw signal data in blow5 format. 
+
+Version Number: V0.3.0               
+Date: 2022-07-28                 
+Initial release of the SG-NEx data on AWS. Includes Nanopore direct RNA, cDNA, direct cDNA-Seq, short read RNA-Seq and m6ACE-Seq.
+
 **Release History**
 
 You can find previous releases here in the [release history](https://github.com/GoekeLab/sg-nex-data/releases)
@@ -89,6 +100,8 @@ The following short tutorials are available that demonstrate how to analyse the
 
 - [Identification of m6A with the SG-NEx samples (using m6Anet)](./docs/SG-NEx_m6Anet_tutorial.md)
 
+- [Basecalling and analysing SG-NEx samples in S/BLOW5 format](./docs/SG-NEx_blow5_tutorial.md)
+
 Additional, more detailed workflows can be found here:
 
 - [Transcript discovery, quantification, and differential transcript expression from long read RNA-Seq data (using Bambu)](https://github.com/GoekeLab/bambu)
@@ -99,16 +112,14 @@ Additional, more detailed workflows can be found here:
 
 
 ## Contributors
-
 **GIS Sequencing Platform and Data Generation**            
 Hwee Meng Low, Yao Fei, Sarah Ng, Wendy Soon, CC Khor   
 
 **Cancer Genomics and RNA Modifications**            
 Viktoriia Iakovleva, Puay Leng Lee, Lixia Xin, Hui En Vanessa Ng, Jia Min Loo, Xuewen Ong, Hui Qi Amanda Ng, Suk Yeah Polly Poon, Hoang-Dai Tran, Kok Hao Edwin Lim, Huck Hui Ng, Boon Ooi Patrick Tan, Huck-Hui Ng, N.Gopalakrishna Iyer, Wai Leong Tam, Wee Joo Chng, Leilei Chen, Ramanuj DasGupta, Yun Shen Winston Chan, Qiang Yu, Torsten Wüstefeld, Wee Siong Sho Goh
 
 **Statistical Modeling and Data Analytics**                     
-Ying Chen, Nadia M. Davidson, Harshil Patel, Yuk Kei Wan, Min Hao Ling, Yu Song Chuah, Naruemon Pratanwanich, Christopher Hendra, Laura Watten, Chelsea Sawyer, Dominik Stanojevic, Philip Andrew Ewels, Andreas Wilm, Mile Sikic, Alexandre Thiery, Michael I. Love, Alicia Oshlak, Jonathan Göke
-
+Ying Chen, Hasindu Gamaarachchi, Nadia M. Davidson, Harshil Patel, Yuk Kei Wan, Min Hao Ling, Yu Song Chuah, Naruemon Pratanwanich, Christopher Hendra, Laura Watten, Chelsea Sawyer, Dominik Stanojevic, Philip Andrew Ewels, Andreas Wilm, Mile Sikic, Alexandre Thiery, Michael I. Love, Alicia Oshlak, Jonathan Göke
 ## Citing the SG-NEx project
 
 The SG-NEx resource is described in:
 
@@ -10,5 +10,5 @@ The SG-NEx data set is documented here: https://github.com/GoekeLab/sg-nex-data
 
 The folder structure and data access tutorial is described here: https://github.com/GoekeLab/sg-nex-data/blob/master/docs/AWS_data_access_tutorial.md
 
-The data browser link is here: http://sg-nex-data.s3-website-ap-southeast-1.amazonaws.com/
+The data browser link is here: http://sg-nex-data.s3-website-ap-southeast-1.amazonaws.com/ and http://sg-nex-data-blow5.s3-website-ap-southeast-1.amazonaws.com/
 
@@ -1,5 +1,14 @@
 The Singapore Nanopore Expression (SG-NEx) Project: Release updates on AWS S3 open data 
 
+Version Number: V0.4.0              
+Date: 2023-03-06
+By: Ying Chen, Genome Institute of Singapore
+Update of the SG-NEx data on AWS. Includes raw signal data in blow5 format. Please refer to https://github.com/GoekeLab/sg-nex-data for detailed documentation.
+
+Data access tutorial: https://github.com/GoekeLab/sg-nex-data/blob/master/docs/AWS_data_access_tutorial.md
+Data browser link: http://sg-nex-data-blow5.s3-website-ap-southeast-1.amazonaws.com/
+Contact and questions: https://github.com/GoekeLab/sg-nex-data/discussions
+
 
 Version Number: V0.3.0               
 Date: 2022-07-28
 
@@ -4,15 +4,18 @@ SG-NEx data source contains long read (Oxford Nanopore) RNA sequencing data for
 
 The SG-NEx S3 bucket contains the following types of data:
 
-   - [Raw sequencing signal (fast5)](#raw-sequencing-signal)            
-   - [Basecalled sequences (fastq)](#basecalled-sequences)            
-   - [Aligned sequences (bam)](#aligned-sequences)     
-   - [Data visualisation tracks (bigwig/bigbed)](#data-visualisation-tracks)        
-   - [Annotations](#annotations)            
-   - [Processed data for RNA modification detection](#processed-data)     
-   - [Sample and experiment information](#sample-and-experimental-data)               
+   - [Raw sequencing signal (fast5)](#raw-sequencing-signal)
+   - [Basecalled sequences (fastq)](#basecalled-sequences)
+   - [Aligned sequences (bam)](#aligned-sequences)
+   - [Data visualisation tracks (bigwig/bigbed)](#data-visualisation-tracks)
+   - [Annotations](#annotations)
+   - [Processed data for RNA modification detection](#processed-data)
+   - [Sample and experiment information](#sample-and-experimental-data)
 
- Below is the folder index for the open data bucket:
+The SG-NEx S3 BLOW5 bucket contains the following types of data:
+   - [Raw sequencing signal (blow5)](#raw-sequencing-signal-in-blow5-format)
+
+Below is the folder index for the open data buckets:
 
 ![folder indexing\!](/images/folder_index.png)
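
The two buckets named in the hunk above are laid out differently: fast5 files sit under `data/sequencing_data_ont/fast5/` in `sg-nex-data`, while blow5 sample folders sit at the top level of `sg-nex-data-blow5`. As an illustration only (it is not part of this change set), a small shell helper could build the matching `aws s3 sync` command for either format. The function name `sgnex_sync_cmd` and the `sample_name` placeholder are hypothetical; bucket names and prefixes are copied from the tutorial text in this diff, and the command is only printed, never executed.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) the aws s3 sync command for one sample.
# Bucket names and prefixes are taken from the tutorial text in this diff.
sgnex_sync_cmd() {
  local format="$1" sample="$2" dest="${3:-.}"
  local src
  case "$format" in
    fast5) src="s3://sg-nex-data/data/sequencing_data_ont/fast5/${sample}" ;;
    blow5) src="s3://sg-nex-data-blow5/${sample}" ;;
    *) echo "unsupported format: ${format}" >&2; return 1 ;;
  esac
  echo "aws s3 sync --no-sign-request ${src} ${dest}"
}

# Replace sample_name with a sample listed in docs/samples.tsv.
sgnex_sync_cmd fast5 sample_name
sgnex_sync_cmd blow5 sample_name ./blow5_data
```

Printing the command first makes it easy to check the source path before starting a large download; piping the output to `bash` would then run it, assuming AWS CLI is installed.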
 
@@ -24,6 +27,14 @@ aws s3 ls --no-sign-request s3://sg-nex-data/data/sequencing_data_ont/fast5/ # l
 aws s3 sync --no-sign-request s3://sg-nex-data/data/sequencing_data_ont/fast5/sample_name .    # download fast5 files to your local directory
 ```
 
+# Raw sequencing signal in BLOW5 format
+To access raw sequencing (blow5) files:
+
+```bash
+aws s3 ls --no-sign-request s3://sg-nex-data-blow5/ # list samples 
+aws s3 sync --no-sign-request s3://sg-nex-data-blow5/sample_name .    # download blow5 file and the index to your local directory
+```
+
 # Basecalled sequences
 To access basecalled sequencing (fastq) files:
 
@@ -90,7 +101,6 @@ aws s3 sync --no-sign-request s3://sg-nex-data/data/annotations/gtf_file .  # do
 
 ## RNA modification detection
  Long read direct RNA sequencing allows the detection of RNA modifications with tools such as [xPore](https://github.com/GoekeLab/xpore) and [m6Anet](https://github.com/GoekeLab/m6anet). To simplify the analysis of RNA modifications using the SG-NEx datasets, you can download the processed files to use with xPore and m6Anet. 
- 
  To download the processed data for differential RNA modification analysis with xPore:
  ```bash
 aws s3 ls --no-sign-request s3://sg-nex-data/data/processed_data/xpore/  # list all samples that have processed data for RNA modification detection using xPore
@@ -106,7 +116,7 @@ These files are provided for a subset of samples, please see [here](/docs/sample
 
 # Sample and experimental data 
 
-Detailed information for each sequencing sample is provided [here](/docs/samples.tsv). The data also includes multiplexed samples which share the same fast5 files. The information about the multiplexed samples can be found [here](/docs/multiplexed_samples.tsv). The files can also be accessed directly on S3:
+Detailed information for each sequencing sample is provided [here](/docs/samples.tsv). The data also includes multiplexed samples which share the same fast5/blow5 files. The information about the multiplexed samples can be found [here](/docs/multiplexed_samples.tsv). The files can also be accessed directly on S3:
 
 
  ```bash
 
@@ -46,7 +46,7 @@ Next, we will need to download the required data to run Bambu. The required data
 
 Generally, you may want to learn how to get access to these data using the [data
 access
-tutorial](https://github.com/GoekeLab/sg-nex-data/blob/updated-documentation/docs/AWS_data_access_tutorial.md). Below we only show the necessary steps to download the required data. The following command requires you to have [AWS CLI](https://aws.amazon.com/cli/) installed.
+tutorial](AWS_data_access_tutorial.md). Below we only show the necessary steps to download the required data. The following command requires you to have [AWS CLI](https://aws.amazon.com/cli/) installed.
 
 ``` bash
 # create a directory to store the data
@@ -68,7 +68,7 @@ aws s3 sync --no-sign-request s3://sg-nex-data/data/data_tutorial/bam ./bambu_tu
 You may also download the required data directly from the [SG-NEx AWS S3
 bucket](http://sg-nex-data.s3-website-ap-southeast-1.amazonaws.com/) if you are unfamiliar with AWS CLI command. They are stored in the `data/data_tutorial/bam` folder.
 
-**NOTE: We have downsampled the Hg38 genome, A549 and HepG2 samples to ensure this tutorial can be completed in 10 minutes. If you want to run Bambu on the original samples, you can find the sample name [here](https://github.com/GoekeLab/sg-nex-data/blob/updated-documentation/docs/samples.tsv) and amend it into the following code chunk:**
+**NOTE: We have downsampled the Hg38 genome, A549 and HepG2 samples to ensure this tutorial can be completed in 10 minutes. If you want to run Bambu on the original samples, you can find the sample name [here](samples.tsv) and amend it into the following code chunk:**
 
 ```bash
 # Note: Please make sure to replace the "sample_alias" with your sample name