GitHub - cyntsc/meta-analisis-athal: Backup of the meta-analysis project on RNA-Seq data for Arabidopsis plants infected by fungi

This repository contains the metatranscriptomic analysis carried out to identify genetic patterns related to the consensus response of Arabidopsis to diverse biotic stressors caused by fungi.

Experimental Design

The experiment was designed in 2 blocks. The first block has 8 transcriptomes from healthy plants arranged in 4 control groups (healthy12, healthy18, healthy24 and healthy30). The second block has 17 transcriptomes from infected plants interacting with three Ascomycete fungi (Bc=Botrytis cinerea, Ch=Colletotrichum higginsianum, and Ss=Sclerotinia sclerotiorum) arranged in 6 treatment groups (Bc12, Bc18, Bc24, Ch22, Ch40 and Ss30). The digits give the sampling time in hours post-inoculation (hpi) with the fungus.
The treatments represent 68% of the included samples (32% for C. higginsianum, 24% for B. cinerea and 12% for S. sclerotiorum); the remaining 32% corresponds to the controls.
RNA was extracted from leaf tissue sampled between 0 and 48 hpi.
Only transcriptomes sequenced on Illumina platforms with sequence length > 100 and reads > 5G were considered.
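The sample composition stated above can be checked with a few lines of arithmetic. This is a minimal sketch: the per-fungus counts are tallied from the SRA accessions listed in this README, and the variable and function names are illustrative only.

```python
# Number of transcriptomes per group, tallied from the accessions in this README.
infected = {
    "Colletotrichum higginsianum": 8,  # Ch22 + Ch40
    "Botrytis cinerea": 6,             # Bc12 + Bc18 + Bc24
    "Sclerotinia sclerotiorum": 3,     # Ss30
}
healthy = 8  # mock-treated controls

total = sum(infected.values()) + healthy


def percent(n, total):
    """Share of the total, rounded to a whole-number percentage."""
    return round(100 * n / total)


shares = {name: percent(n, total) for name, n in infected.items()}
shares["controls"] = percent(healthy, total)
treatment_share = percent(sum(infected.values()), total)

for name, value in shares.items():
    print(f"{name}: {value}%")
print(f"all treatments: {treatment_share}%")
```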

Material & Methods

The data are bulk RNA-Seq libraries downloaded from the NCBI SRA.
Infected Arabidopsis libraries, by BioProject ID:
PRJNA148307 (SRR364389, SRR364390, SRR364391, SRR364392, SRR364400, SRR364401, SRR364398 and SRR364399): Colletotrichum higginsianum at 22 and 40 hpi.
PRJNA315516 (SRR3383696, SRR3383697, SRR3383779 and SRR3383780): Botrytis cinerea at 12 and 18 hpi.
PRJNA593073 (SRR10586397 and SRR10586399): Botrytis cinerea at 24 hpi.
PRJNA418121 (SRR6283146, SRR6283147 and SRR6283148): Sclerotinia sclerotiorum at 30 hpi.
Healthy Arabidopsis libraries, downloaded from the same repository, by BioProject ID:
PRJNA315516: mock treatment at 12 hr (SRR3383640 and SRR3383641), at 18 hr (SRR3383782 and SRR3383783) and at 30 hr (SRR3383821 and SRR3383822).
PRJNA418121 (SRR6283144 and SRR6283145): mock treatment at 30 hr.
Raw counts were obtained by aligning the reads to a reference genome. The TAIR10 genome (GenBank accessions CP002684 – CP002688) and the Araport11 annotation were used, with a target of 27655 protein-coding sequences (CDS).
Genes were clustered with WGCNA, an unsupervised machine-learning approach. Signed networks were built with Pearson correlation. A threshold of 𝛃=0.80 was set for the signed network, and modules were merged at a Euclidean distance of 0.1. Genetic modules with corr ≷ 0.75 were extracted.
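To illustrate what a signed Pearson network means, the sketch below computes the standard signed adjacency, a_ij = ((1 + cor_ij) / 2)^power, for a toy expression table. It is not the project's R code: the gene names, the expression values and the power of 12 are placeholders chosen for the example, and the actual settings live in the WGCNA scripts of this repository.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    r = cov / (sd_x * sd_y)
    # Clamp to [-1, 1] to absorb floating-point rounding.
    return max(-1.0, min(1.0, r))


def signed_adjacency(expr, power):
    """Signed adjacency ((1 + r) / 2) ** power for every gene pair.

    expr maps a gene name to its expression values across samples.
    """
    genes = list(expr)
    return {
        (g1, g2): ((1 + pearson(expr[g1], expr[g2])) / 2) ** power
        for g1 in genes
        for g2 in genes
    }


# Toy data (hypothetical genes and values).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises
}
adjacency = signed_adjacency(expression, power=12)  # placeholder power
print(adjacency[("geneA", "geneB")], adjacency[("geneA", "geneC")])
```

In a signed network, positively correlated genes get an adjacency near 1 and negatively correlated genes get an adjacency near 0, so opposite expression patterns are not grouped into the same module.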
Genetic modules of interest in infected plants were identified through logical operations, extracting the modules that differ from those of the healthy plants by between 77% and 100%.
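One possible reading of this selection step, sketched with Python sets. The definition of "differentiated" used here is an assumption, not taken from the project's scripts: it is the fraction of genes in an infected module that are absent from the healthy module sharing the most genes with it. Module and gene names are hypothetical.

```python
def differentiation(infected_genes, healthy_modules):
    """Fraction of an infected module's genes missing from its closest healthy module.

    Assumed definition: "closest" is the healthy module with the largest overlap.
    """
    if not infected_genes:
        raise ValueError("infected module is empty")
    best_overlap = max(
        (len(infected_genes & genes) for genes in healthy_modules.values()),
        default=0,
    )
    return 1 - best_overlap / len(infected_genes)


def modules_of_interest(infected_modules, healthy_modules, low=0.77, high=1.0):
    """Keep infected modules whose differentiation lies in [low, high]."""
    selected = {}
    for name, genes in infected_modules.items():
        score = differentiation(genes, healthy_modules)
        if low <= score <= high:
            selected[name] = score
    return selected


# Toy modules (hypothetical names and gene identifiers).
healthy_modules = {
    "blue": {"g1", "g2", "g3", "g4"},
    "brown": {"g5", "g6"},
}
infected_modules = {
    "red": {"g1", "g2", "g3", "g7"},           # shares 3 of 4 genes with "blue"
    "green": {"g5", "g8", "g9", "g10", "g11"},  # shares 1 of 5 genes with "brown"
    "pink": {"g12", "g13"},                     # shares nothing
}
selected = modules_of_interest(infected_modules, healthy_modules)
print(sorted(selected))
```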

Folder content:

HTC_scripts_biotools: scripts to get the raw counts.
R_scripts_WGCNA: R scripts to build the genetic networks with WGCNA.
grep_utilities: built-in grep commands to extract data from files (e.g., alignment stats).
meta-data: mostly files that link external data to the genetic networks.
notebooks: Python 3+ scripts to build the expression matrices.
results-data: statistical results, network results, intermediate results, and anything else that can be considered the product of a process.

Supplementary material

The WGCNA co-expression networks were built with the help of several public online resources. Here are some links that will hopefully help you with your own project.
Please be aware that you will need to adjust the code to your own needs:
https://horvath.genetics.ucla.edu/html/CoexpressionNetwork/Rpackages/WGCNA/
http://pklab.med.harvard.edu/scw2014/WGCNA.html
https://wikis.utexas.edu/display/bioiteam/Clustering+using+WGCNA
https://www.polarmicrobes.org/weighted-gene-correlation-network-analysis-wgcna-applied-to-microbial-communities/

If you just want to glance at the outputs of the main scripts of this project, follow the links below.

Script to build the expression matrices

2_Matrix_A_integrate_raw_counts
https://nbviewer.jupyter.org/github/cyntsc/meta-analisis-athal/blob/master/notebooks/2_Matrix_A_integrate_raw_counts.ipynb

3_Matrix_B_TPM_normalization
https://nbviewer.jupyter.org/github/cyntsc/meta-analisis-athal/blob/master/notebooks/3_Matrix_B_TPM_normalization.ipynb

4_Matrix_C_TPM_standardization
https://nbviewer.jupyter.org/github/cyntsc/meta-analisis-athal/blob/master/notebooks/4_Matrix_C_TPM_standardization.ipynb

5_Matrix_D_E_Log2_Atypicals
https://nbviewer.jupyter.org/github/cyntsc/meta-analisis-athal/blob/master/notebooks/5_Matrix_D_E_Log2_Atypicals.ipynb
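As a rough guide to what these notebooks do, the sketch below walks through three of the steps on toy data: merging per-sample raw counts into one matrix, TPM normalization, and a log2 transform. It is not the notebooks' code. The sample names, counts, gene lengths and the pseudocount of 1 are assumptions made for the example. The standardization and atypical-value steps are not shown because this README does not define them.

```python
import math


def integrate_counts(per_sample_counts):
    """Merge {sample: {gene: count}} into (samples, {gene: [count per sample]}).

    Genes missing from a sample are given a count of 0.
    """
    samples = sorted(per_sample_counts)
    genes = sorted({g for counts in per_sample_counts.values() for g in counts})
    matrix = {g: [per_sample_counts[s].get(g, 0) for s in samples] for g in genes}
    return samples, matrix


def tpm(matrix, lengths_bp):
    """Transcripts per million: length-normalize counts, then scale each sample to 1e6."""
    n_samples = len(next(iter(matrix.values())))
    # Reads per kilobase of gene length.
    rpk = {g: [c / (lengths_bp[g] / 1000) for c in row] for g, row in matrix.items()}
    totals = [sum(row[j] for row in rpk.values()) for j in range(n_samples)]
    return {
        g: [row[j] / totals[j] * 1e6 if totals[j] else 0.0 for j in range(n_samples)]
        for g, row in rpk.items()
    }


def log2_transform(matrix, pseudocount=1.0):
    """log2(value + pseudocount) so that zeros stay finite."""
    return {g: [math.log2(v + pseudocount) for v in row] for g, row in matrix.items()}


# Toy input (hypothetical sample names, counts and gene lengths).
raw = {
    "Bc12_rep1": {"AT1G01010": 100, "AT1G01020": 300},
    "healthy12_rep1": {"AT1G01010": 50},
}
gene_lengths_bp = {"AT1G01010": 1000, "AT1G01020": 3000}

samples, counts = integrate_counts(raw)
tpm_matrix = tpm(counts, gene_lengths_bp)
log2_matrix = log2_transform(tpm_matrix)
print(samples)
print(tpm_matrix)
```

Each sample's TPM values sum to one million, which is what makes expression levels comparable across libraries of different sequencing depth.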

Some interactive HTML files about the expression matrices can be downloaded from the results-data/matrices_de_expresion/ folder (just download the file to your local machine and open it):
Interactive stats for ARABIDOPSIS HEALTHY MatrixD.html
Interactive stats for ARABIDOPSIS INFECTED MatrixE.html
Interactive stats for comparison in ARABIDOPSIS INFECTED MatrixD and E.html
More stats...

Script to get the genetic modules in the expression matrices

WGCNA scripts: WGCNA scripts for Signed-Ntw (pearson)

WGCNA results: genetic modules obtained with WGCNA Signed-Ntw (pearson)
Preferred genetic modules results are in folders:
Athal_healthy_mods_merged_MatrixD for A thaliana Healthy
Athal_infected_mods_merged_MatrixE for A thaliana Infected

Be happy and Enjoy!
Cynthia SC
