Data will be placed in the data folder via symbolic links.
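A minimal sketch of the linking step, assuming the raw reads are gzipped FASTQ files in one source directory. The function name `link_reads`, the `*.fastq.gz` pattern, and both directory arguments are hypothetical and not taken from the actual setup.

```shell
#!/usr/bin/env bash
# Link every FASTQ file from a source directory into a local data folder.
# Assumption: reads end in .fastq.gz; both directories are placeholders.
link_reads() {
    local src_dir dest_dir="$2" f
    src_dir="$(cd "$1" && pwd)" || return 1   # absolute path, so links resolve from anywhere
    mkdir -p "$dest_dir"
    for f in "$src_dir"/*.fastq.gz; do
        [ -e "$f" ] || continue               # skip when the glob matches nothing
        ln -sf "$f" "$dest_dir/$(basename "$f")"
    done
}
```

Example call: `link_reads /path/to/raw_reads data`. Absolute link targets avoid broken links when a job runs from a different working directory.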
(*When running FastQC as an sbatch job, it returns an error at some point and stops at the echo "unzipping" step.)
For RNA reads, poor results in "Per base sequence content" and "Per sequence GC content" could come from over-represented sequences in the transcriptome or from contamination. This may be acceptable as long as the evaluation is still only a "warning".
Evaluating the QC report: bad quality, so switch to using the genome from the authors.
(https://www.ncbi.nlm.nih.gov/Traces/wgs/?val=NSDW01#contigs *scaffold 11)
TopHat in a loop may not be getting the file paths correctly, so I created a specific batch file for each job.
Solved by running TopHat manually in the terminal for each pair of RNA reads.
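One possible way to avoid the path problem without typing each command by hand: print one TopHat command per read pair with absolute paths, check them, then run them. This is only a sketch. The function name `tophat_commands`, the `_1.fastq.gz` / `_2.fastq.gz` naming, the index name, and the thread count are assumptions; `-p` (threads) and `-o` (output directory) are standard TopHat options.

```shell
#!/usr/bin/env bash
# Print one tophat command per read pair, using absolute read paths.
# Assumed naming (hypothetical): <sample>_1.fastq.gz and <sample>_2.fastq.gz.
tophat_commands() {
    local reads_dir index="$2" out_root="$3" r1 r2 sample
    reads_dir="$(cd "$1" && pwd)" || return 1   # resolve to an absolute path
    for r1 in "$reads_dir"/*_1.fastq.gz; do
        [ -e "$r1" ] || continue                # no files matched the pattern
        r2="${r1%_1.fastq.gz}_2.fastq.gz"
        if [ ! -e "$r2" ]; then
            echo "missing mate for $r1" >&2     # report and skip unpaired files
            continue
        fi
        sample="$(basename "${r1%_1.fastq.gz}")"
        # Dry run: the command is printed, not executed.
        printf 'tophat -p 4 -o %s/%s %s %s %s\n' \
            "$out_root" "$sample" "$index" "$r1" "$r2"
    done
}
```

Example call: `tophat_commands data genome_index tophat_out > tophat_jobs.sh`. The printed lines can be inspected first, then run one by one or wrapped in separate batch files. Paths containing spaces would need extra quoting.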
For Trinity, use the BAM files from TopHat directly (merge a selection of the BAM files).
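A sketch of how the merge command for a selection of samples could be assembled, assuming each TopHat run wrote to its own directory under a common output root. `accepted_hits.bam` is TopHat's default alignment file name; the function name `merge_command`, the output root, and the sample names are hypothetical.

```shell
#!/usr/bin/env bash
# Print a samtools merge command for the accepted_hits.bam of selected samples.
# Assumption: TopHat output lives in <out_root>/<sample>/accepted_hits.bam.
merge_command() {
    local out_bam="$1" out_root="$2" cmd sample
    shift 2                                   # remaining arguments are sample names
    cmd="samtools merge $out_bam"
    for sample in "$@"; do
        cmd="$cmd $out_root/$sample/accepted_hits.bam"
    done
    printf '%s\n' "$cmd"                      # dry run: print only, do not execute
}
```

Example call: `merge_command merged.bam tophat_out sampleA sampleB` prints the command for the two chosen samples, so the selection can be checked before running it.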
Download related protein sequences in FASTA format.
First step of the MAKER pipeline
QC for the Trinity assembly?
May need to increase the number of cores to speed up the MAKER2 run.
To set the number of cores, it must also be specified when running maker:
maker -c 4 -q
Run maker step 7 with Arabidopsis directly from step 2.
Documentation
Run HTSeq:
eggNOG-mapper
Finish DESeq, IGV, and biological interpretation.