Improve README

alxsimon · alxsimon · commit de6c4719a0c2 · 2022-09-16T19:45:03.000+02:00
diff --git a/README.md b/README.md
@@ -1,56 +1,35 @@
 # Assembly pipeline for Mytilus genomes
 
+Assembly pipeline for 10x Chromium reads, as used in the preprint
+"Three new genome assemblies of blue mussel lineages: North and South European Mytilus edulis and Mediterranean Mytilus galloprovincialis" bioRxiv ([https://doi.org/10.1101/2022.09.02.506387](https://doi.org/10.1101/2022.09.02.506387)).
+
 [`snakemake`](https://snakemake.readthedocs.io/en/stable/) (in a conda environment, for example) and 
 [`singularity`](https://github.com/hpcng/singularity) need to be installed.
 
+## Supernova storage workarounds
+
+Supernova uses a large amount of storage for temporary and final results.
+
 The supernova results are stored on a remote NAS that must first be mounted on my system.
 ```
 sshfs nas4:/share/sea/sea/projects/ref_genomes/assembly_10x/results/supernova_assemblies \
 results/supernova_assemblies \
 -o idmap=user,compression=no,uid=1000,gid=1000,allow_root
 ```
 
-I also use a 4T disk as a temporary local storage for supernova computation
+I also used a 4 TB disk as temporary local storage for supernova computation:
 `sudo mount /dev/sd[x]1 /data/ref_genomes/assembly_10x/tmp`
 
-To run use:
-```
-conda activate snake_env
-
-snakemake --use-conda --conda-frontend mamba --conda-prefix .conda \
---use-singularity --singularity-args "-B /nas_sea:/nas_sea" \
--j {threads}
-```
-
-Final versions are *_v6.pseudohap.fasta.gz and they correspond to:
-- mgal_01
-- medu_01
-- mtro_01
 
-Another version of mtro is done, tros_v7, also called mtro_02 which is improved by LRScaf with nanopore reads, scaffolding on the *Mytilus coruscus* reference genome and Pilon corrections.
+## How to run
 
+To run use:
 ```
 conda activate snake_env
 
-snakemake --use-conda --conda-frontend mamba --conda-prefix .conda \
+snakemake --use-conda \
 --use-singularity --singularity-args "-B /nas_sea:/nas_sea" \
--j {threads} mtro_improvement
-```
-
-## Calling for pop check
-
-This part uses another dataset of reference individuals called with angsd.
-For comparison we also call with angsd (especially ANGSD puts major allele as REF in bcf and is therefore incompatible with bcftools call).
-```
-ln -s /data2/myt_popgen/angsd_calling/results/post_analysis/subset.sites resources/angsd_subset.sites
-ln -s /data2/myt_popgen/angsd_calling/results/post_analysis/subset.beagle.gz resources/angsd_ref_subset.beagle.gz
+-j {threads} \
+[either all_v6, asm_improvement, stats, repeats, annotation, finalize or ncbi_submission (see workflow/Snakefile)]
 ```
 
-## Annotation tools to build beforehand
-
-```
-sudo singularity build -F resources/cactus_v1.3.0-gpu.sif \
-docker://quay.io/comparative-genomics-toolkit/cactus:v1.3.0-gpu
-
-sudo singularity build resources/cat.sif docker://quay.io/ucsc_cgl/cat
-```