
No parallelization with Snakemake #56

@Fadwa7

Description


Hi,
I am working on a pipeline that downloads SRR files, and to speed up the downloads I opted for your tool. However, I am encountering an issue. I have a list of 143 files, and I run this rule:

```
rule fetch_fastq:
    output:
        config["RESULTS"] + "Fastq_Files/{sra}.fastq.gz"
    log:
        config["RESULTS"] + "Supplementary_Data/Logs/{sra}.sratoolkit.log"
    benchmark:
        config["RESULTS"] + "Supplementary_Data/Benchmark/{sra}.sratoolkit.txt"
    message:
        "fetch fastq from NCBI"
    params:
        conda = "sratoolkit",
        outdir = config["RESULTS"] + "Fastq_Files"
    threads: 16
    shell:
        """
        set +eu &&
        . $(conda info --base)/etc/profile.d/conda.sh &&
        conda activate {params.conda}
        parallel-fastq-dump \
            --outdir {params.outdir} \
            --gzip \
            --sra-id {wildcards.sra} \
            --threads {threads}
        """
```

Then I launch Snakemake with `snakemake -s snakefile --cores 4`. It processes the files in batches of 4, but it finishes executing the first rule for all files before moving on to the second rule. What I want instead is for it to execute all rules in the Snakefile on the first 4 files, then move to the next 4 files, and so on.

Do you have any solutions? Thank you in advance.
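One way to bias the scheduler toward this behavior (a sketch, not from this thread; the `align` rule and its paths are hypothetical placeholders for whatever downstream step the pipeline has) is Snakemake's `priority` directive. When several jobs are ready, the scheduler prefers higher-priority ones, so downstream steps for already-downloaded samples tend to run before new downloads are started:

```
# Hypothetical downstream rule. Giving it a priority higher than
# fetch_fastq's default of 0 makes Snakemake schedule it ahead of
# pending downloads whenever its input fastq already exists.
rule align:
    input:
        config["RESULTS"] + "Fastq_Files/{sra}.fastq.gz"
    output:
        config["RESULTS"] + "Aligned/{sra}.bam"
    priority: 50
    shell:
        "..."  # placeholder for the actual alignment command
```

Note that priorities only decide which ready jobs run first when cores are contended; with `--cores 4`, at most 4 jobs still run at a time.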
