MetaPhlAn 4.2.2 with Jan25 db and generation of the rarefied taxonomic profile #30

@fasnicar

The command below is an example of profiling with the new MetaPhlAn 4.2.2 on a subsample of 25M paired-end reads.
Note that some parameter names changed in this version, and unclassified estimation is now performed by default (it can be turned off to generate a profile without the unclassified fraction). We should therefore update the NF pipeline accordingly.

For the subsampling, we need to map the reads again so that MetaPhlAn can generate the subsampled FASTQ files.
We can also use these FASTQ files to produce HUMAnN profiles of the same subsample.
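As a rough sketch of how the subsampled reads could feed HUMAnN: HUMAnN takes a single input file, so the two mates are usually concatenated first. The function names `merge_pairs` and `run_humann_subsample` are hypothetical helpers, not part of the pipeline, and the sketch assumes uncompressed FASTQ input and a `humann` executable on the `PATH`. Whether HUMAnN works with the Jan25 database has not been checked here.

```shell
#!/usr/bin/env bash

# Concatenate the two subsampled mates into one FASTQ file for HUMAnN.
# Assumes both inputs are plain (uncompressed) FASTQ.
merge_pairs() {
    local r1="$1" r2="$2" out="$3"
    cat -- "$r1" "$r2" > "$out"
}

# Run HUMAnN on the merged reads (hypothetical wrapper; only defined here,
# not called). --input, --output and --threads are standard HUMAnN options.
run_humann_subsample() {
    local merged="$1" outdir="$2" threads="${3:-4}"
    humann --input "$merged" --output "$outdir" --threads "$threads"
}
```

For example, `merge_pairs "${path_sub}/${s}_R1.fastq" "${path_sub}/${s}_R2.fastq" "${path_sub}/${s}_merged.fastq"` would have to run before the subsampled files are compressed.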

# Note on --samout: CHECK THE LONG VERSION
metaphlan -1 "${fwd}" -2 "${rev}" \
    --nproc 4 \
    --input_type fastq \
    --db_dir "${mdb_path}" \
    --index "${mdb_version}" \
    --force \
    --subsampling_paired 25000000 \
    --subsampling_seed 42 \
    --subsampling_output "${path_sub}/${s}" \
    --mapout "${ps}/${s}.bowtie2.bz2" \
    --samout "${ps}/${s}.sam.bz2" \
    --output_file "${ps}/${s}_profile.tsv"

# Rename and compress subsampled reads
mv "${path_sub}/${s}.R1" "${path_sub}/${s}_R1.fastq"
mv "${path_sub}/${s}.R2" "${path_sub}/${s}_R2.fastq"
bzip2 "${path_sub}/${s}"_R*.fastq
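A quick sanity check after this step could confirm that both subsampled mates contain the same number of reads. This is a minimal sketch, not part of the pipeline: `count_fastq_reads` and `check_pair_counts` are hypothetical helper names, and it assumes standard four-line FASTQ records and that `bzcat` is available for the compressed files.

```shell
#!/usr/bin/env bash

# Count reads in a FASTQ file (plain or .bz2), assuming 4 lines per record.
count_fastq_reads() {
    local fq="$1" lines
    case "$fq" in
        *.bz2) lines=$(bzcat -- "$fq" | wc -l) ;;
        *)     lines=$(wc -l < "$fq") ;;
    esac
    echo $(( lines / 4 ))
}

# Print both counts and succeed only if the two mates hold the same number of reads.
check_pair_counts() {
    local n1 n2
    n1=$(count_fastq_reads "$1")
    n2=$(count_fastq_reads "$2")
    echo "R1=${n1} R2=${n2}"
    [ "$n1" -eq "$n2" ]
}
```

It could be called as `check_pair_counts "${path_sub}/${s}_R1.fastq.bz2" "${path_sub}/${s}_R2.fastq.bz2"`, with a non-zero exit status flagging a mismatch.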

Reference: https://github.com/biobakery/MetaPhlAn/releases/tag/4.2.2
