RPGC normalization (--normalizeUsing RPGC) is not supported with bamCompare! #1404

@ps-puneetsharma

Dear Devon,

Thank you for developing and maintaining this wonderful tool.

A minor issue: when I run bamCompare from deepTools 3.5.5 with the following command:

bamCompare \
    --numberOfProcessors "$THREADS" \
    --blackListFileName "$gen_in"/hg38_blacklist_v2.bed \
    --effectiveGenomeSize 2913022398 \
    --operation log2 \
    --binSize 1 \
    --normalizeUsing RPGC \
    --extendReads 150 \
    --outFileFormat bigwig \
    --bamfile1 "$filename" \
    --bamfile2 "$dir_in"/${base}_chip_input_rep${rep}_macs3_nomt_unique_genome_sorted.bam \
    --outFileName "$dir_in"/${base}_chip_h3k27me3_rep${rep}_log2ratio.bigwig \
    2> "$dir_in"/${base}_chip_h3k27me3_rep${rep}_log2ratio_log.txt

I get the error:

RPGC normalization (--normalizeUsing RPGC) is not supported with bamCompare!

The documentation for version 3.5.6 mentions:


Optionally scaling can be turned off and individual samples normalized using the RPKM, BPM or CPM methods (or no normalization at all)

But it also lists RPGC as a possible choice for --normalizeUsing:


--normalizeUsing

    Possible choices: RPKM, CPM, BPM, RPGC, None

    Use one of the entered methods to normalize the number of reads per bin. By default, no normalization is performed. RPKM = Reads Per Kilobase per Million mapped reads; CPM = Counts Per Million mapped reads, same as CPM in RNA-seq; BPM = Bins Per Million mapped reads, same as TPM in RNA-seq; RPGC = reads per genomic content (1x normalization); Mapped reads are considered after blacklist filtering (if applied). RPKM (per bin) = number of reads per bin / (number of mapped reads (in millions) * bin length (kb)). CPM (per bin) = number of reads per bin / number of mapped reads (in millions). BPM (per bin) = number of reads per bin / sum of all reads per bin (in millions). RPGC (per bin) = number of reads per bin / scaling factor for 1x average coverage. None = the default and equivalent to not setting this option at all. This scaling factor, in turn, is determined from the sequencing depth: (total number of mapped reads * fragment length) / effective genome size. The scaling factor used is the inverse of the sequencing depth computed for the sample to match the 1x coverage. This option requires --effectiveGenomeSize. Each read is considered independently, if you want to only count one mate from a pair in paired-end data, then use the --samFlagInclude/--samFlagExclude options. (Default: None)
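For reference, this is how I read the RPGC description quoted above: the sequencing depth is (mapped reads * fragment length) / effective genome size, and the factor applied to reach 1x coverage is its inverse. A minimal sketch of that arithmetic follows; rpgc_scale_factor is a hypothetical helper of mine, not part of deepTools, and the read count and genome size are made-up round numbers.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a deepTools command): RPGC scale factor as the
# inverse of the sequencing depth described in the quoted documentation.
#   $1 = number of mapped reads
#   $2 = fragment length in bp
#   $3 = effective genome size in bp
rpgc_scale_factor() {
    awk -v n="$1" -v l="$2" -v g="$3" 'BEGIN {
        depth = (n * l) / g          # average coverage per base
        printf "%.4f\n", 1 / depth   # factor that brings coverage to 1x
    }'
}

# Made-up example: 20 million reads of 150 bp on a 3 Gb effective genome
# already give 1x coverage, so the factor is 1.
rpgc_scale_factor 20000000 150 3000000000   # prints 1.0000
```

With half as many reads the depth is 0.5x, so the same call returns 2.0000.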


The latter part should be updated to state that RPGC normalization is not available in bamCompare.
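In the meantime, a possible workaround might be to normalize each BAM file separately with bamCoverage (which accepts --normalizeUsing RPGC) and then compute the log2 ratio with bigwigCompare. The sketch below is untested against real data; rpgc_log2ratio and the DRY_RUN switch are hypothetical names of mine, and the result is not expected to be identical to what bamCompare would produce, since bigwigCompare works on already-binned tracks and applies its own --pseudocount.

```shell
#!/usr/bin/env bash
# Hypothetical helper: RPGC-normalize two BAM files with bamCoverage, then
# take the log2 ratio of the two tracks with bigwigCompare.
#   $1 = ChIP BAM, $2 = input BAM, $3 = output bigWig, $4 = effective genome size
# Set DRY_RUN=1 to print the commands instead of running them.
rpgc_log2ratio() {
    local chip_bam=$1 input_bam=$2 out_bw=$3 genome_size=$4
    local run=${DRY_RUN:+echo}   # expands to "echo" only when DRY_RUN is set
    local tmp status
    tmp=$(mktemp -d) || return 1

    # Per-sample 1x (RPGC) coverage tracks, same bin size and read extension
    # as in the bamCompare call above.
    $run bamCoverage --bam "$chip_bam" --outFileName "$tmp/chip.bw" \
        --outFileFormat bigwig --normalizeUsing RPGC \
        --effectiveGenomeSize "$genome_size" --extendReads 150 --binSize 1 &&
    $run bamCoverage --bam "$input_bam" --outFileName "$tmp/input.bw" \
        --outFileFormat bigwig --normalizeUsing RPGC \
        --effectiveGenomeSize "$genome_size" --extendReads 150 --binSize 1 &&
    # log2(ChIP / input) on the two normalized tracks.
    $run bigwigCompare --bigwig1 "$tmp/chip.bw" --bigwig2 "$tmp/input.bw" \
        --operation log2 --binSize 1 \
        --outFileName "$out_bw" --outFileFormat bigwig
    status=$?

    rm -rf "$tmp"   # remove the intermediate bigWig files
    return "$status"
}

# Example call (file names are placeholders):
#   rpgc_log2ratio chip.bam input.bam log2ratio.bigwig 2913022398
```

The --blackListFileName and --numberOfProcessors options from the original command are left out for brevity and could be added to each call.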

Best regards,
Puneet
