request for Biostar398854 - Not called SNPs in lowercase letter in output FASTA #246

@Zwiep2023

Subject of the issue

Not called SNPs in lowercase letter in output FASTA

Your environment

  • version of jvarkit: dbdbed3
  • version of java: openjdk version "17.0.10" 2024-01-16
  • the value of ${JAVA_HOME}
  • Ubuntu 22.04.03 LTS

Steps to reproduce

Would it be possible to add an option so that the output FASTA marks SNPs that were not called for a given sample in the VCF (e.g. as reported by VcfToTable)? Such positions would be written in lowercase, and capital letters would be used only for SNPs that have been called, i.e. that have read coverage. The example below illustrates the request.

Simplified VcfToTable output for a certain SNP, for 13 samples:

REF A
ALT T

Sample      Type      AD
Sample_1    NO_CALL   0,0
Sample_2    NO_CALL   0,0
Sample_3    NO_CALL   0,0
Sample_4    HOM_REF   1,0
Sample_5    HOM_REF   2,0
Sample_6    NO_CALL   0,0
Sample_7    HOM_REF   2,0
Sample_8    HOM_REF   3,0
Sample_9    HOM_REF   3,0
Sample_10   HOM_VAR   0,2
Sample_11   HOM_REF   2,0
Sample_12   NO_CALL   0,0
Sample_13   NO_CALL   0,0

Nucleotide representation in the Biostar398854 output FASTA for the same SNP, for these 13 samples:

Sample      Current output   Requested output
Sample_1    a                a
Sample_2    a                a
Sample_3    a                a
Sample_4    a                A
Sample_5    a                A
Sample_6    a                a
Sample_7    a                A
Sample_8    a                A
Sample_9    a                A
Sample_10   T                T
Sample_11   a                A
Sample_12   a                a
Sample_13   a                a
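
The requested casing rule can be sketched as follows. This is only an illustration in Python, not jvarkit code: the function name `requested_base` and the data layout are hypothetical, and only the three genotype types that appear in the example above are handled (how heterozygous calls should be written is not covered by this request).

```python
# Illustrative sketch of the requested casing rule (not jvarkit code).
# Assumption: genotype types are the strings shown in the VcfToTable example.

def requested_base(ref: str, alt: str, genotype_type: str) -> str:
    """Return the letter to write in the FASTA for one sample at one SNP."""
    if genotype_type == "NO_CALL":
        return ref.lower()   # not called: reference base in lowercase
    if genotype_type == "HOM_REF":
        return ref.upper()   # called as reference: uppercase
    if genotype_type == "HOM_VAR":
        return alt.upper()   # called as variant: uppercase ALT
    # Other types (e.g. heterozygous) are outside the scope of this example.
    raise ValueError(f"genotype type not covered by this sketch: {genotype_type}")


# Genotype types of Sample_1 .. Sample_13 from the table above (REF=A, ALT=T).
genotype_types = [
    "NO_CALL", "NO_CALL", "NO_CALL", "HOM_REF", "HOM_REF", "NO_CALL",
    "HOM_REF", "HOM_REF", "HOM_REF", "HOM_VAR", "HOM_REF", "NO_CALL",
    "NO_CALL",
]

requested_column = "".join(requested_base("A", "T", t) for t in genotype_types)
print(requested_column)
```

Read top to bottom, the printed string corresponds to the "requested output" column for Sample_1 through Sample_13.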
