Questions about multi-batch long reads processing, VCF FILTER field interpretation and 1000 Genomes BAM usage in TRGT #93

@CaryStar01

Dear TRGT Development Team,
We have three questions about genotyping with long-read data:

  1. For long-read data from a single sample in projects such as the 1000 Genomes Project (see the attached figure), does the data typically come from multiple sequencing batches? When genotyping multi-batch FASTQ data with TRGT, what is the standard workflow? Should we merge all batches into a single FASTQ file, align, and then genotype, or is it acceptable to genotype using data from only one batch?
Image
  2. Regarding the FILTER field in the VCF file generated by TRGT, the header contains ##FILTER=<ID=PASS,Description="All filters passed">. Does PASS indicate that a genotyping result has passed all quality filters and can be used directly in downstream analyses? In the VCF files we generated, however, the FILTER value for every genotyped variant is . instead of PASS. Does this mean that these genotypes are of substandard quality and the results are unreliable?
Image
  3. In published studies that analyze 1000 Genomes Project data, we noticed that researchers generally use the BAM alignment files provided on the project's official website for short-read data, whereas for long-read data many researchers discard the official files and realign the reads to generate new BAM files. If your team has any insight into the reasons for this, could you please share it?
Image
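
To make the second question concrete, the FILTER column of a VCF can be tallied with a short script. This is only a minimal sketch and not part of TRGT: the function name `count_filter_values` and the example records (positions, alleles, and `TRID` values) are hypothetical placeholders, and the script assumes tab-separated VCF text with FILTER as the seventh column, as in the VCF format.

```python
from collections import Counter


def count_filter_values(lines):
    """Count the values found in the FILTER column (7th column) of VCF records.

    `lines` is any iterable of VCF text lines; header lines are skipped.
    """
    counts = Counter()
    for line in lines:
        # Skip meta-information (##...) and the column header (#CHROM...).
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # malformed record: no FILTER column present
        counts[fields[6]] += 1
    return counts


# Hypothetical records for illustration only; "." is the VCF missing value.
example = [
    "##fileformat=VCFv4.2",
    '##FILTER=<ID=PASS,Description="All filters passed">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tCAG\tCAGCAG\t.\t.\tTRID=example1",
    "chr2\t200\t.\tGCC\tGCCGCC\t.\t.\tTRID=example2",
]
print(count_filter_values(example))
```

For a real file, the same function can be given an open file handle (for example one returned by `gzip.open(path, "rt")` for a compressed VCF), since it only iterates over lines.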

Thank you for your time and help!
