🧬NGS WES/TRS Analysis

This repository provides a comprehensive Next-Generation Sequencing (NGS) workflow designed for Whole Exome Sequencing (WES) and Targeted Region Sequencing (TRS) analysis — from raw FASTQ files to annotated variant calls.

📖 Overview

The project explains and demonstrates:

Differences between WGS, WES, and TRS
Metadata identification in NCBI SRA
Quality Control and read pre-processing
Alignment, variant calling, and annotation steps
Handling target region (BED) files
Clinical interpretation of VCF outputs

🧩 Tools Used

Step	Tool	Purpose
Raw QC	FastQC	Assess sequencing quality
Trimming	Trim Galore	Remove adapters & low-quality bases
Alignment	BWA	Map reads to reference genome
BAM Processing	SAMtools	Sort, index, and manage alignments
Variant Calling	GATK	Detect SNPs and indels
Annotation	SnpEff	Predict functional and clinical impact

🧠 Workflow Summary

Raw Data QC → FastQC
Trimming & Filtering → Trim Galore
Alignment → BWA
Post-processing → SAMtools (sort, index, mark duplicates)
Variant Calling → GATK (HaplotypeCaller)
Annotation → SnpEff
Interpretation → Prioritize variants within target regions (using BED files)

💡 Key Concepts

WGS vs WES vs TRS – Understanding scope, depth, and cost
Target BED Files – Define genomic intervals for exome/panel analysis
Coverage Analysis – Assess sequencing depth and breadth
Clinical Variant Reporting – Use ClinVar, OMIM, COSMIC references

🧰 Skills Highlighted

NGS Data Pre-processing
Variant Detection and Annotation
Workflow Optimization for Exome/Panel Data
Familiarity with GATK Best Practices
Genomic Data Interpretation

📑 Project Report

1️⃣ Introduction

Next-Generation Sequencing (NGS) has transformed genomics by enabling fast, cost-effective sequencing of DNA and RNA. Among NGS approaches, Whole Exome Sequencing (WES) and Targeted Region Sequencing (TRS) are powerful methods for identifying variants within coding or specific genomic regions.
This project focuses on the bioinformatics workflow for WES/TRS, outlining key computational steps and tools used for transforming raw sequencing reads into biologically meaningful variant annotations.

2️⃣ Objectives

To perform quality assessment and preprocessing of NGS reads.
To align reads to a reference genome and generate BAM files.
To identify variants (SNPs and indels) using GATK and SAMtools.
To annotate variants using SnpEff and interpret their biological significance.
To compare workflows and outputs for WGS, WES, and TRS data.

3️⃣ Materials and Methods

Tools and Software Used

Step	Tool	Description
Quality Check	FastQC	Evaluate raw sequence quality
Trimming	Trim Galore	Remove low-quality bases and adapters
Alignment	BWA	Map reads to reference genome
Post-Processing	SAMtools	Sort, index, and manage BAM files
Variant Calling	GATK HaplotypeCaller	Identify SNPs and indels
Annotation	SnpEff	Predict variant effects on genes/proteins

Workflow Overview

Input Data: Raw paired-end FASTQ files
Quality Control: Evaluate read quality using FastQC
Read Trimming: Remove adapters and low-quality sequences
Alignment: Map reads to reference genome using BWA
Sorting & Indexing: Organize and index BAM files
Variant Calling: Detect genomic variants using GATK
Annotation: Use SnpEff to classify and interpret variant impacts

4️⃣ Results and Discussion

High-quality reads were confirmed using FastQC before and after trimming.
Alignment achieved high mapping efficiency with minimal read loss.
Variant calling identified SNPs and small indels within targeted regions.
SnpEff annotation provided functional classification of variants (e.g., synonymous, missense, nonsense).
The workflow effectively demonstrates how bioinformatics pipelines translate raw NGS data into meaningful genomic insights.

5️⃣ Conclusion

This project successfully demonstrates a comprehensive NGS workflow for WES and TRS analysis.
The integration of multiple open-source tools ensures reliable variant discovery, annotation, and biological interpretation.
Such workflows are essential for genomic research, clinical diagnostics, and precision medicine.

📄 Project Report

You can view the complete project report here:https://github.com/DEBOPRIYA2320/NGS-WES-TRS-Analysis/blob/ac8d517bd6ba9b9dabdf7e66be21c139fe3b2c38/NGS%20-%20Genomics%20data%20analysis%20(Debopriya).pdf

📫 Contact

Author: Debopriya
📧 [email protected]
🔗 GitHub
🔗 LinkedIn

🧾 License

This project is licensed under the MIT License.

⭐ If you found this project helpful, please give it a star on GitHub!

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
LICENSE		LICENSE
NGS - Genomics data analysis (Debopriya).pdf		NGS - Genomics data analysis (Debopriya).pdf
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🧬NGS WES/TRS Analysis

📖 Overview

🧩 Tools Used

🧠 Workflow Summary

💡 Key Concepts

🧰 Skills Highlighted

📑 Project Report

1️⃣ Introduction

2️⃣ Objectives

3️⃣ Materials and Methods

Tools and Software Used

Workflow Overview

4️⃣ Results and Discussion

5️⃣ Conclusion

📄 Project Report

📫 Contact

🧾 License

About

Uh oh!

Releases

Packages

License

DEBOPRIYA2320/NGS-WES-TRS-Analysis

Folders and files

Latest commit

History

Repository files navigation

🧬NGS WES/TRS Analysis

📖 Overview

🧩 Tools Used

🧠 Workflow Summary

💡 Key Concepts

🧰 Skills Highlighted

📑 Project Report

1️⃣ Introduction

2️⃣ Objectives

3️⃣ Materials and Methods

Tools and Software Used

Workflow Overview

4️⃣ Results and Discussion

5️⃣ Conclusion

📄 Project Report

📫 Contact

🧾 License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Packages