Hello! I'm a computational biologist passionate about understanding how genomic complexity is translated into functional biology. My primary focus is on RNA biology, particularly alternative splicing, and I am driven to build the computational tools necessary to explore this complex regulatory landscape.
- 🔭 Currently: Undertaking a research internship as a Sanger Prize recipient at the Wellcome Sanger Institute, investigating trans-splicing events.
- 💬 Ask me about: RNA-Seq analysis, building bioinformatics pipelines, or comparative genomics.
- 📫 How to reach me: LinkedIn | ORCID
Pipeline & Workflow Management:
Bioinformatics Tools:
- Transcriptomics: STAR, Trinity, MAJIQ, SGSeq
- Genomics: Minimap2, BLAST
- Environment: HPC, Conda
Here are a couple of projects I'm particularly proud of:
A reproducible Nextflow pipeline I developed during an internship at the Wellcome Sanger Institute as the Sanger Prize winner (🏆🏆🏆). It searches for Spliced Leader (SL) sequences: short sequences that originate from a different gene and are attached to an mRNA molecule during trans-splicing. More generally, it finds overrepresented sequences at the beginning or end of full-length transcript sequences.
A reproducible Nextflow pipeline and SQL database system I developed for my undergraduate thesis. It discovers and annotates splicing events from public RNA-Seq data and links every event to its biological metadata, turning disconnected raw data into a queryable resource.
Tags: Nextflow, Python, SQL, Bioinformatics, RNA-Seq, Splicing
A Python tool to create codon-level sequence logos, offering a more nuanced view of conservation in protein-coding sequences compared to traditional amino acid or nucleotide logos. This work was peer-reviewed and published in MethodsX.
Tags: Python, Bioinformatics, Data Visualization, Sequence Analysis, Publication
