Computational biologist focused on cancer transcriptomics, tumor microenvironment biology, and biomarker discovery using bulk and single-cell RNA-seq data.
Research interests include tumor immune evasion, survival modeling, and reproducible genomics workflows that yield clinically relevant insights.
Genomic data → RNA-seq processing → Differential expression → Functional pathway analysis → Survival modeling → Prognostic biomarker discovery → Clinical interpretation
https://github.com/ag48665/lusc-transcriptomic-prognostic-signature
Objective: Develop and validate a robust survival prediction model for patients with lung squamous cell carcinoma (LUSC).
Data: TCGA (training), GEO (external validation: GSE30219, GSE37745)
Methods:
- Cox proportional hazards modeling
- Elastic-net feature selection
- Kaplan–Meier survival analysis
- Time-dependent ROC analysis
- Multivariable Cox regression
- Calibration and decision curve analysis
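To illustrate the Kaplan–Meier step listed above, here is a minimal pure-Python sketch of the product-limit estimator. It is not the repository's code; the function name `kaplan_meier` and the toy follow-up data are assumptions made up for illustration, and a real analysis would more likely rely on an established survival package.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time per patient
    events -- 1 if the event (death) was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit update: multiply by the fraction surviving this time point
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Toy data (hypothetical, not from TCGA or GEO)
times = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t:>2}  S(t)={s:.3f}")
```

Censored patients reduce the number at risk without lowering the curve, which is why the step at each event time depends only on who is still under observation.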
https://github.com/ag48665/tcga-lusc-biomarker-analysis
Objective: Identify survival-associated gene expression programs in LUSC.
Methods:
- TCGA data acquisition (TCGAbiolinks)
- Differential expression analysis (DESeq2)
- Functional enrichment (GO / KEGG)
- Survival analysis
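Functional enrichment of a gene list against GO or KEGG terms is commonly framed as an over-representation test. The sketch below shows that idea with a one-sided hypergeometric test in plain Python; the function name `enrichment_pvalue` and the toy numbers are assumptions for illustration and do not reflect the repository's actual implementation or results.

```python
from math import comb


def enrichment_pvalue(n_universe, n_pathway, n_selected, n_overlap):
    """One-sided hypergeometric p-value: P(overlap >= n_overlap).

    n_universe -- number of genes tested overall
    n_pathway  -- genes in the universe annotated to the pathway
    n_selected -- genes in the list of interest (e.g. differentially expressed)
    n_overlap  -- selected genes that are also in the pathway
    """
    upper = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 20 genes tested, 5 in the pathway,
# 5 selected, 3 of which fall in the pathway.
p = enrichment_pvalue(n_universe=20, n_pathway=5, n_selected=5, n_overlap=3)
print(f"over-representation p-value: {p:.4f}")
```

In practice the p-values from many pathways would also need multiple-testing correction, which this sketch leaves out.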
https://github.com/ag48665/lusc-immune-escape-analysis
Objective: Characterize immune heterogeneity in lung squamous cell carcinoma (LUSC) at the transcriptomic level, focusing on immune activation, checkpoint signaling, and exhaustion-associated tumor states.
Data: TCGA and GEO datasets
Methods:
- Immune gene signature scoring
- T-cell exhaustion profiling
- Checkpoint signaling analysis
- UMAP visualization
- Survival analysis
- External cohort validation
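One simple way to score an immune gene signature per sample is the mean of per-gene z-scores. The sketch below assumes that approach; it is not necessarily the scoring method used in the repository. The function name `signature_scores`, the example gene list, and the expression values are hypothetical.

```python
from statistics import mean, pstdev


def signature_scores(expression, signature):
    """Mean z-score of the signature genes for each sample.

    expression -- {gene: [expression value per sample]}
    signature  -- list of gene symbols; genes missing from `expression` are skipped
    """
    genes = [g for g in signature if g in expression]
    if not genes:
        raise ValueError("no signature genes found in expression matrix")
    n_samples = len(expression[genes[0]])
    zscores = []
    for g in genes:
        values = expression[g]
        mu, sd = mean(values), pstdev(values)
        # Genes with zero variance carry no information; score them as 0
        zscores.append([(v - mu) / sd if sd > 0 else 0.0 for v in values])
    return [mean(z[i] for z in zscores) for i in range(n_samples)]


# Toy expression values for three samples (made up for illustration)
expression = {
    "PDCD1": [1.0, 2.0, 3.0],
    "LAG3": [2.0, 4.0, 6.0],
    "ACTB": [9.0, 9.0, 9.0],
}
exhaustion_signature = ["PDCD1", "LAG3", "HAVCR2"]  # illustrative gene list
scores = signature_scores(expression, exhaustion_signature)
print(scores)
```

Z-scoring each gene first keeps highly expressed genes from dominating the score.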
https://github.com/ag48665/tcga-lung-immune-evasion-scRNAseq
Objective: Explore immune cell populations and functional states in the tumor microenvironment at single-cell resolution.
Methods:
- Scanpy preprocessing and normalization
- PCA / UMAP embedding
- Clustering and cell-type annotation
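The normalization step can be sketched in plain Python as library-size scaling followed by a log transform. This mirrors the idea behind Scanpy's `normalize_total` and `log1p` but does not call Scanpy; the function name `normalize_log1p`, the default `target_sum`, and the toy counts are assumptions for illustration.

```python
from math import log1p


def normalize_log1p(counts, target_sum=1e4):
    """Scale each cell to `target_sum` total counts, then apply log(1 + x).

    counts -- list of cells, each a list of raw gene counts
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        # Cells with no counts stay all-zero instead of dividing by zero
        scale = target_sum / total if total > 0 else 0.0
        normalized.append([log1p(c * scale) for c in cell])
    return normalized


# Two toy cells with the same composition but 10x different sequencing depth
counts = [
    [1, 3, 0, 6],
    [10, 30, 0, 60],
]
norm = normalize_log1p(counts)
print(norm[0])
print(norm[1])
```

After scaling, the two toy cells have identical profiles, which is the point of depth normalization: differences in sequencing depth should not look like differences in biology.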
https://github.com/ag48665/scrna-pbmc-cell-atlas
Objective: Reconstruct immune cell populations from human PBMC single-cell RNA-seq data using an unsupervised Scanpy workflow.
Data: Public PBMC3K dataset from Scanpy (~2,700 human peripheral blood mononuclear cells)
Methods:
- Single-cell RNA-seq quality control and filtering
- Normalization and highly variable gene selection
- PCA / UMAP dimensionality reduction
- Leiden clustering
- Marker gene identification
- Cell-type annotation using canonical immune markers
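Annotation with canonical markers can be approximated by assigning each cluster the cell type whose markers have the highest average expression. The sketch below assumes that rule; the marker table is a small illustrative subset, and the cluster means are made-up numbers rather than PBMC3K results.

```python
# Illustrative subset of commonly used PBMC markers (an assumption, not the repo's list)
MARKERS = {
    "T cell": ["CD3D", "CD3E"],
    "B cell": ["MS4A1", "CD79A"],
    "Monocyte": ["CD14", "LYZ"],
    "NK cell": ["NKG7", "GNLY"],
}


def annotate_clusters(cluster_means, markers=MARKERS):
    """Label each cluster with the cell type whose markers score highest.

    cluster_means -- {cluster_id: {gene: mean expression in that cluster}}
    """
    labels = {}
    for cluster, means in cluster_means.items():
        scores = {
            cell_type: sum(means.get(g, 0.0) for g in genes) / len(genes)
            for cell_type, genes in markers.items()
        }
        labels[cluster] = max(scores, key=scores.get)
    return labels


# Hypothetical per-cluster mean expression values
cluster_means = {
    "0": {"CD3D": 2.1, "CD3E": 1.8, "NKG7": 0.4},
    "1": {"MS4A1": 1.9, "CD79A": 2.3},
    "2": {"CD14": 1.5, "LYZ": 3.2, "CD3D": 0.1},
}
labels = annotate_clusters(cluster_means)
print(labels)
```

A rule this simple will mislabel closely related populations, so it is best treated as a first pass to be checked against marker gene rankings per cluster.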
https://github.com/ag48665/Pilot-Hypoxia-Detection-using-Physiological-Signals
Objective: Develop a machine learning–based system for early detection of hypoxia in pilots using physiological signals.
Data: Multimodal physiological signals (e.g., heart rate, oxygen saturation, respiration)
Methods:
- Signal preprocessing and feature extraction
- Time-series analysis of physiological data
- Machine learning classification models
- Model evaluation (accuracy, ROC, confusion matrix)
- Data visualization and pattern detection
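As a rough sketch of the feature extraction and evaluation steps above, the code below slides a fixed window over an oxygen saturation trace and computes a pairwise ROC AUC. The function names `window_features` and `roc_auc`, the window settings, the signal values, and the labels are all hypothetical and not taken from the repository.

```python
from statistics import mean


def window_features(signal, size, step):
    """Slide a fixed-size window over a signal; return mean, min and slope per window."""
    features = []
    for start in range(0, len(signal) - size + 1, step):
        w = signal[start:start + size]
        features.append({
            "mean": mean(w),
            "min": min(w),
            # End-to-end slope per sample; negative means the signal is falling
            "slope": (w[-1] - w[0]) / (size - 1),
        })
    return features


def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy SpO2 trace in percent (made up), with a gradual desaturation at the end
spo2 = [98, 98, 97, 97, 95, 93, 91, 89]
feats = window_features(spo2, size=4, step=2)

# Hypothetical window labels: 1 = hypoxic episode, 0 = normal
window_labels = [0, 0, 1]
# Lower mean SpO2 should indicate higher risk, so negate it to use as a score
risk_scores = [-f["mean"] for f in feats]
auc = roc_auc(window_labels, risk_scores)
print(feats)
print(f"AUC = {auc:.2f}")
```

Overlapping windows (`step` smaller than `size`) give the classifier more examples per recording, at the cost of correlated samples that need care when splitting data for evaluation.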
Bioinformatics: RNA-seq analysis • differential expression (DESeq2) • survival modeling (Cox, Kaplan–Meier) • functional enrichment (GO / KEGG) • prognostic modeling • single-cell RNA-seq (Scanpy) • TCGA / GEO data analysis • biomarker discovery
Machine Learning & Data Analysis: Supervised learning • classification models • feature selection • model evaluation (ROC, AUC, confusion matrix) • time-series analysis • physiological signal processing
Programming: R (tidyverse, survival, DESeq2) • Python (pandas, numpy, scikit-learn, scanpy)
Tools & Methods: Linux • Git • reproducible workflows • statistical modeling • data visualization (ggplot2, matplotlib, seaborn) • data preprocessing • pipeline development
Email: agatagabara@gmail.com
LinkedIn: https://www.linkedin.com/in/agatha-gabara-06494a37/