Commit de94ba6

[feat] Refactor CV to one-page format for drug discovery roles
1 parent caa8a66 commit de94ba6

1 file changed

+83
-144
lines changed

index.md

Lines changed: 83 additions & 144 deletions
@@ -1,170 +1,109 @@
11
---
2-
title: Machine Learning Scientist
2+
title: Machine Learning Scientist — Drug Discovery (Foundation Models)
33
layout: default
44
---
55

6-
## Virtual Cells in Drug Discovery
6+
London, UK • linkedin.com/in/ctr26 • github.com/ctr26 • Google Scholar
7+
*Focus:* Virtual cells, multi‑modal foundation models, omics + imaging, precision medicine
78

8-
<p style="text-align: justify;">
9-
Machine Learning Scientist specialising in virtual cell development for drug discovery. Strong background in building multi-modal foundation models that integrate epigenomic, transcriptomic, and phenotypic data. Experienced in applying AI to biological problems across scales, from molecular interactions to whole-organism imaging, with a focus on practical applications in precision medicine.
10-
</p>
9+
## Professional Summary
10+
Machine Learning Scientist specialising in **virtual cell** development for drug discovery. Built and deployed **multi‑modal foundation models** that integrate knowledge graphs, text, transcriptomic and phenotypic imaging data. Experience spans molecular interactions to whole‑organism imaging. Comfortable leading across science and engineering—MLOps at TB‑scale, reproducible pipelines, and cross‑functional collaboration with biology, chemistry and platform teams.
1111

12-
## Employment History
12+
## Core Strengths
13+
Foundation models • Representation/self‑supervised learning • Multi‑modal fusion • Knowledge graphs • Biological sequence & transcriptomics • High‑content imaging • GNNs • OOD/robustness • Scaling & performance • Reproducible ML (MLOps) • Scientific communication
1314

14-
### Senior Machine Learning Scientist
15-
**Valence Labs @ Recursion Pharmaceuticals** | London, UK | Oct 2024 – Present
15+
## Experience
1616

17-
Contributing to virtual cell foundation model development at the AI-drug discovery interface:
18-
- **Virtual Cell Initiative**: Fine-tuning multi-modal LLMs on integrated omics data to create predictive biological models
19-
- **[TxPert](https://neurips.cc/virtual/2025/loc/san-diego/poster/116558)**: Co-developed SOTA transcriptomic perturbation prediction model with novel systems biology knowledge graph integration (NeurIPS 2025)
20-
- **Boltz2 Project**: Contributing to proteome-wide virtual drug screening platform
21-
- **Community Leadership**: Virtual Cell Journal Club organizer
17+
**Senior Machine Learning Scientist — Valence Labs @ Recursion Pharmaceuticals**
18+
London, UK • Oct 2024 – Present
19+
- **Virtual Cell initiative:** Fine‑tune multi‑modal LLMs over **knowledge graphs + text + RNA‑seq + phenotypic imaging** to model cell state and predict gene/drug responses; partnered closely with biology to design benchmark tasks and success metrics.
20+
- **TxPert (NeurIPS 2025):** Co‑developed a **state‑of‑the‑art transcriptomic perturbation predictor** using systems‑biology KGs; contributed training code, data curation, and ablations.
21+
- **Boltz2 project:** Contributed to proteome‑scale **virtual drug screening** components; supported evaluation strategy and error analysis across targets.
22+
- **Community:** Organiser, **Virtual Cell Journal Club**; fostered reading group bridging ML and wet‑lab teams.
2223

23-
### Senior Research Associate & AI Engineering Lead
24-
**EMBL-EBI** | Cambridge, UK | Dec 2022 – Oct 2024
24+
**Senior Research Associate & AI Engineering Lead — EMBL‑EBI (Uhlmann Group & Bio‑Image Archive)**
25+
Cambridge, UK • Dec 2022 – Oct 2024
26+
- **Team leadership:** Supervised **6 PhD students**; established coding standards, CI, and peer‑review practices used across the lab.
27+
- **Spatial biology:** Built deep‑learning pipelines for **high‑content cell morphology** and single‑cell feature learning; integrated with public bioimage resources.
28+
- **Open‑source:** Created **bioimage_embed** (self‑supervised biological images) and **shape_embed** (cell‑shape DL toolkit); productionised training/inference.
29+
- **MLOps:** Designed scalable pipelines processing **TB‑scale microscopy datasets** across HPC and cloud; containerised workflows, automated experiment tracking.
30+
- **Academic service:** Reviewer — ISBI 2022/2023, **ICASSP** 2024.
2531

26-
Led AI initiatives for the Uhlmann Group and Bio-Image Archive:
27-
- **Leadership**: Supervised 6 PhD students, establishing coding standards and peer review processes
28-
- **Spatial Biology**: Developed deep learning approaches for high-content cell morphology analysis
29-
- **[bioimage_embed](https://github.com/ctr26/bioimage_embed)**: Created self-supervised learning framework for biological images
30-
- **[shape_embed](https://github.com/ctr26/shape_embed)**: Deep learning toolkit for cell shape analysis
31-
- **MLOps**: Built scalable pipelines processing TB-scale microscopy datasets
32-
- **Academic Service**: Reviewer for ISBI 2022/2023, ICASP 2024
32+
**AI/ML Founding Engineer — Amun AI AB**
33+
Stockholm, Sweden • 2022 – 2024
34+
- Built **GKE/Kubernetes** platform for model serving with **NVIDIA Triton/KServe**; supported **100+ models** for **30+ daily users** with auth, monitoring and autoscaling.
3335

34-
### AI/ML Founding Engineer
35-
**Amun AI AB** | Stockholm, Sweden | 2022 – 2024
36+
**AI/ML Engineering Consultant — DeepMirror**
37+
Cambridge & London, UK • 2022 – 2024
38+
- **MouseMindMapper:** Automated brain‑histology segmentation product generating **£50k annual revenue**; delivered end‑to‑end data, training, packaging and docs.
39+
- Wrote a high‑performance **C++ cheminformatics fingerprinting** library for production use.
3640

37-
- **NVIDIA Partnership**: Scalable Kubernetes-based model serving infrastructure
38-
- **Production Systems**: GKE infrastructure serving 100+ models to 30+ daily users
41+
**Data Scientist — Brazma Group, EMBL‑EBI**
42+
Cambridge, UK • Dec 2019 – Dec 2023
43+
- Co‑authored the successful **AI4LIFE €5M** grant (federated bioimage AI infrastructure); contributed to platform architecture and model‑sharing strategy.
44+
- Drove **large‑scale AI microscopy** analyses in the Image Data Resource; collaborated with **Google Cloud** on representation learning.
45+
- Taught annual deep‑learning courses to **40+ researchers** (PhD to PI).
3946

40-
### AI/ML Engineering Consultant
41-
**DeepMirror** | Cambridge & London, UK | 2022 – 2024
47+
**Software Engineer (COVID‑19 Response) — European Nucleotide Archive, EMBL‑EBI**
48+
Cambridge, UK • Mar 2020 – Sept 2020
49+
- Built CI/CD for the **COVID‑19 Data Portal** to enable **daily global data updates**.
50+
- Scaled NGS alignment and **Nextflow/Kubernetes** ETL pipelines for surging data volumes.
4251

43-
- **[MouseMindMapper](http://github.com/deepmirror)**: Automated brain histology segmentation (£50k annual revenue)
44-
- **Cheminformatics**: High-performance C++ molecular fingerprinting library
45-
46-
### Data Scientist
47-
**Brazma Group, EMBL-EBI** | Cambridge, UK | Dec 2019 – Dec 2023
48-
49-
- **Education**: Annual deep learning courses for 40+ participants (PhD to PI level)
50-
- **[AI4LIFE Grant](https://ai4life.eurobioimaging.eu/)**: Co-wrote successful €5M proposal for federated AI infrastructure
51-
- **AI4LIFE Platform**: Contributing to the development of federated bioimage analysis tools and model sharing
52-
- **Image Data Resource**: Large-scale AI-driven microscopy analysis
53-
- **Industry Partnerships**: Google Cloud collaboration for representation learning
54-
55-
### Software Engineer (COVID-19 Response)
56-
**European Nucleotide Archive, EMBL-EBI** | Cambridge, UK | Mar 2020 – Sept 2020
57-
58-
- **[COVID-19 Data Portal](https://www.covid19dataportal.org/)**: CI/CD pipelines enabling daily global data updates
59-
- **NGS Processing**: Scaled sequence alignment pipelines for large data volumes
60-
- **Infrastructure**: Containerized ETL pipelines using Nextflow and Kubernetes
61-
62-
### Computational Microscopist
63-
**National Physical Laboratory** | London, UK | 2018 – Dec 2019
64-
65-
- Novel 3D organoid segmentation algorithms for cancer research
66-
- MSquared consultancy on advanced imaging solutions
52+
**Computational Microscopist — National Physical Laboratory**
53+
London, UK • 2018 – Dec 2019
54+
- Developed novel **3D organoid segmentation** methods for cancer research; delivered consultancy to MSquared on advanced imaging.
6755

6856
## Education
6957

70-
### PhD in Engineering
71-
**University of Cambridge** | 2014 – 2018 | EPSRC PES-CDT Studentship
72-
*Thesis*: "Light-sheet microscopy for tracking particles in large specimens"
73-
*Focus*: Microscope automation and biological image analysis
74-
- Designed and built novel light-sheet microscope with automated acquisition
75-
- Developed algorithms for particle tracking and optimising signal collection
76-
- Created homographic signal generation and micrometer-scale tomography algorithms
77-
- Supervised 3 students (2 MRes, 1 BSc)
78-
79-
### MRes in Photonics
80-
**University of Cambridge & UCL** | 2013 – 2014 | [EPSRC Photonics CDT](https://www.pes-cdt.org/)
81-
- Structured illumination microscopy reconstruction software
82-
- Modules: Computer Vision, Quantum Mechanics, Photonics
83-
84-
### MSci in Physics (First Class Honours)
85-
**Nottingham Trent University** | 2009 – 2013
86-
- Top physics student in graduating class
87-
- Mountaineering Club President (2011-2012)
88-
89-
## Patents
90-
91-
<div id="patents-list">
92-
<!-- Patents will be loaded dynamically -->
93-
</div>
94-
95-
## Selected Publications
96-
97-
<div id="publications-list">
98-
<!-- Publications will be loaded dynamically via API only -->
99-
</div>
100-
101-
<script src="{{ '/assets/js/publications.js' | relative_url }}"></script>
102-
103-
<a href="https://scholar.google.com/citations?user=XVt7BYQAAAAJ&hl=en" target="_blank" rel="noopener">View all publications on Google Scholar →</a>
58+
**PhD, Engineering — University of Cambridge** • 2014 – 2018 (EPSRC PES‑CDT)
59+
*Thesis:* “Light‑sheet microscopy for tracking particles in large specimens”
60+
- Designed & built a novel light‑sheet microscope with automated acquisition.
61+
- Algorithms for particle tracking, signal optimisation, and micrometre‑scale tomography.
62+
- Supervision: 2× MRes, 1× BSc.
10463

105-
## Technical Skills
64+
**MRes, Photonics — University of Cambridge & UCL** • 2013 – 2014 (EPSRC Photonics CDT)
65+
- Structured‑illumination microscopy reconstruction; modules in Computer Vision, Quantum Mechanics, Photonics.
10666

107-
### Machine Learning & AI
108-
- **Foundation Models**: Multi-modal LLM fine-tuning, OOD prediction, self-supervised learning
109-
- **Deep Learning**: PyTorch, TensorFlow, Lightning, probabilistic programming (Pyro.ai), Hugging Face
110-
- **Computer Vision**: Bio-image analysis, 3D reconstruction, segmentation, super-resolution
111-
- **Biological Data**: snRNAseq, bulk RNAseq, histopathology, fluorescence imaging, GNNs for systems biology
67+
**MSci, Physics (First‑Class Honours) — Nottingham Trent University** • 2009 – 2013
68+
- Top physics graduate; President, Mountaineering Club (2011–2012).
11269

113-
### Programming & Infrastructure
114-
- **Languages**: Python (8+ years), R, MATLAB, C++, Java
115-
- **Scientific Computing**: NumPy, SciPy, Pandas, scikit-learn, OpenCV, scikit-image
116-
- **GPU Computing**: Multi-GPU training on NVIDIA A100/V100 clusters, CUDA optimization, distributed training across HPC environments, SLURM job scheduling, GCP/AWS GPU instances
117-
- **Cloud & DevOps**: GCP, AWS, Kubernetes, Docker, CI/CD, Terraform
118-
- **Workflow Management**: Snakemake, Nextflow, Apache Airflow
119-
- **Model Deployment**: NVIDIA Triton, KServe, MLflow
70+
## Selected Publications & Preprints
71+
- See **Google Scholar** for the full list. Include **3–5** most relevant to drug discovery/foundation models directly here when submitting to roles.
72+
- Example placeholders:
73+
- *TxPert: Transcriptomic Perturbation Prediction with Systems‑Biology KGs* (NeurIPS 2025).
74+
- *Self‑supervised representation learning for biological images* (bioimage_embed).
12075

121-
## Open Source Projects
122-
123-
### Major Contributions
76+
## Patents
77+
- If applicable, list 1–3 relevant filings with **title • number • year • brief contribution**. Remove this section if none.
12478

125-
- **[bioimage_embed](https://github.com/ctr26/bioimage_embed)**: Self-supervised learning framework for biological image analysis
126-
- **[shape_embed](https://github.com/ctr26/shape_embed)**: Deep learning toolkit for cell shape analysis
127-
- **[pydeconv](https://github.com/ctr26/pydeconv)**: GPU-accelerated deconvolution library for microscopy
128-
- **[nuclear_phenotyping](https://github.com/ctr26/nuclear_phenotyping)**: Automated nuclear morphology analysis pipeline
79+
## Open Source (Selected)
80+
- **bioimage_embed** — Self‑supervised learning for biological images.
81+
- **shape_embed** — Deep‑learning toolkit for cell‑shape analysis.
82+
- **pydeconv** — GPU‑accelerated deconvolution for microscopy.
83+
- **nuclear_phenotyping** — Automated nuclear morphology analysis.
84+
- Contributions to **Hypha Platform**, **BioImage Model Zoo**, **BIA Binder**, **Hypha Helm Charts**, **COVID Workflow Manager**.
12985

130-
### Infrastructure Projects
86+
## Skills
13187

132-
- **[Hypha Platform](https://hypha.imjoy.io/)**: Distributed computing platform for bioimage analysis
133-
- **[BioImage Model Zoo](https://bioimage.io)**: Community-driven repository for deep learning models
134-
- **[BIA Binder](https://binder.bioimagearchive.org/)**: Interactive notebook platform for bioimage analysis
135-
- **[Hypha Helm Charts](https://github.com/amun-ai/hypha-helm-charts)**: Kubernetes deployment templates
136-
- **[COVID Workflow Manager](https://github.com/enasequence/covid-workflow-manager)**: ETL pipeline orchestration
88+
**ML & AI:** Foundation‑model fine‑tuning, contrastive/self‑supervised learning, OOD & uncertainty, evaluation/ablation design
89+
**Frameworks:** PyTorch, TensorFlow, Lightning, Pyro, Hugging Face, scikit‑learn
90+
**Vision & Bio:** Bioimage analysis, 3D reconstruction, segmentation/super‑resolution, **snRNA‑seq/bulk RNA‑seq**, histopathology, fluorescence imaging, **GNNs**, knowledge graphs
91+
**Languages:** Python (primary), R, MATLAB, C++, Java
92+
**Compute:** Multi‑GPU training (A100/V100), CUDA, distributed training, SLURM, HPC, GCP/AWS GPU instances
93+
**MLOps/Infra:** Kubernetes, Docker, **NVIDIA Triton**, **KServe**, MLflow, CI/CD, Terraform
94+
**Workflows:** Nextflow, Snakemake, Apache Airflow
13795

13896
## Grants & Awards
139-
140-
- **AI4LIFE** (2022): Co-investigator on €5M EU grant for federated bioimage AI infrastructure
141-
- **EPSRC CDT Studentship** (2013-2018): [Photonic and Electronic Systems CDT](https://www.pes-cdt.org/) full funding (£120k)
142-
- **Nuffield Research Bursary** (2012): Computer vision for liquid crystal flow analysis
143-
- **Institute of Physics Grant** (2009-2012): Household means support
144-
145-
## Teaching & Leadership
146-
147-
### Course Development & Delivery
148-
- **Deep Learning for Bioimage Analysis** (2019-2023): Annual course at EMBL-EBI for 40+ participants
149-
- **Mathematics Tutorials** (2015-2018): First-year Natural Sciences at Magdalene College, Cambridge
150-
- **C Programming Workshops** (2015): Clare College Summer School
151-
- **Quantum Mechanics & Scientific Ethics** (2014-2016): Reach Summer School, Cambridge
152-
153-
### Research Supervision
154-
- 6 PhD students in AI and spatial biology (2022-2024)
155-
- 3 project students during PhD (2014-2018)
156-
157-
### Community Leadership
158-
- Lower Boats Captain & Coach, Magdalene College (2014-2018)
159-
- Mountaineering Club President, NTU (2011-2012)
160-
161-
---
162-
163-
## Professional Service
164-
165-
- **Peer Review**: Nature Methods, Scientific Reports, Journal of Microscopy, ISBI, ICASP
166-
- **Conference Presentations**: FOM (2018, 2022, 2023), MMC (2018, 2022), CBIAS (2023)
167-
- **Open Source**: Maintainer of multiple scientific software packages
168-
- **Professional Membership**: Institute of Physics
169-
170-
*References available upon request*
97+
- **AI4LIFE** (2022) — Co‑investigator on **€5M** EU grant (federated bioimage AI)
98+
- **EPSRC CDT Studentship** (2013–2018) — Photonic & Electronic Systems CDT (£120k)
99+
- **Nuffield Research Bursary** (2012) — Computer vision for liquid‑crystal flows
100+
- **Institute of Physics** grant support (2009–2012)
101+
102+
## Teaching, Mentoring & Service
103+
- **Course lead:** Deep Learning for Bioimage Analysis (2019–2023), 40+ participants/year
104+
- **Supervision:** 6 PhD students (AI & spatial biology) + 3 project students (PhD years)
105+
- **Peer review:** Nature Methods, Scientific Reports, Journal of Microscopy, ISBI, **ICASSP**
106+
- **Talks & conferences:** FOM (2018, 2022, 2023), MMC (2018, 2022), CBIAS (2023)
107+
- **Community leadership:** Rowing captain/coach (Magdalene College), Mountaineering Club President (NTU)
108+
109+
*References available upon request.*
