Gaurav Sablok, PhD
Research and Academic: I am applying widely for Bioinformatician, Computational Biology and Genomics, NGS Specialist, Research Scientist, Scientific Software Developer, Bioinformatics CTO and Lead Researcher roles, including senior positions such as Senior CTO, Senior Computational Lead and Bioinformatician/Associate. I am ready to accept a position or job offer immediately and will not delay for any reason. I work in the areas of RNA-seq, single-cell, metagenomics and pangenomics across all species, and I provide bio-software development, bioinformatics, and machine and deep learning for Illumina and long-read sequencing technologies. I prefer research roles in universities and organizations, and I read and write full-stack RUST for scientific and academic research. As part of my scientific and academic work, I am also open to giving language classes to academics and research students.
Sequencing:
2010-2021: Plant, Bacterial, Fungal: RNA-seq, Genome-seq, Phylogenomics, PacBio Sequencing, Single Cell Analysis, Bio-software.
2021-2023: Machine Learning, Bio-software.
2024: PanGenome, Bio-software.
2025: Human Genomics, Bio-software
Software Development: C++ (2010-2021), RUST (2024-)
Web Development: RUST 🦀 using Axum, Rocket, Dioxus, Actix, Warp, Yew, Leptos
Desktop, Terminal and HPC Development: egui, Iced, Druid, GLTK, Ratatui.
Machine and Deep Learning: RUST using Burn, Tch, Linfa, SmartCore, Candle.
Background:
I have a background in bioinformatics and extensive programming experience (before 2021: C++, Bash, R; after 2021: Bash, Python, RUST) covering data analysis and data science, machine and deep learning, and web and application development. After my PhD, I developed bioinformatics methods and software for transcriptional and post-transcriptional genomics across nuclear and organelle genomes at Fondazione Edmund Mach (Italy). I analyzed and completed multiple RNA-seq and organelle-seq experiments for several plant and fungal species, including Arundo donax. I also carried out multiple metagenomics analyses of fungal and bacterial species, including ITS metagenomics and bacterial metagenomics. In addition, I have done a lot of work in organelle genomics and published the first chloroplast genomes of Cardamine species. I independently established an international partnership to find and create computational methods for a number of crop species. Following that, I spent two years (2014–2016) as a Research Fellow at the University of Technology, Sydney, Australia, where I developed computational methods for understanding seagrasses and for the computing cluster. I then spent a short time at the University of Connecticut, USA, where I analyzed the Douglas fir genome, from genome annotation to phylogenomics, identifying important genes and their evolution.
From August 2017 to December 2021, I worked as a Postdoctoral Researcher at the Finnish Museum of Natural History and the University of Helsinki, conducting research on genome bioinformatics and sequencing the genomes of lower plants, including Coleochaete orbicularis, Blasia pusilla, Chaetospiridium orbicularis, Polytrichum commune, Mallomonas, and Cryptomonas species, and developing the bioinformatics and software for the HPC and the museum. My work focused on genome assembly, genome annotation, chloroplast genomics, and a variety of other topics. I have also worked for various other organisations, for example in Edinburgh, UK, to analyse the genomics data for PAFTOL species and the chloroplast genomes of the Ambrosia clade from Norway. Since 2019, my research has shifted to examining the genomes of fungi sequenced using NextSeq. This work is currently concentrated on genome assembly, annotation, marker genes, and phylogenomics of those fungi. I have assembled and annotated the fungal genomes of over 500 different species, identified ITS and other phylogenomic markers, and performed alignments, phylogenies, and downstream analyses on them. Up to this point, the main area of my research has been the bioinformatics application of high-throughput sequencing and methods to understand the biological and functional importance of genes, evolution, and pathways in plants. From 2022-2023: Career advancement. From 2024 onwards, I worked at Universität Potsdam, Germany, where I taught myself RUST and developed approaches for machine and deep learning. There, I benchmarked PacBio HiFi genome analysis and created a complete HTML, CSS and JavaScript enabled web application. From 2025, I worked as Area Expert at Instytut Chemii Bioorganicznej Polskiej Akademii Nauk, Poland, where I worked on human genomics and developed computational approaches and software for it.
Bioinformatics Software Research: Over the years, I have worked with several languages: C++ (2010-2021), R (2010-), Python (2021-) and RUST (2024-). I have mostly relied on one systems programming language at a time, first C++ (2010-2021) and then RUST (2024-). I still use Python and R, but only for analysis and for machine and deep learning. I am a fanatic reader and coder, and I wrote and developed all the software as the single lead bioinformatician at every employment. Since 2024, I have used RUST as a full stack for bioinformatics, software, cloud and HPC management, web development, and machine and deep learning, with Python for data analysis and machine learning. Using one or two selected languages for everything has allowed me to scale my abilities at every employment, where I served as the single lead bioinformatician covering software development, bioinformatics data analysis, machine learning, and HPC and cloud management. I actively fork repositories that are of use to me. I do not vibe code or use language models. A few of my repositories are listed below, and some of these are under active development.
I am actively coding for multiple job offers, so some of these repositories are under active development. Treat the last commit tag in each source repository as its final build release.
- eVaiUtilities: Variant analysis from the eVai.
- panscape: Pangenome long reads
- sequenceprofiler: Profiling sequence kmers for histograms
- phyloevolve: Long reads and alignments from the multiple alignments.
- hpcMapper: DevOps system management for high-performance computing.
- bacdive: Bacterial genome analysis from Bacdive.
- NLRanalyzer: Complete kit for analyzing NLR.
- humanCAST: Complete kit for human genome analysis.
- araseq: Complete kit for the Arabidopsis genome information.
- minifyseq: Noise removal from long reads, including machine learning based removal.
- CAGanalyzer: Analyzing the CAG repeats from the human genome.
- doiTAG: Generating doi for the sequences for next generation sequencing.
- varLinker: Analyzing and linking variants for annotation.
- vcfFilter: Filtering VCF files
- rustRet: Analyzing mass spectrometry data.
- repgnerate: Analyzing the sequencing information post sequencing.
- proteogenomics: implementing the proteogenomics methods.
- geomapper: a complete kit for geospatial analysis for German geo.
- ensemblcov: a complete kit for using the ensemblcov at your commandline.
- bactiPAN: complete bacterial pangenome analyzer both in shell and RUST.
- varView: Graphics enabled Variant terminal analyzer in RUST.
- doseqGO: A complete sequence information portal for standalone sequence information.
- varview: Parallel threaded SAM viewer and filter.
- fastscan: Scan high-throughput sequencing files.
- vcfscan: Scan all the variants and filter the variants.
- bacencode: Autoencoder and Decoder for any genome with variables.
- accusnv-rust: RUST version of accusnv for variant annotation.
- rustshap: SHAP values implemented for RUST machine learning evaluation.
- dgnseq: Implementing a GNN for genomic sequences.
- rustshap: Implementing SHAP in RUST.
- webml: bringing web to the machine learning.