wanunulab/NanoCortex

NanoCortex

Welcome to NanoCortex! This is the official codebase for NanoCortex: A Unified Agentic System for Nanopore Sequencing Analysis.

We are dedicated to advancing the nanopore sequencing field by building intelligent, agent-driven tools that simplify analysis, improve reproducibility, and accelerate biological discovery.

NanoCortex aims to bridge fragmented nanopore software ecosystems through automated workflows, adaptive reasoning, and seamless integration with existing community tools. Our goal is to empower researchers to extract deeper insights from nanopore data with minimal manual intervention.

We welcome contributions from the community and hope NanoCortex will serve as a foundation for the next generation of intelligent nanopore analysis.

[Figure]


Overview

NanoCortex is a unified, autonomous agentic framework for end-to-end nanopore data processing, from raw-signal basecalling through biological interpretation.

Framework

[Figure: NanoCortex framework]

Tool list

The following table summarizes external tools and resources integrated in NanoCortex, their primary functions, their implementation within the framework (AgentTool vs FunctionTool), and links to their official repositories or resources.

| Tool / Resource | Primary Function(s) | Implementation | Resource Link |
|---|---|---|---|
| Dorado | Basecalling and modification detection (e.g., m5C, m6A, pseudouridine). | AgentTool | https://github.com/nanoporetech/dorado |
| Modkit | Quantitative modification frequency reporting and data manipulation. | AgentTool | https://github.com/nanoporetech/modkit |
| Remora | Signal-level analysis and visualization of raw nanopore data. | AgentTool | https://github.com/nanoporetech/remora |
| StringTie2 | De novo transcriptome assembly and isoform quantification. | AgentTool | https://github.com/skovaka/stringtie2 |
| FLAIR | Isoform-level analysis, splicing dynamics, and fusion gene detection. | AgentTool | https://github.com/BrooksLabUCSC/flair |
| RNA-FM | Large-scale RNA foundation model for secondary structure prediction and embedding generation. Downstream modules include splice site prediction and support fine-tuning for various RNA tasks. | AgentTool | https://github.com/ml4bio/RNA-FM |
| NCBI BLAST | Sequence similarity searching and iterative reference genome refinement. | FunctionTool | https://blast.ncbi.nlm.nih.gov/Blast.cgi |
| GTEx Database | Integration of physiological transcriptomic baselines for clinical/biological comparison. | AgentTool | https://gtexportal.org/home/ |
| PubMed / PMC | Automated literature retrieval for context-aware interpretation of results. | AgentTool | https://pubmed.ncbi.nlm.nih.gov/ |
| MODOMICS | Retrieval of curated RNA modification annotations and biochemical pathways. | AgentTool | https://genesilico.pl/modomics/ |
| Google Search | General knowledge discovery and retrieval of web-accessible resources. | AgentTool | https://www.google.com |
| Custom R/Python Scripts Generation and Execution | Generation of code for bioinformatics tools, including samtools, minimap2, and visualization workflows (e.g., density plots, modification landscapes), enabled by autonomous self-correction, validation, and execution. | FunctionTool | R: https://www.r-project.org/; Python: https://www.python.org/; samtools: https://github.com/samtools/samtools; minimap2: https://github.com/lh3/minimap2; bedtools: https://github.com/arq5x/bedtools2 |

Installation

Note: Detailed installation instructions and binary releases will be made available upon publication.

Note: We highly recommend asking NanoCortex to save the code it generates to your working folder each time. Reusing saved scripts helps save tokens.

Step 1: Set Up Environment

For reproducibility and compatibility, we recommend creating a dedicated Conda environment for NanoCortex and its dependencies. To install the core Google ADK component, please refer to the official Conda-Forge feedstock: google-adk-feedstock.

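# Install Python and pip into the new environment so that pip targets it.
# The Python version is an assumption; the pinned pandas and SciPy releases need Python 3.11 or newer.
conda create -n adk_env python=3.11 pip
conda activate adk_env
pip install \
    google-adk==1.22.1 \
    pandas==3.0.1 \
    scipy==1.17.1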
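
A common pitfall at this step is that `pip` resolves to an interpreter outside the new environment, so the packages end up somewhere else. The following is a minimal sketch of a check, assuming a POSIX shell and that Conda sets `CONDA_PREFIX` on activation; the helper name `python_in_active_env` is hypothetical and not part of NanoCortex.

```shell
# Hypothetical helper: succeed only if the "python" found on PATH lives inside
# the active Conda environment (conda sets CONDA_PREFIX on activation).
python_in_active_env() {
    [ -n "${CONDA_PREFIX:-}" ] || return 1
    py_path=$(command -v python 2>/dev/null) || return 1
    case "$py_path" in
        "$CONDA_PREFIX"/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

After `conda activate adk_env`, running `python_in_active_env && echo "environment OK"` should print the message; if it does not, installing with `python -m pip install ...` from a correctly activated environment is usually the safer route.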

Step 2: Register for a Google Cloud Account and Obtain an API Key

Certain key functionalities within NanoCortex require authenticated access to Google Cloud. You need to create a Google Cloud account and generate an API key. Keep your API key strictly confidential—never share it publicly or commit it to version control.

  1. Navigate to Google Cloud Console and sign in, or create a Google account if you do not already have one.
  2. Set up a new project as needed, and enable the relevant services (for example, Vertex AI and ADK).
  3. Generate your API key by following the instructions at Google Cloud Console – API Credentials.
  4. Refer to the official Google tutorial for detailed, step-by-step guidance on agent workforce configuration: Build Your First ADK Agent Workforce. Additional helpful documentation can be found at Google AI Studio.
  5. Store your API key securely on your local machine, and place it in nano_plus/.env.

Critical: Your API key is private and should be treated as a sensitive credential. Never expose, publish, or upload your key in any public repository or third-party service.
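
One way to carry out item 5 without loosening file permissions is sketched below. It assumes the agent reads the key from a variable named `GOOGLE_API_KEY`, as ADK's own quickstart does; check this against what your checkout of nano_plus expects, since the README does not name the variable. The helper name `write_env` is hypothetical.

```shell
# Hypothetical helper: write a .env file that only the owner can read.
# Assumes the agent expects GOOGLE_API_KEY; adjust the variable name if your
# nano_plus checkout uses a different one.
write_env() {
    env_file=$1
    api_key=$2
    (
        umask 077   # a newly created file gets mode 600
        printf 'GOOGLE_API_KEY=%s\n' "$api_key" > "$env_file"
    )
    chmod 600 "$env_file"   # also tighten the mode if the file already existed
}
```

In bash, `read -r -s KEY && write_env nano_plus/.env "$KEY"` reads the key from the terminal rather than the command line, which keeps it out of shell history. Listing `.env` in `.gitignore` guards against committing it by accident.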

Step 3: Download the GitHub Files

Download the entire repository from our GitHub page, using git clone or any other method you prefer.

Step 4: Install Singularity

Please follow the official Singularity documentation to install Singularity on your system.

We recommend downloading the pre-built Singularity image instead of building it yourself: 👉 NanoCortex pre-built Singularity (hosted on Hugging Face). Follow the link and download the files. Because the files are large, we suggest downloading them on a cluster or HPC system. Place "bot.sif" in NanoCortex/singularity (the same directory as "bot.def").

If you have already downloaded the bot.sif we provide, skip the build commands below and go straight to validation.

cd NanoCortex/singularity
singularity build bot.sif bot.def

This will generate a bot.sif image encompassing all required dependencies and software components as specified in bot.def. Once the image is built, you can execute and interact with NanoCortex directly within this container.
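
The "build only if you did not download the image" rule can be written as a small guard. This is a sketch under the assumption that a file named bot.sif in the current directory is a complete image; the helper name `ensure_image` is hypothetical.

```shell
# Hypothetical helper: build the image only when it is not already present.
ensure_image() {
    image=$1
    definition=$2
    if [ -f "$image" ]; then
        echo "found $image, skipping build"
        return 0
    fi
    singularity build "$image" "$definition"
}
```

From NanoCortex/singularity, `ensure_image bot.sif bot.def` would reuse a downloaded image and fall back to a local build otherwise. It does not detect a partially downloaded file, so the validation command is still worth running.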

Validation:
To confirm that Singularity and the necessary tools have been installed successfully, run the following command:

singularity exec bot.sif modkit --help

If the command returns the Modkit help information without error, your installation is successful and the environment is fully operational.
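
The same check can be repeated for several tools at once. The sketch below assumes each tool accepts `--help` and exits with status 0 when it does; a tool that exits non-zero on `--help` would be reported as missing even if it is installed. Only modkit is confirmed above as present in the image, so any other tool names you pass are assumptions to verify. The helper name `check_tools` is hypothetical.

```shell
# Hypothetical helper: run "<tool> --help" inside the image for each named tool
# and report which ones respond. Returns non-zero if any tool fails.
check_tools() {
    image=$1
    shift
    failed=0
    for tool in "$@"; do
        if singularity exec "$image" "$tool" --help > /dev/null 2>&1; then
            echo "ok      $tool"
        else
            echo "MISSING $tool"
            failed=1
        fi
    done
    return "$failed"
}
```

For example, `check_tools bot.sif modkit samtools minimap2` prints one line per tool and returns a non-zero status if any of them did not respond.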

Note: Ensure you have appropriate permissions and Singularity installed before running the build command. For detailed usage instructions and information on running the container, please refer to the official documentation.

If additional third-party tools or dependencies not encapsulated in the container are required, you will find comprehensive setup and usage guides in the project documentation and supplementary materials.

Note: If you encounter network restrictions or dependency-related issues during local Singularity image construction (e.g., failed apt-get installation, unavailable external repositories, or HPC-specific environment conflicts), use the pre-built bot.sif image described above instead of building locally.

Step 5: Start Using NanoCortex

cd ../nanoporePlus
adk web --host 0.0.0.0 --port 8000

Open http://localhost:8000 in your browser, then select the nano_plus folder in the top-left corner. You can now start using NanoCortex!
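
If `adk web` runs on a cluster or HPC node, as suggested for the large image files, http://localhost:8000 on your own machine will not reach it until the port is forwarded. The following sketch prints an SSH command for that, using the standard `ssh -N -L` options. The host name is a placeholder and the helper name `tunnel_cmd` is hypothetical; on clusters where jobs run on compute nodes, the forwarding target may need to be the compute node rather than `localhost`.

```shell
# Hypothetical helper: print the SSH command that forwards a local port to the
# port where "adk web" is listening on the remote machine.
tunnel_cmd() {
    remote=$1          # e.g. user@cluster.example.org (placeholder)
    port=${2:-8000}    # must match the --port passed to adk web
    echo "ssh -N -L ${port}:localhost:${port} ${remote}"
}

tunnel_cmd user@cluster.example.org 8000
# prints: ssh -N -L 8000:localhost:8000 user@cluster.example.org
```

Run the printed command on your local machine, leave it open, and then browse to http://localhost:8000. With a tunnel in place, binding `adk web` to 127.0.0.1 instead of 0.0.0.0 would likely be enough and avoids exposing the port to the rest of the network.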

Quick start

Below are three example use cases for the NanoCortex agent framework. Example 1 uses HEK293T WT data; Examples 2 and 3 use the HeLa WT vs. TRUB1 KO context. The prompts are templates: substitute your own file paths, region coordinates, and folder names. Expected output is left blank for your own benchmarking.

| Example | Prompt | Notes |
|---|---|---|
| 1 | "Using the HEK 293T WT BAM at [WT_bam_path], investigate unaligned reads: where might they come from? Extract or summarize representative sequences and help interpret them with BLAST (or an equivalent search). Output any files in the current working directory." | WT BAM; unaligned-read provenance; BLAST-oriented follow-up. |
| 2 | "You are given a pileup file for HeLa wild-type (WT) and TRUB1 knockout (KO): [pileup_path], which comes from modkit dmr. Identify sites downregulated in the KO relative to WT, and produce a sequence logo (9-mer) characterizing the motif at those downregulated KO sites. Output files should be saved to the current working directory." | Pileup input; differential sites; 9-mer logo for KO downregulated loci. |
| 3 | "For the motif associated with sites downregulated in TRUB1 KO: what is it, and what are its sequence or context preferences? Search the literature (e.g. Google Scholar) and summarize whether published mechanisms align with TRUB1 KO and RNA modification biology. Output any summary files or results to the current path." | Motif interpretation + literature; cross-check with TRUB1 KO. |

Note: All output files generated by these prompts should be saved to the current working directory. Fill in other columns as you test—this allows flexible benchmarking and detailed record-keeping as you develop or extend NanoCortex.

Note: Our example HeLa data are available in the NCBI Sequence Read Archive under accession PRJNA1459519. For quick testing, we provide a subsampled, cleaned FASTA file of unaligned reads from the HEK293T cell line at example/data/hek293t_unalign_1000.fasta (Example 1). We also provide a subsampled pileup file at example/data/KO_vs_WT_diff_mod_chr13.bed, derived from the HeLa cell line for WT and TRUB1 KO (Examples 2 and 3). See our paper for more information about these data.
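
Before pointing a prompt at the bundled files, a quick look at their size and shape can catch a truncated download. The sketch below uses only standard grep and awk; the helper names are hypothetical and nothing here is part of NanoCortex itself. It makes no assumption about what the columns of the pileup file mean.

```shell
# Count FASTA records by counting header lines (those starting with ">").
count_fasta_records() {
    grep -c '^>' "$1"
}

# Number of tab-separated columns on the first non-comment line of a BED-like file.
first_data_line_columns() {
    awk -F '\t' '!/^#/ { print NF; exit }' "$1"
}
```

For example, `count_fasta_records example/data/hek293t_unalign_1000.fasta` reports the number of reads in the Example 1 file, and `first_data_line_columns example/data/KO_vs_WT_diff_mod_chr13.bed` reports how many columns the Example 2 file carries.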


Citation

If you use NanoCortex or any components of this ecosystem in your research, please cite:

BibTeX:

@article{nanocortex2026,
  title   = {NanoCortex: A Unified Agentic System for Nanopore Sequencing Analysis},
  author  = {Xia, Qini and Wang, Ziyuan and Shokoufandeh, Mina and Rouhanifard, Sara H. and Wanunu, Meni},
  journal = {bioRxiv},
  year    = {2026},
  doi     = {10.64898/2026.05.19.726254}
}

The preprint is available at: https://www.biorxiv.org/content/10.64898/2026.05.19.726254v1



License

MIT License — see LICENSE for details.

Contact

About

NanoCortex is a unified agentic system for nanopore sequencing analysis from wanunulab. Created by Qini Xia.
