
nanalogue-gui

Electron GUI for Nanalogue: interactive sequence data analysis and curation with a focus on single molecules and DNA/RNA modifications.


Nanalogue-gui provides a desktop application for working with BAM/CRAM/Mod-BAM files, with a focus on single-molecule DNA/RNA modifications. It builds on @nanalogue/node to provide interactive visualisation and curation workflows.

Nanalogue-gui is part of the Nanalogue family of tools for nanopore data analysis. Sister packages include nanalogue (Rust CLI and library), pynanalogue (Python wrapper), and @nanalogue/node (Node.js bindings).

Landing page


Requirements

  • Node.js 22 or higher

Installation

You can download release binaries from GitHub: look at the binaries attached to each release and download the one for your platform (macOS/Linux). For Windows, we recommend running our Linux binary under the Windows Subsystem for Linux (WSL); see Microsoft's WSL documentation or an equivalent guide to learn about WSL.

NOTE: On macOS, if you download the binary, you may have to dismiss a Gatekeeper warning saying the developer is unknown. Avoiding this warning would require us to enrol in the Apple Developer Program, which we have chosen not to do. Please note that this project is open source, so you are free to inspect the source code here, and you can always build from source to avoid such warnings.

To build from source, use the commands below. You will first need utilities such as git and npm installed.

git clone https://github.com/sathish-t/nanalogue-gui.git
cd nanalogue-gui
npm install

Alternatively, if you only want the nanalogue-chat CLI (no Electron GUI), clone the repo, build, and link it globally:

git clone https://github.com/sathish-t/nanalogue-gui.git
cd nanalogue-gui
npm install
npm run build
npm link

This puts the nanalogue-chat command on your PATH, so you can run it from any directory. See the CLI section for usage.

Usage

If you installed the app from a binary, launch it as you normally would. If you built it yourself, you can launch it from the command line like this:

npm start

This launches the landing page where you can choose between QC, Swipe, Locate Reads, and AI Chat modes. Three font-size buttons (small/medium/large) in the header scale all text in the app, including chart labels and legend.

Modes

Mode overview

| Mode | Entry point | Input | Output | Key use case |
|------|-------------|-------|--------|--------------|
| QC | GUI button | BAM/CRAM/Mod-BAM file or URL | Interactive charts (no file output) | Assess read quality, length distribution, modification patterns |
| Swipe | GUI button | BAM/CRAM + BED file (annotations) | Modified BED file with decisions | Accept/reject annotated features by visual inspection |
| Locate Reads | GUI button | BAM/CRAM + text file (read IDs) | BED6 file with coordinates | Find genomic positions of specific reads |
| AI Chat | GUI button or nanalogue-chat CLI | BAM/CRAM directory + natural-language question | LLM response + optional exported files | Ask complex questions; LLM generates and executes Python |
| nanalogue-sandbox-exec | nanalogue-sandbox-exec CLI | Directory + Python script | Script output (stdout/files) | Run reproducible analysis scripts without LLM involvement |

QC

Quality control analysis of BAM/CRAM/Mod-BAM files. Generates interactive charts covering read lengths, yield, analogue density, modification probabilities, and per-read sequences.

We demonstrate QC with a small BAM file containing simulated sequencing data.

The configuration screen allows setting:

  • BAM/CRAM source (local file or URL)
  • Modification filter (e.g., +T, -m, +a)
  • Genomic region (e.g., chrI:1000-50000)
  • Mod region to restrict modification filtering to a sub-region
  • Read length histogram resolution (1 / 10 / 100 / 1,000 / 10,000 bp)
  • Sample fraction (0.01%--100%) with deterministic seed
  • Window size (2--10,000 bases of interest)
  • Advanced options: MAPQ filters, read type filters, length filters, read ID file, base quality and probability thresholds

QC configuration screen:

QC configuration

QC configuration with a loaded BAM file:

QC configuration with loaded BAM

QC result tabs:

  • Read Lengths: histogram of aligned read lengths with summary statistics
  • Yield Curve: cumulative yield by read count with total yield and N50
  • Analogue Density: whole-read and windowed density histograms with optional range filters
  • Raw Probability: modification probability distribution with optional range filter
  • Sequences: per-read modification highlighting with quality tooltips, row selection, and read ID copy. Insertions and deletions are shown as lowercase bases and '.', respectively.

Read length distribution:

Read Lengths

Yield:

Yield Curve

Analogue density histogram:

Analogue Density

Modification probability distribution:

Please note that this is from a simulated BAM file with artificial data.

Raw Probability

Per-read modification sequences:

Sequences

Swipe

Interactive annotation curation. Displays modification signal plots for each annotation in a BED file, allowing the user to accept or reject each one.

The configuration screen allows setting:

  • BAM/CRAM source (local file or URL)
  • BED annotations file path (a BED file with at least four tab-separated columns: contig, start, end, read name)
  • Output file path
  • Modification filter (e.g., +T, -m)
  • Window size (for windowed density)
  • Flanking region size (base pairs)
  • Annotation highlight visibility
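For reference, a minimal annotations file with the four required tab-separated columns might look like this (coordinates and read names are invented for illustration):

```
chrI	1000	1500	read_abc
chrI	20000	20800	read_def
chrII	5000	5400	read_ghi
```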

Swipe configuration

Controls:

  • Right arrow or Accept button: accept the annotation
  • Left arrow or Reject button: reject the annotation

Reviewing an annotation:

The grey points below are per-coordinate modification probabilities, shown wherever such information is available. The black line is the windowed modification density, i.e. we threshold the modification probabilities and then compute the fraction of modified bases within each window.
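That windowed density calculation can be sketched as follows (the window size, threshold, and probabilities here are made up; the app's actual implementation may differ):

```python
# Windowed modification density: threshold per-base modification
# probabilities, then compute the modified fraction in each window.
def windowed_density(probs, window=5, threshold=0.5):
    """probs: per-base modification probabilities along a read/region."""
    calls = [p >= threshold for p in probs]  # thresholded modification calls
    densities = []
    for start in range(0, len(calls) - window + 1, window):
        win = calls[start:start + window]
        densities.append(sum(win) / window)  # fraction of modified bases
    return densities

# Example: ten bases, two non-overlapping windows of five
print(windowed_density([0.9, 0.8, 0.1, 0.2, 0.7, 0.1, 0.1, 0.9, 0.2, 0.1]))
# → [0.6, 0.2]
```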

Swipe review

Reviewing another annotation:

The grey points and the black line have the same meaning as above.

Swipe review (another annotation)

Locate Reads

Converts a list of read IDs into a BED file by looking up their genomic coordinates in a BAM/CRAM file. Useful for finding where specific reads of interest map in the genome.

The configuration screen allows setting:

  • BAM/CRAM source (local file or URL)
  • Read ID file (plain text, one read ID per line)
  • Region (optional, e.g., chr3 or chrI:1000-50000) to speed up processing
  • Full region checkbox to restrict to reads that completely span the region
  • Output BED file path

Output is tab-separated BED6 (contig, start, end, read_id, score, strand):

chr1	100	600	read_abc	1000	+
chr2	200	700	read_def	1000	-

After generation, a summary shows the number of BED entries written, read IDs not found in the BAM, and unmapped reads excluded from the output.
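The output can be consumed downstream like any BED6 file; for example, a quick parse in Python (the file name is illustrative, the field layout is as documented above):

```python
import csv

def read_bed6(path):
    """Parse a BED6 file (contig, start, end, read_id, score, strand)."""
    entries = []
    with open(path, newline="") as fh:
        for contig, start, end, read_id, score, strand in csv.reader(fh, delimiter="\t"):
            # BED intervals are 0-based, half-open: length = end - start
            entries.append((read_id, contig, int(start), int(end), strand))
    return entries

# Example (path is illustrative):
# for read_id, contig, start, end, strand in read_bed6("located_reads.bed"):
#     print(f"{read_id}: {contig}:{start}-{end} ({end - start} bp, {strand})")
```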

AI Chat

Experimental mode for asking natural-language questions about BAM files. Connects to any OpenAI-compatible API endpoint (local or remote) and queries BAM data in a sandboxed environment. For a detailed explanation of how AI Chat works under the hood, see documentation/ai-chat.md.

AI Chat works with any provider that exposes an OpenAI-compatible v1 API endpoint. This includes cloud providers like OpenAI, Anthropic, Google Gemini, Mistral, Together AI, Fireworks, and OpenRouter, as well as local runners like Ollama and LM Studio.

Setting up a provider

To use AI Chat, you need three things: an API endpoint URL, an API key (if the provider requires one), and a model name. Any provider that supports the OpenAI v1 chat completions protocol (POST /v1/chat/completions) will work. Please note that unless you or your organization are running a local LLM, you will likely be charged per request.
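To make the protocol concrete, this is roughly the request shape involved (a sketch of the OpenAI v1 chat completions format; the endpoint, key, and model below are placeholders, not values this project requires):

```python
import json
import urllib.request

endpoint = "http://localhost:11434/v1"  # placeholder: your provider's base URL
payload = {
    "model": "your-model-name",         # placeholder model name
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello"},
    ],
}
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer YOUR_API_KEY",  # omit for keyless local runners
}

# Sending the request (commented out; requires a running provider):
# req = urllib.request.Request(
#     f"{endpoint}/chat/completions",
#     data=json.dumps(payload).encode(),
#     headers=headers,
# )
# reply = json.loads(urllib.request.urlopen(req).read())
# print(reply["choices"][0]["message"]["content"])
```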

We have tested the following providers, either manually or through automated tests in this repository.

| Provider | Endpoint URL | API key | How to get started |
|----------|--------------|---------|--------------------|
| OpenAI | https://api.openai.com/v1 | required | Create an account and generate an API key at platform.openai.com/api-keys |
| Anthropic | https://api.anthropic.com/v1/ | required | Create an account and generate an API key at console.anthropic.com |
| Google | https://generativelanguage.googleapis.com/v1beta | required | Create an account and generate an API key at aistudio.google.com/apikey |

You can also start an LLM yourself with a package like llama.cpp and point AI Chat at its URL. In this scenario, you or your organization run the LLM. For example, the command below starts a small Qwen LLM and exposes it at http://url_of_the_computer:8000. You have to install llama.cpp, configure its parameters, choose a model, and so on. The command is for illustrative purposes only; the flags may have changed since the time of writing. A bigger model on a computer with better hardware will generally give better results. For a step-by-step guide covering installation, model selection, and recommended flags, see documentation/local-llm-setup.md.

./build/bin/llama-server \
      -hf Qwen/Qwen3-8B-GGUF:Q4_K_M \
      --jinja \
      -ngl 99 \
      -c 32768 \
      -n 8192 \
      --host 0.0.0.0 \
      --port 8000

Below are other LLM providers/setups that should work but that we have not tested ourselves.

| Provider | Endpoint URL | API key | How to get started |
|----------|--------------|---------|--------------------|
| Ollama (local) | http://localhost:11434/v1 | not required | Install Ollama, then ollama pull <model> |
| LM Studio (local) | http://localhost:1234/v1 | not required | Download a model from the LM Studio UI |
| OpenRouter | https://openrouter.ai/api/v1 | required | Sign up and create a key at openrouter.ai/keys; gives access to many models |
| Together AI | https://api.together.xyz/v1 | required | Sign up and create a key in the dashboard |
| Fireworks | https://api.fireworks.ai/inference/v1 | required | Sign up and create a key in the dashboard |

Choosing a model

Use the Fetch Models button after entering your endpoint URL and API key to see which models are available from your provider. For best results, choose a model that supports returning Python code, since AI Chat relies on Python code run in a sandbox to query your BAM data. Larger models generally give better answers but respond more slowly; smaller models are faster but may struggle with complex questions.

The configuration screen allows setting:

  • BAM directory path
  • API endpoint URL (defaults to http://localhost:11434/v1 for Ollama)
  • API key (if your provider requires one)
  • Model name (with Fetch Models button for auto-discovery)
  • Advanced options including sandboxed code execution (see documentation/advanced-options.md for a full reference)

NOTE: Context length and other parameters vary by provider and model. If possible, look up your model's parameters, such as its context length, and set them via the Advanced options link shown in the screenshots below. For example, the GPT-5.2 model uses a 400K context window whereas DeepSeek V2 Lite uses a 32K context window. A larger context window means the model can remember more of your chat and respond to longer prompts; whether this matters depends on how you use the chat feature.

Using the AI chat GUI

AI Chat with a connected endpoint:

AI Chat with connected endpoint

Asking a question about BAM data:

AI Chat question and response

Multi-turn conversation:

AI Chat multi-turn conversation

Display sandboxed code:

We use a Python sandbox to receive code from the LLM and execute it. The sandbox uses the Monty package from Pydantic to run Python code securely, so that it has access only to your files and to specific functions. This keeps the sandbox secure and lets you inspect what the LLM is doing. A copy button in the code panel lets you copy the Python code to your clipboard.

AI Chat multi-turn conversation with sandbox

Inspect LLM instructions:

You can dump the full request payload sent to the LLM (system prompt and conversation history) by typing /dump_llm_instructions in the chat input or CLI REPL. The output is written as plain text to ai_chat_output/ inside your BAM directory. At least one message must have been sent to the LLM before the dump command will produce output.

Inspect the system prompt:

You can dump the static system prompt (the portion of the LLM instructions that describes the sandbox capabilities, without the dynamic conversation history) by clicking the View System Prompt button in the GUI, or by typing /dump_system_prompt in the CLI REPL. The output is written to ai_chat_output/ inside your BAM directory and is available at any point, even before the first message has been sent. The button also shows a rough token count (~N tokens) for the prompt.

Customise the system prompt:

Place a SYSTEM_APPEND.md file in your BAM directory to append additional instructions to the default system prompt. The file is loaded once at session start and its content is inserted after the built-in sandbox instructions. Use this for small amounts of domain-specific context — for example, organism background, project-specific conventions, or guidance on which modification types to focus on. Files larger than 64 KB are silently ignored. Use /dump_system_prompt to verify the full effective prompt after loading. This feature is available in both the GUI and the CLI.
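For example, a SYSTEM_APPEND.md might contain something like the following (the organism, modification, and conventions here are invented for illustration):

```
These BAM files are nanopore reads from S. cerevisiae (sacCer3 reference).
We are interested in BrdU incorporation, so focus on the +T modification
unless the question says otherwise. Report positions as contig:start-end.
```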

Run your own Python scripts:

You can bypass the LLM entirely and run a Python file directly in the sandbox using the /exec command (available in both the GUI chat input and the CLI REPL). The file must be a .py file inside your BAM directory. The script runs with the same sandbox permissions as LLM-generated code (access to your BAM files and the built-in helper functions), but the results are not sent to the LLM conversation.
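To give a flavour of what such a script might look like, here is a self-contained sketch that summarises read lengths. A real script would obtain its data through the built-in helpers (e.g. read_info), whose call signatures are not shown in this README, so we use a hard-coded list instead:

```python
# Illustrative sandbox-style script: summarise read lengths.
# A real script would fetch data via the built-in helper functions
# (e.g. read_info) rather than this hard-coded list.
read_lengths = [5120, 8300, 12450, 2100, 9800]

total = sum(read_lengths)
mean = total / len(read_lengths)

# N50: the length at which half the total yield is in reads this long or longer
half = total / 2
running = 0
n50 = None
for length in sorted(read_lengths, reverse=True):
    running += length
    if running >= half:
        n50 = length
        break

print(f"reads={len(read_lengths)} yield={total} mean={mean:.0f} N50={n50}")
# prints: reads=5 yield=37770 mean=7554 N50=9800
```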

CLI (nanalogue-chat)

The same AI Chat analysis engine is available as a standalone terminal REPL, with no Electron or GUI needed.

End-users (installed via npm install -g nanalogue-gui):

nanalogue-chat --endpoint <url> --model <name> --dir <path>

Developers (built from source):

node dist/cli.mjs --endpoint <url> --model <name> --dir <path>

The remaining examples below use nanalogue-chat; developers should substitute node dist/cli.mjs.

For example, to chat about BAM files in ./data using a local Ollama model:

nanalogue-chat --endpoint http://localhost:11434/v1 --model llama3 --dir ./data

Authentication: pass your API key with --api-key <key>, or set the $API_KEY environment variable. Local runners like Ollama do not require a key.

Model discovery: use --list-models with an endpoint to see which models are available:

nanalogue-chat --endpoint https://api.openai.com/v1 --api-key $API_KEY --list-models

Non-interactive mode: use --non-interactive "<message>" to send a single prompt, print the response, and exit — useful for scripting. You can combine this with a custom SYSTEM_APPEND.md to describe complex tasks to the LLM (see above).

nanalogue-chat --endpoint <url> --model <name> --dir <path> \
    --non-interactive "What is the average read length?"

System prompt customisation: use --system-prompt "<text>" to replace the built-in sandbox prompt (the SYSTEM_APPEND.md file and facts array are still appended to the prompt). Use --dump-llm-instructions --non-interactive "<msg>" to write the full LLM request payload (system prompt + conversation) to a dated log file in ai_chat_output/.

Remove tools: use --rm-tools "tool1,tool2" to disable specific external functions (e.g. --rm-tools "write_file,read_file") — useful for restricting sandbox capabilities.

REPL commands:

| Command | Action |
|---------|--------|
| /new | Start a new conversation |
| /exec <file.py> | Run a Python file directly in the sandbox without sending it to the LLM |
| /dump_llm_instructions | Dump the full LLM request payload from the last round to a log file in ai_chat_output/ |
| /dump_system_prompt | Dump the static system prompt to a log file in ai_chat_output/ |
| /quit | Exit the CLI |
| Ctrl+C during a request | Cancel the current request |
| Ctrl+C at the prompt | Exit |

For the full list of providers and endpoint URLs, see the Setting up a provider table above.

Disabling colour output: set NO_COLOR=1 to suppress ANSI colour codes — useful when piping output to a file or script:

NO_COLOR=1 nanalogue-chat --endpoint <url> --model <name> --dir <path>

Advanced options (context window size, timeouts, record limits) are available via flags — run nanalogue-chat --help for a quick reference, or see documentation/advanced-options.md for the full reference including GUI equivalents.

CLI (nanalogue-sandbox-exec)

nanalogue-sandbox-exec runs a Python script directly in the Monty sandbox without involving an LLM. It has access to the same BAM helper functions as AI Chat (e.g. read_info, bam_mods, write_file) and the same security model (access restricted to the --dir directory), but the script and its output are never sent to a language model. This makes it useful for batch processing, reproducible analyses, and CI pipelines where LLM involvement is not needed.

End-users (installed via npm install -g nanalogue-gui):

nanalogue-sandbox-exec --dir <path> <script.py>

Developers (built from source):

node dist/execute-cli.mjs --dir <path> <script.py>

--dir is required and must be the directory containing your BAM files. The script path is resolved relative to the current working directory.

Output is written to stdout; errors are written to stderr. Exit codes:

  • 0 — sandbox ran successfully (stdout may be empty for silent scripts)
  • 1 — bad arguments, file not found, or sandbox error

Example — run an analysis script and capture its output:

nanalogue-sandbox-exec --dir ./data analyse.py > results.txt
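If you are driving the CLI from another program, the exit codes above can be checked programmatically. A minimal Python wrapper sketch (the wrapper itself is ours; only the CLI name and flags come from this README):

```python
import subprocess

def run_sandbox(script, data_dir, exe="nanalogue-sandbox-exec"):
    """Run `script` in the sandbox on `data_dir`; return (code, stdout, stderr)."""
    result = subprocess.run(
        [exe, "--dir", data_dir, script],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout, result.stderr

# Example (assumes the CLI is installed and ./data holds your BAM files):
# code, out, err = run_sandbox("analyse.py", "./data")
# if code == 0:
#     print(out)                    # success (stdout may be empty)
# else:
#     print("sandbox error:", err)  # bad args, missing file, or sandbox error
```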

Sandbox limits and output size can be tuned with flags — run nanalogue-sandbox-exec --help for a quick reference, or see documentation/advanced-options.md for the full reference.

Development

# Build the project (produces dist/main.js, dist/renderer.js, dist/cli.mjs, and dist/execute-cli.mjs)
npm run build

# Run in development mode
npm run dev

# Run tests
npm test

# Lint (Biome, ESLint, Stylelint, html-validate)
npm run lint

# Auto-fix linting issues
npm run lint:fix

# TypeScript type checking
npx tsc --noEmit

Development environment:

  • Node.js >= 22 is required to run the app
  • Development and CI use Node 22 as the minimum version; Node 24 is also tested
  • The package-lock.json is generated with Node 22's npm (v10) via a pre-commit hook, so npm ci works consistently across Node versions
  • nvm (https://github.com/nvm-sh/nvm) is required for development — the pre-commit hook uses nvm exec 22 to keep the lock file compatible

These development environment notes are only relevant if you intend to modify the code or add features; if you just want to install and run the app, see the Installation section above.

The documentation/code-explainers/ folder contains short write-ups explaining how specific parts of the codebase work — useful if you want to understand a feature before diving into the source.

For detailed testing guidance (how to run tests, coverage enforcement, mocking patterns), see documentation/testing.md.

Versioning

We use Semantic Versioning.

Current Status: Pre-1.0 (0.x.y)

While in 0.x.y versions:

  • The API may change without notice
  • Breaking changes can occur in minor version updates

After 1.0.0, we will guarantee backwards compatibility in minor/patch releases.

Changelog

See CHANGELOG.md for version history.

License

MIT License - see LICENSE for details.

Acknowledgments

This software was developed at the Earlham Institute in the UK. This work was supported by the Biotechnology and Biological Sciences Research Council (BBSRC), part of UK Research and Innovation, through the Core Capability Grant BB/CCG2220/1 at the Earlham Institute and the Earlham Institute Strategic Programme Grant Cellular Genomics BBX011070/1 and its constituent work packages BBS/E/ER/230001B (CellGen WP2 Consequences of somatic genome variation on traits). The work was also supported by the following response-mode project grants: BB/W006014/1 (Single molecule detection of DNA replication errors) and BB/Y00549X/1 (Single molecule analysis of Human DNA replication). This research was supported in part by NBI Research Computing through use of the High-Performance Computing system and Isilon storage.
