Electron GUI for Nanalogue: interactive sequence data analysis and curation with a focus on single molecules and DNA/RNA modifications.
Nanalogue-gui provides a desktop application for working with BAM/CRAM/Mod-BAM files, with a focus on single-molecule DNA/RNA modifications. It builds on @nanalogue/node to provide interactive visualisation and curation workflows.
Nanalogue-gui is part of the Nanalogue family of tools for nanopore data analysis. Sister packages include nanalogue (Rust CLI and library), pynanalogue (Python wrapper), and @nanalogue/node (Node.js bindings).
- Node.js 22 or higher
You can download the release binaries from GitHub. Please look at the binaries attached to each release and download the one for your platform (macOS/Linux). For Windows, we recommend running our Linux binary under the Windows Subsystem for Linux (WSL). Please see this link or equivalent to learn about WSL.
NOTE: On macOS, if you download the binary, you may have to dismiss a Gatekeeper warning saying the developer is unknown. Avoiding this warning would require us to enrol in the Apple Developer Program, which we have chosen not to do. Please note that this project is open source, so you are free to inspect the source code here. You can always build from source to avoid such warnings.
To build from source, you will first need utilities like npm and git installed. Then run:
```
git clone https://github.com/sathish-t/nanalogue-gui.git
cd nanalogue-gui
npm install
```

Alternatively, if you only want the nanalogue-chat CLI (no Electron GUI), clone the repo, build, and link it globally:

```
git clone https://github.com/sathish-t/nanalogue-gui.git
cd nanalogue-gui
npm install
npm run build
npm link
```

This puts the nanalogue-chat command on your PATH from any directory. See the CLI section for usage.
If you have installed the app from a binary, just launch the binary like you normally would. If you built it yourself, you can launch it from the command line:

```
npm start
```

This launches the landing page where you can choose between QC, Swipe, Locate Reads, and AI Chat modes. Three font-size buttons (small/medium/large) in the header scale all text in the app, including chart labels and legends.
| Mode | Entry point | Input | Output | Key use case |
|---|---|---|---|---|
| QC | GUI button | BAM/CRAM/Mod-BAM file or URL | Interactive charts (no file output) | Assess read quality, length distribution, modification patterns |
| Swipe | GUI button | BAM/CRAM + BED file (annotations) | Modified BED file with decisions | Accept/reject annotated features by visual inspection |
| Locate Reads | GUI button | BAM/CRAM + text file (read IDs) | BED6 file with coordinates | Find genomic positions of specific reads |
| AI Chat | GUI button or nanalogue-chat CLI | BAM/CRAM directory + natural language question | LLM response + optional exported files | Ask complex questions; LLM generates and executes Python |
| nanalogue-sandbox-exec | nanalogue-sandbox-exec CLI | Directory + Python script | Script output (stdout/files) | Run reproducible analysis scripts without LLM involvement |
Quality control analysis of BAM/CRAM/Mod-BAM files. Generates interactive charts covering read lengths, yield, analogue density, modification probabilities, and per-read sequences.
We demonstrate QC with a small BAM file containing simulated sequencing data.
The configuration screen allows setting:
- BAM/CRAM source (local file or URL)
- Modification filter (e.g., `+T,-m,+a`)
- Genomic region (e.g., `chrI:1000-50000`)
- Mod region to restrict modification filtering to a sub-region
- Read length histogram resolution (1 / 10 / 100 / 1,000 / 10,000 bp)
- Sample fraction (0.01%–100%) with deterministic seed
- Window size (2–10,000 bases of interest)
- Advanced options: MAPQ filters, read type filters, length filters, read ID file, base quality and probability thresholds
QC configuration screen:
QC configuration with a loaded BAM file:
QC result tabs:
- Read Lengths: histogram of aligned read lengths with summary statistics
- Yield Curve: cumulative yield by read count with total yield and N50
- Analogue Density: whole-read and windowed density histograms with optional range filters
- Raw Probability: modification probability distribution with optional range filter
- Sequences: per-read modification highlighting with quality tooltips, row selection, and read ID copy. Insertions and deletions are shown as lowercase bases and '.', respectively.
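As an aside, the N50 reported in the Yield Curve tab can be computed from the read lengths alone. A minimal sketch (not the app's actual implementation):

```python
def n50(lengths):
    """Return the N50: the length L such that reads of length >= L
    together account for at least half of the total yield."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0

# Total yield is 10 kb; half is reached once the 4 kb and 3 kb reads are counted.
print(n50([1000, 2000, 3000, 4000]))  # → 3000
```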
Read length distribution:
Yield:
Analogue density histogram:
Modification probability distribution:
Please note that this is from a simulated BAM file with artificial data.
Per-read modification sequences:
Interactive annotation curation. Displays modification signal plots for each annotation in a BED file, allowing the user to accept or reject each one.
The configuration screen allows setting:
- BAM/CRAM source (local file or URL)
- BED annotations file path (You need a BED file with at least four tab-separated columns: contig, start, end, read name)
- Output file path
- Modification filter (e.g., `+T,-m`)
- Window size (for windowed density)
- Flanking region size (base pairs)
- Annotation highlight visibility
Controls:
- Right arrow or Accept button: accept the annotation
- Left arrow or Reject button: reject the annotation
Reviewing an annotation:
The grey points below are modification probabilities at each genomic coordinate for which such information is available. The black line is the windowed modification density, i.e., we threshold the modification probability and then ask what fraction of bases within each window are modified.
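The thresholding step can be sketched as follows; the window size and threshold here are illustrative, not the app's defaults:

```python
def windowed_density(probs, window=5, threshold=0.5):
    """Fraction of bases called modified (prob >= threshold) in each
    non-overlapping window of per-base modification probabilities."""
    densities = []
    for start in range(0, len(probs) - window + 1, window):
        chunk = probs[start:start + window]
        modified = sum(1 for p in chunk if p >= threshold)
        densities.append(modified / window)
    return densities

probs = [0.9, 0.8, 0.1, 0.2, 0.7, 0.6, 0.1, 0.1, 0.1, 0.9]
print(windowed_density(probs))  # → [0.6, 0.4]
```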
Reviewing another annotation:
The grey points and black line mean the same as above.
Converts a list of read IDs into a BED file by looking up their genomic coordinates in a BAM/CRAM file. Useful for finding where specific reads of interest map in the genome.
The configuration screen allows setting:
- BAM/CRAM source (local file or URL)
- Read ID file (plain text, one read ID per line)
- Region (optional, e.g., `chr3` or `chrI:1000-50000`) to speed up processing
- Full region checkbox to restrict to reads that completely span the region
- Output BED file path
Output is tab-separated BED6 (contig, start, end, read_id, score, strand):

```
chr1    100    600    read_abc    1000    +
chr2    200    700    read_def    1000    -
```
After generation, a summary shows the number of BED entries written, read IDs not found in the BAM, and unmapped reads excluded from the output.
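If you want to post-process the output, the BED6 lines are easy to read back into Python; a minimal sketch (field names follow the column list above):

```python
def parse_bed6(lines):
    """Parse tab-separated BED6 lines into dicts with typed coordinates."""
    fields = ("contig", "start", "end", "read_id", "score", "strand")
    records = []
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        rec = dict(zip(fields, parts))
        rec["start"], rec["end"] = int(rec["start"]), int(rec["end"])
        records.append(rec)
    return records

rows = parse_bed6(["chr1\t100\t600\tread_abc\t1000\t+"])
print(rows[0]["read_id"], rows[0]["end"] - rows[0]["start"])  # → read_abc 500
```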
Experimental mode for asking natural-language questions about BAM files. Connects to any OpenAI-compatible API endpoint (local or remote) and queries BAM data in a sandboxed environment. For a detailed explanation of how AI Chat works under the hood, see documentation/ai-chat.md.
AI Chat works with any provider that exposes an OpenAI-compatible v1 API endpoint. This includes cloud providers like OpenAI, Anthropic, Google Gemini, Mistral, Together AI, Fireworks, and OpenRouter, as well as local runners like Ollama and LM Studio.
To use AI Chat, you need three things: an API endpoint URL, an API key
(if the provider requires one), and a model name.
Any other provider that supports the OpenAI v1 chat completions protocol
(POST /v1/chat/completions) will also work.
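To make the protocol concrete, here is a sketch of the JSON body such a request carries. This is purely illustrative: the model name and question are placeholders, and the app builds this payload for you.

```python
import json

def build_chat_request(model, user_message, system_prompt=None):
    """Build a minimal OpenAI-compatible chat completions payload."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_message})
    return {"model": model, "messages": messages}

# POST this JSON to <endpoint>/chat/completions, adding an
# "Authorization: Bearer <api-key>" header if your provider needs one.
payload = build_chat_request("llama3", "What is the average read length?")
print(json.dumps(payload, indent=2))
```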
Please note that unless you or your organization are running a local LLM, you will likely be charged per request to these endpoints.
We have tested the following providers, either manually or automatically, in this repository.
| Provider | Endpoint URL | API key | How to get started |
|---|---|---|---|
| OpenAI | https://api.openai.com/v1 | required | Create an account and generate an API key at platform.openai.com/api-keys |
| Anthropic | https://api.anthropic.com/v1/ | required | Create an account and generate an API key at console.anthropic.com |
| Google Gemini | https://generativelanguage.googleapis.com/v1beta | required | Create an account and generate an API key at aistudio.google.com/apikey |
You can also start an LLM yourself using a package like llama.cpp and then use its URL.
In this scenario, you or your organization run the LLM yourselves.
For example, the command below starts a small Qwen LLM and exposes it at http://url_of_the_computer:8000.
You have to install llama.cpp, configure its parameters, choose a model, and so on.
The command is for illustrative purposes only: the flags may have changed since the time of writing.
A bigger model on a computer with better hardware specs will generally give
better results. For a step-by-step guide covering installation, model
selection, and recommended flags, see
documentation/local-llm-setup.md.
```
./build/bin/llama-server \
  -hf Qwen/Qwen3-8B-GGUF:Q4_K_M \
  --jinja \
  -ngl 99 \
  -c 32768 \
  -n 8192 \
  --host 0.0.0.0 \
  --port 8000
```

These are other suggestions for LLM providers/setups that we haven't tested ourselves.
| Provider | Endpoint URL | API key | How to get started |
|---|---|---|---|
| Ollama (local) | http://localhost:11434/v1 | not required | Install Ollama, then ollama pull <model> |
| LM Studio (local) | http://localhost:1234/v1 | not required | Download a model from the LM Studio UI |
| OpenRouter | https://openrouter.ai/api/v1 | required | Sign up and create a key at openrouter.ai/keys; gives access to many models |
| Together AI | https://api.together.xyz/v1 | required | Sign up and create a key in the dashboard |
| Fireworks | https://api.fireworks.ai/inference/v1 | required | Sign up and create a key in the dashboard |
Use the Fetch Models button after entering your endpoint URL and API key to see which models are available from your provider. For best results, choose a model that supports returning Python code, since AI Chat relies on Python code run in a sandbox to query your BAM data. Larger models generally give better answers but respond more slowly; smaller models are faster but may struggle with complex questions.
The configuration screen allows setting:
- BAM directory path
- API endpoint URL (defaults to `http://localhost:11434/v1` for Ollama)
- API key (if your provider requires one)
- Model name (with Fetch Models button for auto-discovery)
- Advanced options including sandboxed code execution (see documentation/advanced-options.md for a full reference)
NOTE: Depending on the provider and model you choose, parameters such as context length may differ. If possible, look up your model's parameters and adjust them using the Advanced options link in the screenshots below. For example, the GPT-5.2 model uses a 400K context window (see here) whereas DeepSeek V2 Lite uses a 32K context window (see here). A larger context window means the model can remember more of your chat and can respond to longer prompts. Whether this is relevant for you depends on how you use the chat feature.
AI Chat with a connected endpoint:
Asking a question about BAM data:
Multi-turn conversation:
Display sandboxed code:
We use a Python sandbox to receive code from the LLM and execute it. The sandbox uses the Monty package from Pydantic to run Python code securely, with access only to our files and to specific functions. This keeps the sandbox secure and lets us inspect what the LLM is doing. A copy button in the code panel lets you copy the Python code to your clipboard.
Inspect LLM instructions:
You can dump the full request payload sent to the LLM (system prompt and
conversation history) by typing /dump_llm_instructions in the chat input
or CLI REPL. The output is written as plain text to
ai_chat_output/ inside your BAM directory. At least one message must have
been sent to the LLM before the dump command will produce output.
Inspect the system prompt:
You can dump the static system prompt (the portion of the LLM instructions
that describes the sandbox capabilities, without the dynamic conversation
history) by clicking the View System Prompt button in the GUI, or by
typing /dump_system_prompt in the CLI REPL. The output is written to
ai_chat_output/ inside your BAM directory and is available at any point,
even before the first message has been sent. The button also shows a rough
token count (~N tokens) for the prompt.
Customise the system prompt:
Place a SYSTEM_APPEND.md file in your BAM directory to append additional
instructions to the default system prompt. The file is loaded once at session
start and its content is inserted after the built-in sandbox instructions.
Use this for small amounts of domain-specific context — for example, organism
background, project-specific conventions, or guidance on which modification
types to focus on. Files larger than 64 KB are silently ignored. Use
/dump_system_prompt to verify the full effective prompt after loading.
This feature is available in both the GUI and the CLI.
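For illustration, a hypothetical SYSTEM_APPEND.md (every detail below is invented for the example) might read:

```
The BAM files in this directory come from nanopore runs on S. cerevisiae.
Unless a question says otherwise, focus on BrdU calls (the +T filter)
and report read lengths in kilobases.
```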
Run your own Python scripts:
You can bypass the LLM entirely and run a Python file directly in the sandbox
using the /exec command (available in both the GUI chat input and the CLI
REPL). The file must be a .py file inside your BAM directory. The script
runs with the same sandbox permissions as LLM-generated code (access to your
BAM files and the built-in helper functions), but the results are not sent to
the LLM conversation.
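A sketch of such a script. The read_info helper is assumed to be available inside the sandbox; its exact signature and return shape are assumptions here, and the fallback sample data lets the same script also run standalone.

```python
def mean_read_length(lengths):
    """Average of a list of read lengths (0.0 for an empty list)."""
    return sum(lengths) / len(lengths) if lengths else 0.0

try:
    # Inside the sandbox: read_info is assumed to yield per-read records
    # with a "length" field (hypothetical shape, for illustration only).
    lengths = [r["length"] for r in read_info()]
except NameError:
    # Standalone fallback so the script also runs outside the sandbox.
    lengths = [500, 1500, 3000]

print(f"mean read length: {mean_read_length(lengths):.1f}")
```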
The same AI Chat analysis engine is available as a standalone terminal REPL, with no Electron or GUI needed.
End-users (installed via npm install -g nanalogue-gui):

```
nanalogue-chat --endpoint <url> --model <name> --dir <path>
```

Developers (built from source):

```
node dist/cli.mjs --endpoint <url> --model <name> --dir <path>
```

The remaining examples below use nanalogue-chat; developers substitute node dist/cli.mjs.
For example, to chat about BAM files in ./data using a local Ollama model:

```
nanalogue-chat --endpoint http://localhost:11434/v1 --model llama3 --dir ./data
```

Authentication: pass your API key with --api-key <key>, or set the $API_KEY environment variable. Local runners like Ollama do not require a key.
Model discovery: use --list-models with an endpoint to see which models are available:

```
nanalogue-chat --endpoint https://api.openai.com/v1 --api-key $API_KEY --list-models
```

Non-interactive mode: use --non-interactive "<message>" to send a single prompt, print the response, and exit — useful for scripting. You can combine this with a custom SYSTEM_APPEND.md to describe complex tasks to the LLM (see above).
```
nanalogue-chat --endpoint <url> --model <name> --dir <path> \
  --non-interactive "What is the average read length?"
```

System prompt customisation: use --system-prompt "<text>" to replace the built-in sandbox prompt (the SYSTEM_APPEND.md file and facts array are still appended to the prompt). Use --dump-llm-instructions --non-interactive "<msg>" to write the full LLM request payload (system prompt + conversation) to a dated log file in ai_chat_output/.
Remove tools: use --rm-tools "tool1,tool2" to disable specific external
functions (e.g. --rm-tools "write_file,read_file") — useful for restricting
sandbox capabilities.
REPL commands:

| Command | Action |
|---|---|
| /new | Start a new conversation |
| /exec <file.py> | Run a Python file directly in the sandbox without sending it to the LLM |
| /dump_llm_instructions | Dump the full LLM request payload from the last round to a log file in ai_chat_output/ |
| /dump_system_prompt | Dump the static system prompt to a log file in ai_chat_output/ |
| /quit | Exit the CLI |
| Ctrl+C during a request | Cancel the current request |
| Ctrl+C at the prompt | Exit |
For the full list of providers and endpoint URLs, see the Setting up a provider table above.
Disabling colour output: set NO_COLOR=1 to suppress ANSI colour codes — useful when piping output to a file or script:

```
NO_COLOR=1 nanalogue-chat --endpoint <url> --model <name> --dir <path>
```

Advanced options (context window size, timeouts, record limits) are available via flags — run nanalogue-chat --help for a quick reference, or see documentation/advanced-options.md for the full reference including GUI equivalents.
nanalogue-sandbox-exec runs a Python script directly in the Monty sandbox
without involving an LLM. It has access to the same BAM helper functions as
AI Chat (e.g. read_info, bam_mods, write_file) and the same security
model (access restricted to the --dir directory), but the script and its
output are never sent to a language model. This makes it useful for batch
processing, reproducible analyses, and CI pipelines where LLM involvement is
not needed.
End-users (installed via npm install -g nanalogue-gui):

```
nanalogue-sandbox-exec --dir <path> <script.py>
```

Developers (built from source):

```
node dist/execute-cli.mjs --dir <path> <script.py>
```

--dir is required and must be the directory containing your BAM files. The script path is resolved relative to the current working directory. Output is written to stdout; errors are written to stderr. Exit codes:

- `0` — sandbox ran successfully (stdout may be empty for silent scripts)
- `1` — bad arguments, file not found, or sandbox error
Example — run an analysis script and capture its output:

```
nanalogue-sandbox-exec --dir ./data analyse.py > results.txt
```

Sandbox limits and output size can be tuned with flags — run nanalogue-sandbox-exec --help for a quick reference, or see documentation/advanced-options.md for the full reference.
```
# Build the project (produces dist/main.js, dist/renderer.js, dist/cli.mjs, and dist/execute-cli.mjs)
npm run build

# Run in development mode
npm run dev

# Run tests
npm test

# Lint (Biome, ESLint, Stylelint, html-validate)
npm run lint

# Auto-fix linting issues
npm run lint:fix

# TypeScript type checking
npx tsc --noEmit
```

Development environment:
- Node.js >= 22 is required to run the app
- Development and CI use Node 22 as the minimum version; Node 24 is also tested
- The package-lock.json is generated with Node 22's npm (v10) via a pre-commit hook, so npm ci works consistently across Node versions
- https://github.com/nvm-sh/nvm is required for development — the pre-commit hook uses nvm exec 22 to keep the lock file compatible
Note that the development environment notes above are only relevant if you intend to modify the code or add features; they are not relevant if you just want to install and run the app. In that case, go straight to the installation section of this document.
The documentation/code-explainers/ folder contains short write-ups explaining
how specific parts of the codebase work — useful if you want to understand a
feature before diving into the source.
For detailed testing guidance (how to run tests, coverage enforcement, mocking
patterns), see documentation/testing.md.
We use Semantic Versioning.
Current Status: Pre-1.0 (0.x.y)
While in 0.x.y versions:
- The API may change without notice
- Breaking changes can occur in minor version updates
After 1.0.0, we will guarantee backwards compatibility in minor/patch releases.
See CHANGELOG.md for version history.
MIT License - see LICENSE for details.
This software was developed at the Earlham Institute in the UK. This work was supported by the Biotechnology and Biological Sciences Research Council (BBSRC), part of UK Research and Innovation, through the Core Capability Grant BB/CCG2220/1 at the Earlham Institute and the Earlham Institute Strategic Programme Grant Cellular Genomics BBX011070/1 and its constituent work packages BBS/E/ER/230001B (CellGen WP2 Consequences of somatic genome variation on traits). The work was also supported by the following response-mode project grants: BB/W006014/1 (Single molecule detection of DNA replication errors) and BB/Y00549X/1 (Single molecule analysis of Human DNA replication). This research was supported in part by NBI Research Computing through use of the High-Performance Computing system and Isilon storage.