Commit 846b849

Update investigation with openms-insight research
Add comprehensive documentation of openms-insight (t0mdavid-m/openms-insight): Vue.js-based Streamlit custom components for interactive MS data visualization with cross-component linked selection, multi-resolution downsampling, server-side pagination, and automatic disk caching. Includes visualization decision guide comparing pyopenms-viz vs openms-insight use cases. https://claude.ai/code/session_011dutf865eEBg2bXnbBa7N9
1 parent 4ddcd22 commit 846b849

1 file changed, +86 -15 lines changed

docs/webapp-agent-investigation.md

Lines changed: 86 additions & 15 deletions
@@ -84,8 +84,54 @@ df.plot(kind="mobilogram") # Ion mobility
 ```
 Outputs standard Plotly/Bokeh/matplotlib figures compatible with `st.plotly_chart()`.
 
-### openms-insight
-Not found as a public repository or package. May be planned/unreleased or known by a different name. The closest analogues are the existing OpenMS WebApps (TOPPView Lite, StreamSage, UmetaFlow, etc.). **Clarification from the team is needed** on what this refers to.
+### openms-insight (v0.1.13+)
+[GitHub: t0mdavid-m/openms-insight](https://github.com/t0mdavid-m/openms-insight) — A Python library providing **interactive Vue.js-based Streamlit custom components** for mass spectrometry data visualization. Created by Tom David Muller (Kohlbacher Lab, co-lead author of the OpenMS WebApps paper). Installable via `pip install openms-insight`.
+
+**Five visualization components:**
+
+| Component | Technology | Purpose |
+|-----------|-----------|---------|
+| **Table** | Tabulator.js | Server-side paginated, filterable, sortable tables with CSV export and custom formatters (scientific, signed, badge) |
+| **LinePlot** | Plotly.js | Stick-style mass spectrum visualization with peak highlighting, selection, annotations, and SVG export |
+| **Heatmap** | Plotly scattergl | 2D scatter plots with multi-resolution cascading downsampling, categorical coloring, zoom-based level selection |
+| **VolcanoPlot** | Plotly.js | Differential expression visualization with adjustable significance thresholds, three-category coloring |
+| **SequenceView** | Custom Vue | Peptide sequence display with fragment ion matching (uses pyopenms `TheoreticalSpectrumGenerator`), amino acid modification rendering |
+
+**Key architectural features:**
+- **Cross-component linked selection** via `StateManager` — clicking a row in a Table highlights the corresponding point in a Heatmap or LinePlot
+- **Declarative filter/interactivity mapping** — components declare `filters={"key": "column"}` and `interactivity={"key": "column"}` for linkage
+- **Multi-resolution downsampling** — cascading spatial binning for million-point heatmaps (smooth zooming)
+- **Server-side pagination** — only current page sent to browser, enabling millions-of-rows tables
+- **Subprocess preprocessing** — heavy computation in spawned processes so memory is freed
+- **Automatic disk caching** — preprocessed data saved to Parquet with config-hash invalidation
+- **Cache reconstruction** — components reinstantiated from cache without re-specifying data
+
+**Tech stack:** Python + Polars (backend preprocessing) → Vue 3 + Pinia + Vuetify + Plotly.js + Tabulator.js (frontend)
+
+**Usage pattern:**
+```python
+from openms_insight import Table, Heatmap, LinePlot, StateManager
+import polars as pl
+
+data = pl.scan_parquet("features.parquet")
+state = StateManager()
+
+table = Table(
+    cache_id="feature_table", data=data,
+    filters={"selected": "feature_id"},
+    interactivity={"selected": "feature_id"},
+)
+heatmap = Heatmap(
+    cache_id="feature_map", data=data,
+    filters={"selected": "feature_id"},
+    x_col="RT", y_col="mz", value_col="intensity",
+)
+
+table()  # Render table
+heatmap()  # Render heatmap — linked to table selections
+```
+
+**Relationship to pyopenms-viz:** While pyopenms-viz provides single-line DataFrame plotting (matplotlib/Plotly/Bokeh backends), openms-insight provides richer interactive components with cross-component state, server-side pagination, and caching. They are complementary — pyopenms-viz for quick static/simple interactive plots, openms-insight for complex interactive dashboards with large datasets.
 
 ---
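
Editor's note on "config-hash invalidation" from the hunk above: the diff does not show how openms-insight computes or stores its cache keys, so the following is only a minimal sketch of the general idea. The function names and the `<cache_id>-<hash>.parquet` file layout are hypothetical and are not taken from the library.

```python
import hashlib
import json
from pathlib import Path


def config_hash(config: dict) -> str:
    """Return a short, stable hash of a JSON-serializable config dict."""
    # sort_keys makes the hash independent of key insertion order
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def cache_path(cache_dir: Path, cache_id: str, config: dict) -> Path:
    """Hypothetical layout: <cache_dir>/<cache_id>-<hash>.parquet."""
    return cache_dir / f"{cache_id}-{config_hash(config)}.parquet"


def is_cache_valid(cache_dir: Path, cache_id: str, config: dict) -> bool:
    """A cache entry counts as valid only if a file for this exact config exists."""
    return cache_path(cache_dir, cache_id, config).exists()
```

Under this scheme, changing any config value (for example `x_col`) yields a different file name, so stale preprocessed data is never picked up and no explicit invalidation step is needed.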

@@ -190,8 +236,10 @@ async for message in query(
 │   └── SKILL.md # How to create TOPP-based workflows
 ├── pyopenms-tools/
 │   └── SKILL.md # How to use pyopenms in Streamlit
-└── visualization/
-    └── SKILL.md # How to use pyopenms-viz
+├── visualization/
+│   └── SKILL.md # How to use pyopenms-viz
+└── openms-insight/
+    └── SKILL.md # How to use openms-insight interactive components
 ```
 
 ### Approach B: Streamlit Chat Interface with LLM Backend
@@ -315,6 +363,11 @@ Placed in `.claude/skills/` for progressive disclosure:
 
 ### Visualization with pyopenms-viz
 [Examples using ms_plotly backend with st.plotly_chart()]
+
+### Interactive Dashboards with openms-insight
+[Examples using Table, Heatmap, LinePlot, VolcanoPlot, SequenceView]
+[StateManager for cross-component linked selection]
+[When to use openms-insight vs pyopenms-viz]
 ```
 
 ### 5.3 Tool Parameter Database
@@ -426,9 +479,10 @@ pytest tests/
 **Deliverables**:
 1. `.claude/skills/openms-webapp-builder/SKILL.md` — Core patterns and rules
 2. `.claude/skills/topp-tools/SKILL.md` — TOPP tool reference with parameters
-3. `.claude/skills/pyopenms-viz/SKILL.md` — Visualization patterns
-4. `tools/topp_tool_registry.json` — Machine-readable TOPP tool database
-5. `tools/workflow_templates/` — Example workflow specifications
+3. `.claude/skills/pyopenms-viz/SKILL.md` — Visualization patterns (simple/quick plots)
+4. `.claude/skills/openms-insight/SKILL.md` — Interactive component patterns (complex dashboards)
+5. `tools/topp_tool_registry.json` — Machine-readable TOPP tool database
+6. `tools/workflow_templates/` — Example workflow specifications
 
 ### Phase 2: Agent Prototype — Single-Agent with Claude Agent SDK

@@ -484,27 +538,28 @@ Agent: Your app is ready! It includes:
 
 ### Open Questions
 
-1. **What is "openms-insight"?** Not found as a public tool. Needs clarification — is it planned, internal, or known by another name?
-
-2. **Target users**: Are the end users bioinformaticians comfortable with CLI, or do they need a fully web-based experience? This determines whether Approach A (SDK) or Approach C (hybrid) is better.
+1. **Target users**: Are the end users bioinformaticians comfortable with CLI, or do they need a fully web-based experience? This determines whether Approach A (SDK) or Approach C (hybrid) is better.
 
-3. **Deployment model**: Will generated apps run locally, in Docker, or on a shared server? This affects the preview/testing strategy.
+2. **Deployment model**: Will generated apps run locally, in Docker, or on a shared server? This affects the preview/testing strategy.
 
-4. **TOPP tool availability**: The agent needs access to TOPP tool binaries to generate `.ini` files and test workflows. This requires the full Docker image (`Dockerfile`, not `Dockerfile_simple`).
+3. **TOPP tool availability**: The agent needs access to TOPP tool binaries to generate `.ini` files and test workflows. This requires the full Docker image (`Dockerfile`, not `Dockerfile_simple`).
 
-5. **Scope of generation**: Should the agent generate:
+4. **Scope of generation**: Should the agent generate:
    - (a) Complete standalone apps from scratch?
    - (b) New workflows/pages within the existing template?
    - (c) Both, depending on complexity?
 
-6. **Model selection**: Claude Opus for planning/reviewing, Claude Sonnet for code generation (faster, cheaper), Claude Haiku for simple validation tasks?
+5. **Model selection**: Claude Opus for planning/reviewing, Claude Sonnet for code generation (faster, cheaper), Claude Haiku for simple validation tasks?
+
+6. **Visualization library choice**: When should generated apps use pyopenms-viz (simple, single-line plots) vs openms-insight (interactive dashboards with linked components, large dataset support)? The agent needs clear heuristics — e.g., use openms-insight when the app needs cross-component selection, server-side pagination, or million-point heatmaps.
 
 ### Risks
 
 | Risk | Mitigation |
 |------|------------|
 | Generated code has incorrect TOPP tool parameters | Validate against `.ini` files; use `get_topp_params` tool |
 | Hallucinated pyopenms API calls | Skill files with verified examples; reviewer agent checks |
+| Incorrect openms-insight component config | Validate filter/interactivity mappings against DataFrame columns |
 | Streamlit session state conflicts | Template enforces naming conventions; validation checks |
 | Context window exhaustion on complex apps | Claude Agent SDK context compaction; break into sub-tasks |
 | Preview launch failures | Headless mode; error capture; fallback to syntax-only check |
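
Editor's note on the added risk row: the mitigation "validate filter/interactivity mappings against DataFrame columns" could be a small pre-flight check run before any component is instantiated. The sketch below is a hypothetical helper, not part of openms-insight. It assumes only the `{"key": "column"}` mapping shape shown in the usage pattern, and that the caller supplies the column names from the data frame's schema.

```python
def validate_mappings(columns, filters=None, interactivity=None):
    """Return a list of error strings; an empty list means every mapped column exists.

    `columns` is any iterable of column names. `filters` and `interactivity`
    follow the {"key": "column"} shape used in the usage pattern.
    """
    available = set(columns)
    errors = []
    for kind, mapping in (("filters", filters), ("interactivity", interactivity)):
        for key, column in (mapping or {}).items():
            if column not in available:
                errors.append(f"{kind}[{key!r}] -> unknown column {column!r}")
    return errors


# Example: "feature_id" exists, "scan_id" does not.
problems = validate_mappings(
    ["feature_id", "RT", "mz", "intensity"],
    filters={"selected": "feature_id"},
    interactivity={"selected": "scan_id"},
)
```

Returning a list of messages rather than raising on the first problem lets a reviewer agent report every bad mapping in a single pass.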
@@ -516,7 +571,8 @@ Agent: Your app is ready! It includes:
 - **Anthropic API key**: Required for agent operation
 - **OpenMS TOPP tools**: Required for `.ini` generation and workflow testing
 - **pyopenms**: Required for parameter validation
-- **pyopenms-viz**: Required for visualization code generation
+- **pyopenms-viz**: Required for simple visualization code generation
+- **openms-insight**: Required for interactive dashboard generation (`pip install openms-insight`)
 
 ---

@@ -551,3 +607,18 @@ These serve as reference implementations the agent can learn from:
 - **NuXL** — Cross-linking analysis
 - **NASEWEIS** — Oligonucleotide MS analysis
 - **MHCQuant** — Immunopeptidomics
+
+## Appendix D: Visualization Library Decision Guide
+
+| Criterion | pyopenms-viz | openms-insight |
+|-----------|-------------|----------------|
+| **Complexity** | Single-line `.plot()` calls | Component instantiation with config |
+| **Interactivity** | Basic (Plotly zoom/pan) | Rich (cross-component linking, selection) |
+| **Large datasets** | Limited by browser memory | Multi-resolution downsampling, server-side pagination |
+| **Plot types** | Chromatogram, spectrum, peakmap, mobilogram | Table, LinePlot, Heatmap, VolcanoPlot, SequenceView |
+| **Backend** | Plotly, Bokeh, matplotlib | Vue.js + Plotly.js + Tabulator.js |
+| **State management** | None | StateManager with cross-component sync |
+| **Caching** | None | Automatic disk caching with hash invalidation |
+| **Best for** | Quick exploratory plots, simple apps | Interactive dashboards, production apps, large datasets |
+
+**Agent heuristic**: Default to pyopenms-viz for simple visualization pages. Switch to openms-insight when the user needs: (a) linked selection across components, (b) tables with >10K rows, (c) heatmaps with >100K points, (d) peptide sequence/fragment visualization, or (e) differential expression volcano plots.
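
Editor's note: the agent heuristic above can be written down as a small decision function. This is a sketch only. The `VizRequirements` fields and the function name are hypothetical, while the thresholds (more than 10K table rows, more than 100K heatmap points) are the ones stated in the heuristic.

```python
from dataclasses import dataclass


@dataclass
class VizRequirements:
    """Hypothetical summary of what a generated page needs."""
    linked_selection: bool = False   # (a) selection linked across components
    table_rows: int = 0              # (b) expected table size
    heatmap_points: int = 0          # (c) expected heatmap size
    sequence_view: bool = False      # (d) peptide sequence/fragment display
    volcano_plot: bool = False       # (e) differential expression volcano plot


def choose_viz_library(req: VizRequirements) -> str:
    """Default to pyopenms-viz; switch to openms-insight if any criterion holds."""
    needs_insight = (
        req.linked_selection
        or req.table_rows > 10_000
        or req.heatmap_points > 100_000
        or req.sequence_view
        or req.volcano_plot
    )
    return "openms-insight" if needs_insight else "pyopenms-viz"
```

For instance, `choose_viz_library(VizRequirements(table_rows=50_000))` selects openms-insight, while a page with no special requirements stays on pyopenms-viz.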
