CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

TermNorm is an AI-powered terminology normalization Excel add-in. It matches free-form text to standardized database identifiers using a three-tier approach: exact cache lookup → fuzzy matching → LLM research with web scraping.

Architecture: Vanilla JavaScript frontend (Office.js) + Python FastAPI backend

Version: See package.json (this field may be outdated)

Common Commands

Frontend Development

npm run dev-server          # Start webpack dev server (port 3000)
npm run build               # Production build
npm run build:iis           # Build for IIS deployment
npm run build:m365          # Build for Microsoft 365
npm test                    # Run Jest tests
npm run test:watch          # Watch mode
npm run test:coverage       # Generate coverage report
npm run lint                # Check with ESLint
npm run lint:fix            # Auto-fix lint issues
npm run start               # Debug in Excel desktop (F5 in VS Code)
npm run validate            # Validate manifest.xml

Backend Development

cd backend-api
python -m venv .venv
.\.venv\Scripts\activate    # Windows
pip install -r requirements.txt
python -m uvicorn main:app --reload                              # Local dev
python -m uvicorn main:app --host 0.0.0.0 --port 8000 --reload  # Network

Or use start-server-py-LLMs.bat for one-click startup.

Architecture

Frontend (src/)

core/: Event-driven state management
- state-store.js - Immutable state container with subscriber pattern
- event-bus.js - Pub/sub event system for loose coupling
- events.js - Event type definitions (MAPPINGS_LOADED, MATCH_LOGGED, TRACKING_CHANGED, SESSION_HISTORY_CHANGED, etc.)
- state-actions.js - Centralized state mutations (JSDoc typed)
services/: Business logic and data processing
- live-tracker.js - Excel cell change tracking, emits MATCH_LOGGED events
- normalizer.js - Three-tier matching pipeline (JSDoc typed)
- workflows.js - Async business logic: mappings, sessions, settings, tracking lifecycle (JSDoc typed)
- mapping-processor.js - Excel mapping file processor
matchers/: Matching algorithms
- matchers.js - Cache + fuzzy matching (single threshold, default 0.7) (JSDoc typed)
taskpane/: Main entry point (taskpane.js - Office.onReady, wizard state machine)
ui-components/: Reusable UI modules
- thermometer.js - Progress/status indicator with two modes
- candidate-ranking.js - Drag-to-rank candidate selection
- processing-history.js - Matching Journal view, listens for MATCH_LOGGED events
- direct-prompt.js - Custom LLM inference UI with fuzzy validation and candidate picker
- file-handling.js - Config file drag-and-drop
- mapping-config.js - Mapping configuration panel
- settings-panel.js - Settings UI
utils/: DOM and API helpers
- api-fetch.js - Backend API client + server utilities (JSDoc typed)
- dom-helpers.js - $(), showView(), modal helpers
- column-utilities.js - Column mapping builders (JSDoc typed)
- error-display.js - User-facing status messages
- settings-manager.js - Persistent settings storage
- status-indicators.js - LED indicators and status updates
- app-utilities.js - Version display, relevance colors
- history-cache.js - Processing history cache
config/: Configuration constants
- config.js - All constants, thresholds, JSDoc typedefs (MatchResult, CellState, MappingData)
design-system/: CSS architecture
- tokens.css - Color, spacing, typography variables
- utilities.css - Utility classes (hidden, flex, etc.)
- components.css - Badges, cards, buttons, forms

Backend (backend-api/)

main.py: FastAPI app entry point
api/: Route handlers (RESTful endpoints)
- research_pipeline.py - /sessions, /matches, /batches, /prompts, /activities
- system.py - /health, /status, /settings, /history, /cache
- experiments_api.py - /experiments/* for eval/optimization integration
- pipeline.py - /pipeline (with registry resolution via _enrich_with_registries()), /pipeline/trace, /pipeline/steps
core/: Infrastructure
- llm_providers.py - Unified Groq/OpenAI interface with retry logic
- logging.py - Backend logging configuration
- user_manager.py - IP-based user authentication
research_and_rank/: AI pipeline modules
- web_generate_entity_profile.py - Web scraping + entity extraction
- call_llm_for_ranking.py - LLM candidate ranking
- correct_candidate_strings.py - Fuzzy correction of LLM outputs
- display_profile.py - Entity profile formatting
- fuzzy_matching.py - rapidfuzz-based fuzzy matching (threshold 70, WRatio)
utils/:
- langfuse_logger.py - Langfuse-compatible logging
- prompt_registry.py - Versioned prompt management
- standards_logger.py - Experiment/run management
- cache_metadata.py - Cache metadata tracking
- responses.py - API response formatting
- utils.py - General utilities
- schema_registry.py - Versioned JSON schema management. Serves the pipeline resolution contract — _enrich_with_registries() reads registered schemas on-demand.
config/: Settings, middleware, users.json (hot-reload), pipeline.json (v1.1 — all tunable params)
logs/: Runtime data
- match_database.json - Persistent match cache
- langfuse/ - Langfuse-compatible logging (traces, observations, scores, datasets)
- prompts/ - Versioned LLM prompts (defaults committed to git, not runtime-initialized)
- schemas/ - Versioned JSON schemas (entity_profile, llm_ranking_output) — committed to git, resolved at request time by GET /pipeline

Web Search

Brave API → SearXNG → DuckDuckGo → Bing fallback chain. Toggle via USE_BRAVE_API=true/false in .env. Get key: https://api-dashboard.search.brave.com/register

Key Patterns

Event-Driven UI: Components react to events from event-bus (MAPPINGS_LOADED, CANDIDATES_AVAILABLE, MATCH_LOGGED)
Service/UI Boundary: Services emit events, UI listens. No direct imports from services→UI.
Unified State Store: All state lives in state-store.js
- Cell state: session.workbooks[workbookId].cells[cellKey]
- Mutations via state-actions.js functions
Centralized Config: All constants in config/config.js with JSDoc typedefs
Session-Based: No database - in-memory state with JSON persistence
Three-Tier Matching: Cache → Fuzzy → LLM (best result always written; 0.9 threshold is UI color only)
Workbook-Scoped Tracking: Multiple workbooks track cells independently
IP-Based Auth: Users configured in backend-api/config/users.json
Office.js Operations: Batch inside Excel.run(async (ctx) => {...}), commit with ctx.sync()
$ Helper Pattern: DOM queries via const $ = id => document.getElementById(id)
Thermometer Component: Progress indicator in persistent dashboard with two modes:
- progress: Sequential steps, collapsible, fill bar (setup wizard: server→config→mappings→activate)
- status: Independent toggleable states (research pipeline: web→LLM→score→rank)
Centralized Tracking Workflows: Tracking state managed via workflows.js with TRACKING_CHANGED events for reactive UI updates
Pipeline Composability: nodes + pipelines JSON format shared across backend, frontend, and PromptPotter. Backend exposes all tunable params via GET /pipeline (v1.1). LLMGeneration nodes carry schema_family/prompt_family references; _enrich_with_registries() resolves them from on-disk registries into top-level resolved_schemas/resolved_prompts dicts. This gives external consumers (PromptPotter) full visibility into field names, descriptions, template variables, and JSON schemas — no hardcoded metadata needed. Frontend owns local tiers and declares backend_pipeline: "default". /matches accepts only node_config — structured per-node dicts (e.g. {"entity_profiling": {"output_schema": {...}}}). Each LLM node supports prompt, output_schema, and model overrides. Flat params are not accepted. See docs/spec/README.md.
Pipeline Config Completeness: pipeline.json is the single source of truth for ALL tunable parameters. A config is not a list of what users change — it is a complete declaration of what the system assumes. If a value is a parameter of a node's implementation (threshold, limit, regex, model, etc.), it MUST be declared in that node's config — even if the current code hardcodes it and nobody tweaks it today. The config serves as a complete, discoverable description of each node's capabilities for anyone building pipelines on TermNorm. Never delete a parameter just because current code doesn't read it from config; instead, wire it. Never hardcode a fallback that should be configurable. No shadow defaults: pipeline functions must NOT carry parameter defaults that duplicate pipeline.json values — make config-sourced params required (no default). Shadow defaults create drift risk (e.g., max_sites=6 in function vs 7 in config). Use /audit-pipeline to surface hardcoded values, implicit library defaults, and domain assumptions.

Code Quality Standards

Maintainability: Code is organized into focused modules with clear responsibilities. Complexity is added only when needed.

Direct State Access: State accessed via state.server.online for simplicity. No getters/setters unless needed.

Central Coordination: taskpane.js orchestrates services while delegating specialized work to dedicated modules.

Type Definitions: Key functions have JSDoc types for IDE autocomplete. Shared types defined in config/config.js:

MatchResult - Normalization result (target, method, confidence, candidates, etc.)
CellState - Cell processing state (value, status, row, col, result)
MappingData - Forward/reverse mappings with metadata

Configuration Files

manifest.xml - Development manifest (localhost:3000)
manifest-iis.xml - IIS/network deployment
manifest-cloud.xml - Microsoft 365 deployment
config/app.config.json - Frontend runtime config (backend URL, column mappings)
backend-api/.env - Environment variables (API keys)
backend-api/config/users.json - IP-based user authentication (hot-reload)
backend-api/config/pipeline.json - Pipeline node configs, named pipelines, LLM defaults (v1.1)
src/config/pipeline.json - Frontend pipeline config with backend_pipeline reference

app.config.json Structure

{
  "backend_url": "http://127.0.0.1:8000",
  "excel-projects": {
    "Workbook.xlsx": {
      "column_map": {
        "InputColumn": { "output": "OutputColumn", "confidence": "ConfidenceColumn" }
      },
      "standard_mappings": [{
        "mapping_reference": "C:\\path\\to\\reference.xlsx",
        "worksheet": "Sheet1",
        "source_column": "SourceCol",
        "target_column": "TargetCol"
      }]
    }
  }
}

Column mapping structure: { "InputColumn": { "output": "OutputColumn", "confidence": "ConfidenceColumn" } }. The confidence field is optional.

Testing

Frontend tests are in __tests__/ directories adjacent to source files:

src/core/__tests__/ - State store and event bus tests

Run a single test file:

npm test -- src/core/__tests__/state-store.test.js

Data Flow

The TermNorm add-in follows a structured event-driven workflow:

App Initialization
    ↓
Configuration Loading (Drag & Drop or filesystem)
    ↓
Server Setup (backend-api venv + FastAPI on localhost:8000)
    ↓
Mapping Processing (Auto-load reference files + validate column mappings)
    ↓
Auto-Activate Live Tracking (ON/OFF toggle in dashboard)
    ↓
[User Input: Cell Entry + Enter]
    ↓
Normalization Pipeline
    ├─ 1. Quick lookup (cached)
    ├─ 2. Fuzzy matching
    └─ 3. LLM research (/research-and-match API)
    ↓
Results Display (Ranked candidates + status indicators)
    ↓
Optional: User Selection (Apply term → update target column)
    ↓
Logging (MATCH_LOGGED event → history + backend)

Langfuse-Compatible Logging

Backend logs to logs/langfuse/ in Langfuse-compatible format:

logs/langfuse/
├── traces/                    # Lean trace files (~10 lines)
├── observations/{trace_id}/   # Verbose step details (separate files)
├── scores/                    # Evaluation metrics
└── datasets/                  # Ground truth items

Key concepts:

Traces: Lean workflow summaries (input/output only)
Observations: Verbose step data in separate files (web_search, entity_profiling, etc.)
Dataset Items: Ground truth with source_trace_id linking TO traces
UserChoice/DirectEdit: Updates dataset item's expected_output

Trace IDs use datetime-prefixed format: YYMMDDHHMMSSxxxxxxxx...

See backend-api/docs/LANGFUSE_DATA_MODEL.md for full specification.

Known Limitations

Single Excel Instance Per Project: Each Excel file runs its own add-in instance with isolated state. Opening the same file twice creates two independent instances.

Development Notes

Backend print() statements are intentional: Colored console output in research_and_rank/ files provides developer-friendly pipeline visibility. Do not convert to logging module.
Archive folder: backend-api/.archive/ contains migration scripts needed until v1.3.0. Do not remove.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CLAUDE.md

Project Overview

Common Commands

Frontend Development

Backend Development

Architecture

Frontend (src/)

Backend (backend-api/)

Web Search

Key Patterns

Code Quality Standards

Configuration Files

app.config.json Structure

Testing

Data Flow

Langfuse-Compatible Logging

Known Limitations

Development Notes

FilesExpand file tree

CLAUDE.md

Latest commit

History

CLAUDE.md

File metadata and controls

CLAUDE.md

Project Overview

Common Commands

Frontend Development

Backend Development

Architecture

Frontend (src/)

Backend (backend-api/)

Web Search

Key Patterns

Code Quality Standards

Configuration Files

app.config.json Structure

Testing

Data Flow

Langfuse-Compatible Logging

Known Limitations

Development Notes