Commit fd76a32

docs(07): restructure phases 7.1-7.3 for LLM KG builder and D3.js frontend

- Phase 7.1: LLM-based KG Builder Agent (replaces programmatic approach)
- Phase 7.2: D3.js enhancement with Epstein-inspired layout and panels
- Phase 7.3: vis-network (deferred/optional)
- REQ-VIS-003 rewritten for D3.js with local KG canvas panels
- REQ-AGENT-009 revised for LLM-based entity/relationship extraction
- Add Epstein Doc Explorer frontend reference document
- Source viewer replaces document modal (multi-media, coexists with timeline)
- Selection highlighting: white connections, dim unconnected

1 parent 4131f7a commit fd76a32

4 files changed: +594 -137 lines changed

.planning/REQUIREMENTS.md

Lines changed: 74 additions & 45 deletions
@@ -533,27 +533,32 @@ This document defines formal requirements for Holmes v1. Requirements are derive
533533
- Uses `include_thoughts=True` for transparency
534534
**Dependencies:** REQ-STORE-001, REQ-AGENT-003, REQ-AGENT-004, REQ-AGENT-005, REQ-AGENT-006
535535

536-
### REQ-AGENT-009: Knowledge Graph Builder Service
536+
### REQ-AGENT-009: Knowledge Graph Builder Agent (LLM-Based)
537537
**Priority:** HIGH
538-
**Description:** Programmatic Python service (NOT an LLM agent) that extracts entities and relationships from domain agent structured output and stores them in KG tables.
539-
**Acceptance Criteria:**
540-
- Reads domain agent structured output from `agent_executions.output_data` (Pydantic models)
541-
- Extracts ALL entities from `DomainEntity` lists across all domain agents
542-
- Creates relationship edges from findings (entities co-occurring in same finding)
543-
- Entity types from all domain taxonomies:
544-
- Core: Person, Organization, Event, Document, Location, Amount
545-
- Financial: monetary_amount, account, transaction, asset, financial_instrument, tax_record
546-
- Legal: statute, case_citation, contract, legal_term, court, obligation, party, clause
547-
- Evidence: communication, alias, vehicle, property, timestamp, physical_evidence, digital_artifact, witness
548-
- Strategy: strategic_decision, organizational_unit, stakeholder, objective, risk_factor
549-
- Entity deduplication: exact name+type match → auto-merge; fuzzy matching (>85% similarity) → flag for LLM resolution
550-
- Additive-only: NEVER filters or discards entities/relationships from domain agent output
538+
**Description:** LLM-based agent (Gemini Pro) that reads ALL domain agent findings and entities holistically, then produces a curated knowledge graph with deduplicated high-level entities and semantic relationships. Replaces the previous programmatic co-occurrence approach.
539+
**Acceptance Criteria:**
540+
- Receives ALL case_findings (rich markdown with citations) + all raw DomainEntity lists from agent_executions.output_data + case description
541+
- Gemini 3 Pro with `thinking_level="high"` and 1M context window
542+
- Runs in fresh stage-isolated ADK session after ALL domain agents complete
543+
- Produces curated entity list with investigation-focused taxonomy:
544+
- **PERSON** — Named individuals (suspects, witnesses, victims, officers)
545+
- **ORGANIZATION** — Companies, agencies, groups, shell entities
546+
- **LOCATION** — Physical places, addresses, jurisdictions
547+
- **EVENT** — Specific occurrences with dates (transactions, meetings, communications)
548+
- **ASSET** — Properties, vehicles, investments, digital wallets
549+
- **FINANCIAL_ENTITY** — Bank accounts, transactions, instruments
550+
- **COMMUNICATION** — Phone calls, emails, messages, documents exchanged
551+
- **DOCUMENT** — Key evidence items referenced across findings
552+
- Timestamps, monetary amounts, physical objects → metadata on entities/relationships, NOT standalone graph nodes
553+
- Produces semantic relationships with typed labels (e.g., "employed_by", "transferred_funds_to", "owns") — NOT co-occurrence
554+
- Entity deduplication handled naturally by LLM seeing all findings together (cross-agent alias resolution)
555+
- Cross-domain relationship inference (financial finding ↔ legal finding connected when semantically related)
556+
- Every entity includes aliases array (all known name variants from different agents)
557+
- Every relationship includes evidence_excerpt (exact source text) and source_finding_ids for traceability
558+
- Stores curated output in existing kg_entities and kg_relationships tables (clears old data for workflow, writes curated data)
551559
- Degree computation (connection counts) for frontend node sizing
552-
- Stores in PostgreSQL: kg_entities, kg_relationships tables
553-
- All nodes linked to source execution (source_execution_id) and source finding (source_finding_index)
554-
- Runs AFTER each domain agent completes (progressive KG population)
555-
- Final entity deduplication pass after ALL domain agents complete
556-
- Incremental updates without full rebuild when new files analyzed
560+
- Inline Pro-to-Flash fallback for resilience
561+
- Raw entities from domain agents preserved in agent_executions.output_data for audit trail
557562
**Dependencies:** REQ-STORE-001, REQ-AGENT-003, REQ-AGENT-004, REQ-AGENT-005, REQ-AGENT-006
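
As an illustration of the revised REQ-AGENT-009 criteria, the curated output and the degree computation could be sketched roughly as below. This is a minimal TypeScript sketch, not the project's schema: the names `KgEntity`, `KgRelationship`, `computeDegrees`, and the `id`/`sourceId`/`targetId` fields are assumptions made for the example. Only the eight entity types, `aliases`, `evidence_excerpt`, and `source_finding_ids` come from the requirement text.

```typescript
// Hypothetical shapes for the curated KG output. Only the entity type names,
// `aliases`, `evidence_excerpt` and `source_finding_ids` appear in the
// requirement; every other field name here is an assumption.
type EntityType =
  | "PERSON" | "ORGANIZATION" | "LOCATION" | "EVENT"
  | "ASSET" | "FINANCIAL_ENTITY" | "COMMUNICATION" | "DOCUMENT";

interface KgEntity {
  id: string;
  name: string;
  type: EntityType;
  aliases: string[]; // all known name variants from different agents
}

interface KgRelationship {
  sourceId: string;
  targetId: string;
  label: string; // typed semantic label, e.g. "employed_by"
  evidence_excerpt: string; // exact source text
  source_finding_ids: string[]; // traceability back to findings
}

// Degree = number of relationship endpoints that touch an entity.
// Endpoints that reference an unknown entity id are skipped.
function computeDegrees(
  entities: KgEntity[],
  relationships: KgRelationship[],
): Map<string, number> {
  const degrees = new Map<string, number>();
  for (const entity of entities) {
    degrees.set(entity.id, 0);
  }
  for (const rel of relationships) {
    for (const id of [rel.sourceId, rel.targetId]) {
      const current = degrees.get(id);
      if (current !== undefined) {
        degrees.set(id, current + 1);
      }
    }
  }
  return degrees;
}
```

Under these assumptions the resulting map can feed frontend node sizing directly. Parallel relationships between the same pair each count toward the degree, which may or may not match the intended behaviour.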
558563

559564
### REQ-AGENT-010: Incremental Processing
@@ -671,32 +676,51 @@ This document defines formal requirements for Holmes v1. Requirements are derive
671676

672677
### REQ-VIS-003: Knowledge Graph View
673678
**Priority:** HIGH
674-
**Description:** Premium knowledge graph visualization using vis-network with intelligent clustering, physics-based layout, and relationship-aware spacing.
675-
**Acceptance Criteria:**
676-
- **vis-network** library (replacing D3.js) integrated via `useRef`/`useEffect` for full TypeScript control
677-
- ForceAtlas2-based physics simulation:
678-
- Solver: `forceAtlas2Based` for superior cluster formation in investigative graphs
679-
- Configurable gravitationalConstant, springLength, springConstant, centralGravity
680-
- `avoidOverlap` enabled to prevent node collision
681-
- Stabilization: 200-500 iterations with fit-after-stabilize
682-
- Group-based entity type clustering:
683-
- Nodes assigned to groups by entity type (person, organization, location, event, document, evidence)
684-
- Each group has distinct shape, color, size, and icon
685-
- Natural spatial clustering — related entities closer, unrelated further apart
686-
- Relationship-type-based edge configuration:
687-
- Edge length varies by relationship type (e.g., EMPLOYED_BY shorter than MENTIONED_IN)
688-
- Edge width indicates relationship strength
689-
- Edge color indicates relationship category
690-
- Labeled edges with relationship type
691-
- Nodes sized by connection count (degree)
692-
- Five toggleable layers: Evidence (red), Legal (blue), Strategy (green), Temporal (amber), Hypothesis (pink)
693-
- Zoom and pan controls
694-
- Node search and highlight (find entity by name, highlight path)
695-
- Click node to see details: entity metadata, full citation chain, connected entities, source agent
679+
**Description:** Premium knowledge graph visualization using D3.js force simulation with Epstein Doc Explorer-inspired layout, physics, filtering panels, entity timeline, and multi-media source viewer — adapted to Holmes's Liquid Glass design system. Filter/control and timeline panels are local to the KG canvas, not in the app-wide sidebar.
680+
**Acceptance Criteria:**
681+
- **D3.js** force simulation with 5 forces (link, charge, center, collision, radial) inside React `useRef`/`useEffect`
682+
- Radial force: high-connection entities near center, low-connection pushed outward (connection-count-based gravity well)
683+
- Sqrt-scaled node radius (connection count → 5-100px range via `d3.scalePow().exponent(0.5)`)
684+
- Collision detection using actual circle radius + padding
685+
- Link distance: constant base (50px) or relationship-type-based
686+
- Charge repulsion: -400 (tunable)
687+
- Domain-colored SVG circle nodes (person=orange, org=green, location=blue, event=amber — consistent with Holmes palette)
688+
- Node labels below circles, entity type indicated by color
689+
- Click node → highlight node + all connected edges (white), dim unconnected edges; open right sidebar timeline
690+
- Drag individual nodes (fix position during drag, release on drop)
691+
- Zoom/pan with `d3.zoom()` (scale extent [0.01, 10])
692+
- **Left panel — Filters & Controls** (local to KG canvas, NOT in the app-wide sidebar):
693+
- Positioned on the left side of the knowledge graph canvas area
694+
- Selected entity display with Clear button
695+
- Graph stats: entity count, relationship count, domain breakdown
696+
- Entity search (debounced text input, highlights matching nodes)
697+
- Keyword filter (comma-separated fuzzy match against relationship labels/entity names)
698+
- Domain layer toggles (Financial, Legal, Evidence, Strategy) with select/deselect all
699+
- Density threshold slider (prune low-connection nodes by percentage of average)
700+
- **Right panel — Entity Timeline** (local to KG canvas):
701+
- Appears when entity is selected
702+
- Chronological list of relationships involving selected entity
703+
- Each entry: year/date, relationship description (actor → action → target), source citation reference
704+
- Source citations list acts as navigation for the source viewer panel (click a citation → jump to that excerpt/timestamp)
705+
- Scrollable timeline with entity names highlighted in accent colors
706+
- Stays visible when source viewer is open (does not get hidden)
707+
- **Source viewer panel** (replaces simple "document excerpt modal" — details to be refined during phase discussions):
708+
- Multi-media: renders content based on source type (document text, video player, audio player, image viewer)
709+
- For documents: full text excerpt with entity names highlighted (selected entity = yellow, related = orange)
710+
- For audio/video: playback with timestamp navigation from right panel citations
711+
- For images: viewer with annotation overlay capability
712+
- Opens alongside (not replacing) the right panel — right panel citations serve as a navigable index
713+
- Source metadata header (summary, category, date range)
714+
- Close button (X)
715+
- NOTE: This component is reusable beyond the KG view (Evidence Library, Timeline, etc.) — full specification during phase discussions
716+
- Edge deduplication: multiple relationships between same entity pair → single edge with count
717+
- Edge hover tooltip: relationship label, temporal context, source document
696718
- Fullscreen capability with maximize button
697719
- Basic analytics: node count, edge count, most connected entities
698-
- Lazy clustering for graphs with >200 nodes
720+
- Bottom instruction bar: "Click nodes to explore relationships · Scroll to zoom · Drag to pan"
721+
- Responsive layout within KG canvas: left panel (320px) | center graph | right panel (384px, appears on select) — all local to the KG page content area, not the app-wide sidebar
699722
- Intuitive at first glance: a user can understand entity relationships without instruction
723+
- **Reference:** `DOCS/reference/epstein-network-ui/` for layout, physics, and interaction patterns
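
The sqrt-scaled node radius and the radius-based collision criterion above can be sketched without D3 so the arithmetic is easy to check. The following is a dependency-free TypeScript approximation of the mapping that `d3.scalePow().exponent(0.5)` with clamping is expected to produce. The domain `[0, maxDegree]`, the helper names `makeRadiusScale` and `collisionRadius`, and the 4px padding are assumptions for illustration; the 5-100px range comes from the requirement.

```typescript
// Builds a square-root scale from connection count to pixel radius.
// Assumed domain: [0, maxDegree]; range: [minRadius, maxRadius].
function makeRadiusScale(
  maxDegree: number,
  minRadius = 5,
  maxRadius = 100,
): (degree: number) => number {
  // Guard against an empty or edgeless graph, which would divide by zero.
  const domainMax = Math.max(maxDegree, 1);
  const sqrtMax = Math.sqrt(domainMax);
  return (degree: number): number => {
    // Clamp the input to the domain so radii never leave the range.
    const clamped = Math.min(Math.max(degree, 0), domainMax);
    const t = Math.sqrt(clamped) / sqrtMax;
    return minRadius + (maxRadius - minRadius) * t;
  };
}

// Collision radius = actual circle radius plus padding (padding is assumed).
function collisionRadius(nodeRadius: number, padding = 4): number {
  return nodeRadius + padding;
}
```

With a square-root scale the circle area, rather than the radius, grows linearly with connection count, which keeps hub nodes from visually swamping the rest of the graph.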
700724
**Dependencies:** REQ-AGENT-009, REQ-STORE-001
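
Two of the REQ-VIS-003 criteria, edge deduplication and the density threshold slider, leave room for interpretation. The TypeScript sketch below shows one possible reading and should be treated as an assumption rather than the specified behaviour: entity pairs are treated as unordered when merging edges, and the slider keeps nodes whose degree is at least the given percentage of the average degree. The names `Rel`, `MergedEdge`, `dedupeEdges`, and `pruneByDensity` are hypothetical.

```typescript
interface Rel {
  sourceId: string;
  targetId: string;
  label: string;
}

interface MergedEdge {
  sourceId: string;
  targetId: string;
  count: number; // how many relationships were merged into this edge
  labels: string[]; // distinct relationship labels, in first-seen order
}

// Collapses all relationships between the same pair into one edge.
// Assumption: direction is ignored, so A->B and B->A merge together.
function dedupeEdges(rels: Rel[]): MergedEdge[] {
  const merged = new Map<string, MergedEdge>();
  for (const rel of rels) {
    const forward = rel.sourceId <= rel.targetId;
    const a = forward ? rel.sourceId : rel.targetId;
    const b = forward ? rel.targetId : rel.sourceId;
    const key = JSON.stringify([a, b]);
    const existing = merged.get(key);
    if (existing) {
      existing.count += 1;
      if (existing.labels.indexOf(rel.label) === -1) {
        existing.labels.push(rel.label);
      }
    } else {
      merged.set(key, { sourceId: a, targetId: b, count: 1, labels: [rel.label] });
    }
  }
  return Array.from(merged.values());
}

// Returns the ids of nodes that survive pruning.
// Assumption: keep a node when degree >= average degree * thresholdPercent / 100.
function pruneByDensity(
  degrees: Map<string, number>,
  thresholdPercent: number,
): Set<string> {
  const kept = new Set<string>();
  if (degrees.size === 0) {
    return kept;
  }
  let total = 0;
  degrees.forEach((degree) => {
    total += degree;
  });
  const cutoff = (total / degrees.size) * (thresholdPercent / 100);
  degrees.forEach((degree, id) => {
    if (degree >= cutoff) {
      kept.add(id);
    }
  });
  return kept;
}
```

If relationship direction matters for the merged edge, the pair key would keep source and target in their original order instead of sorting them.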
701725

702726
### REQ-VIS-004: Timeline View
@@ -1571,13 +1595,17 @@ No confirmation dialogs implemented yet.
15711595

15721596
| Sub-Criterion | Status | Notes |
15731597
|---------------|--------|-------|
1574-
| Force-directed graph | | D3.js chosen and implemented |
1598+
| D3.js force simulation (5 forces) | 🟠 | Basic D3.js graph exists; needs radial force, collision, charge tuning (Phase 7.2) |
15751599
| Nodes sized by connection count || Implemented |
15761600
| Edges labeled with relationship type || Implemented |
1577-
| Five toggleable layers | 🟠 | Layer concept exists but not 5-layer system yet |
1601+
| Domain layer toggles | 🟠 | Layer concept exists but not domain-based toggle system yet |
15781602
| Zoom and pan controls || Full zoom/pan/reset controls |
15791603
| Node search and highlight || Implemented |
15801604
| Click node for details || Info panel shows on click |
1605+
| Left panel (filters/controls, local to KG canvas) || Not implemented (Phase 7.2) |
1606+
| Right panel (entity timeline, local to KG canvas) || Not implemented (Phase 7.2) |
1607+
| Source viewer panel (multi-media) || Not implemented (Phase 7.2+) — details during phase discussions |
1608+
| Density threshold slider || Not implemented (Phase 7.2) |
15811609
| Fullscreen capability || Not implemented |
15821610
| Basic analytics || Not implemented |
15831611

@@ -1809,7 +1837,8 @@ Limitations documented in code comments and mitigated:
18091837

18101838
*Generated: 2026-01-18*
18111839
*Updated: 2026-02-07*
1812-
*Architecture redesign: 2026-02-07 (REQ-STORE added, REQ-AGENT-008/009 rewritten, REQ-VIS-003 updated for vis-network, REQ-CHAT-002/003 updated for tool-based architecture)*
1840+
*Architecture redesign: 2026-02-07 (REQ-STORE added, REQ-AGENT-008/009 rewritten, REQ-CHAT-002/003 updated for tool-based architecture)*
1841+
*Architecture revision: 2026-02-08 (REQ-AGENT-009 revised for LLM-based KG Builder; REQ-VIS-003 updated for D3.js with Epstein-inspired patterns; vis-network deferred)*
18131842
*Status: Complete - Integration features added (REQ-RESEARCH, REQ-HYPO, REQ-GEO, REQ-TASK)*
18141843
*Frontend Status: Partial implementation by Yatharth (see DEVELOPMENT_DOCS/YATHARTH_WORK_SUMMARY.md)*
18151844
*Phase 4 Agent requirements tracked: 2026-02-03*
