Dual-layer graph store with SQLite persistence and NetworkX algorithms, D3.js force-directed visualization, automatic planner context injection (guardrail), LLM-driven entity management, and multi-profile management UI, all scoped per (profile_id, user_uuid) for multi-user isolation.
The Knowledge Graph component (TDA_KnowledgeGraph) captures database topology, business concepts, metrics, and domain taxonomies as a typed entity-relationship graph. Unlike static schema documentation, the knowledge graph is a living, queryable model that grows as users interact with the platform — through LLM inference, manual enrichment, bulk import, or (future) MCP schema auto-discovery.
The component's primary value is planner context injection: before every strategic planning call, the system extracts a relevant subgraph based on the user's query and injects it into the LLM prompt. This guides tool selection, SQL construction, and argument generation, acting as a semantic guardrail intended to reduce hallucination and improve first-attempt success rates.
The knowledge graph is scoped per (profile_id, user_uuid), meaning each profile builds its own domain model in isolation. This aligns with the platform's multi-profile architecture where different profiles connect to different MCP servers and data sources.
- Design Philosophy
- File Structure
- System Architecture
- Data Model
- Backend Architecture
- Component Handler
- Frontend Architecture
- REST API
- Integration Points
- MCP Schema Discovery (V2 Stub)
- Security & Isolation
- Performance Characteristics
- Roadmap
- File Reference
SQLite provides durable, transactional CRUD with multi-user isolation. NetworkX provides graph algorithms (BFS, shortest path, centrality, cycle detection) without reimplementing graph theory. The NetworkX layer is lazy-loaded from SQLite and cached — invalidated on any write operation.
The knowledge graph enriches the LLM's context but does not constrain it. At medium intensity, graph context is advisory; at heavy intensity, the LLM is instructed to strictly validate against known relationships. The LLM retains autonomy to deviate when the graph is incomplete.
Each (profile_id, user_uuid) pair owns an independent graph. A profile connected to a sales database builds a different knowledge graph than one connected to an HR system. Cross-profile enumeration is read-only for management UI purposes.
Graphs start empty and grow through three channels:
- LLM-inferred: The LLM calls add_entity/add_relationship during conversation
- Manual: Users import JSON files or use the REST API
- Auto-discovery (V2): MCP tool schemas parsed to extract topology
The component handler declares is_deterministic = True. When the LLM calls TDA_KnowledgeGraph, the response is computed directly from the graph store without an additional LLM call — saving tokens and latency.
components/builtin/knowledge_graph/
├── manifest.json # Component metadata, tool definition, render targets
├── handler.py # KnowledgeGraphHandler — process() + context enrichment
├── graph_store.py # GraphStore — dual-layer SQLite + NetworkX engine
├── renderer.js # D3.js v7 force-directed visualization (~1084 lines)
├── instructions.json # Intensity-keyed LLM guidance (none/medium/heavy)
├── discovery.py # MCPSchemaDiscovery — V2 stub for auto-population
└── __init__.py
schema/
└── 21_knowledge_graph.sql # SQLite table definitions + indexes
static/js/
├── api.js # Frontend API functions (loadKnowledgeGraphList, etc.)
└── handlers/
    └── knowledgeGraphPanelHandler.js # Management UI for Resource Panel + Intelligence page
templates/
└── index.html # Split panel DOM (#kg-split-panel), Intelligence tab
┌─────────────────────────────────────────────────────────────────────────────┐
│ KNOWLEDGE GRAPH DATA FLOW │
│ │
│ ┌─────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ User Query │ │ REST API │ │ JSON Import │ │
│ │ (session) │ │ (manual) │ │ (file) │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ GraphStore(profile_id, user_uuid) │ │
│ │ ┌───────────────────┐ ┌─────────────────────────┐ │ │
│ │ │ SQLite Layer │ │ NetworkX DiGraph │ │ │
│ │ │ ───────────── │ │ ────────────────── │ │ │
│ │ │ kg_entities │──▶ Lazy-loaded on first │ │ │
│ │ │ kg_relationships│ │ graph algorithm call │ │ │
│ │ │ │ │ │ │ │
│ │ │ CRUD, search, │ │ BFS, shortest path, │ │ │
│ │ │ persistence │ │ centrality, cycles │ │ │
│ │ └───────────────────┘ └─────────────────────────┘ │ │
│ └──────────────────────────┬──────────────────────────┘ │
│ │ │
│ ┌───────────────┼───────────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌───────────────┐ ┌──────────────┐ ┌───────────────┐ │
│ │ Context │ │ Tool Call │ │ Visualization│ │
│ │ Enrichment │ │ Response │ │ Spec │ │
│ │ (guardrail) │ │ (inline) │ │ (D3 graph) │ │
│ └───────┬───────┘ └──────┬───────┘ └───────┬───────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌───────────────┐ ┌──────────────┐ ┌───────────────┐ │
│ │ Planner │ │ Chat │ │ Split Panel │ │
│ │ Prompt │ │ Message │ │ D3 Force │ │
│ │ Injection │ │ (HTML) │ │ Graph │ │
│ └───────────────┘ └──────────────┘ └───────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────┐
│ GraphStore Instance │
│ (profile_id, user_uuid scoped) │
│ │
│ ┌────────────────┐ ┌────────────────────┐ │
│ │ SQLite Layer │ │ NetworkX Layer │ │
│ │ │ │ │ │
│ │ add_entity() │ │ extract_subgraph_ │ │
│ │ get_entity() │ │ adaptive() │ │
│ │ search() │ │ shortest_path() │ │
│ │ list() │ │ centrality() │ │
│ │ update() │ │ detect_cycles() │ │
│ │ delete() │ │ connected() │ │
│ │ import_bulk() │ │ get_stats() │ │
│ │ clear_graph() │ │ │ │
│ │ │ │ Lazy-loaded, │ │
│ │ ACID durable │ │ cached, │ │
│ │ Multi-user │ │ invalidated on │ │
│ │ │ │ any write │ │
│ └────────────────┘ └────────────────────┘ │
└──────────────────────────────────────────────┘
│
Write ops ────────────┘
(add/update/delete)
trigger _invalidate_cache()
→ self._graph = None
→ Next algorithm call
re-loads from SQLite
Table: kg_entities (SQLite, tda_auth.db)
| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Auto-incrementing entity ID |
| profile_id | TEXT NOT NULL | Profile scope |
| user_uuid | TEXT NOT NULL | User scope |
| name | TEXT NOT NULL | Entity display name |
| entity_type | TEXT NOT NULL | One of 8 enumerated types |
| properties_json | TEXT | JSON blob: description, data_type, business_meaning, etc. |
| source | TEXT NOT NULL | Origin: manual, mcp_discovery, llm_inferred |
| source_detail | TEXT | Optional metadata (MCP server ID, tool name) |
| created_at | TIMESTAMP | UTC creation time |
| updated_at | TIMESTAMP | UTC last modification |
Unique constraint: (profile_id, user_uuid, name, entity_type)
Indexes:
- idx_kg_entities_profile_user: Fast profile/user filtering
- idx_kg_entities_type: Type-filtered queries
- idx_kg_entities_name: Case-insensitive name lookup (COLLATE NOCASE)
Table: kg_relationships (SQLite, tda_auth.db)
| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Auto-incrementing relationship ID |
| profile_id | TEXT NOT NULL | Profile scope |
| user_uuid | TEXT NOT NULL | User scope |
| source_entity_id | INTEGER FK | Source node (CASCADE delete) |
| target_entity_id | INTEGER FK | Target node (CASCADE delete) |
| relationship_type | TEXT NOT NULL | One of 8 enumerated types |
| cardinality | TEXT | Optional, e.g. 1:1, 1:N, N:1, N:M |
| metadata_json | TEXT | JSON blob: description, confidence, evidence |
| source | TEXT NOT NULL | Origin tracking |
| created_at | TIMESTAMP | UTC creation time |
Unique constraint: (profile_id, user_uuid, source_entity_id, target_entity_id, relationship_type)
Indexes:
- idx_kg_relationships_profile_user: Fast profile/user filtering
- idx_kg_relationships_source: Outgoing edge queries
- idx_kg_relationships_target: Incoming edge queries
- idx_kg_relationships_type: Type-filtered queries
Entity Types (8):
| Type | Semantic | Example |
|---|---|---|
| database | Top-level data source | SalesDB, HRSystem |
| table | Relational table/object | orders, employees |
| column | Table column/field | order_date, salary |
| foreign_key | FK constraint entity | orders.customer_id → customers.id |
| business_concept | Domain concept | Customer Lifetime Value, Churn Rate |
| taxonomy | Classification hierarchy | Product Category, Region |
| metric | Measurable KPI | Revenue, NPS Score |
| domain | Business domain | Finance, Marketing |
Relationship Types (8):
| Type | Semantic | Example |
|---|---|---|
| contains | Parent → child containment | database → table, table → column |
| foreign_key | FK reference | orders.customer_id → customers.id |
| is_a | Inheritance/taxonomy | Premium Customer is_a Customer |
| has_property | Attribute association | Order has_property status |
| measures | Metric measurement | Revenue measures Sales Performance |
| derives_from | Calculation derivation | Profit derives_from Revenue - Costs |
| depends_on | Dependency | Report depends_on Daily ETL |
| relates_to | Generic association | Marketing relates_to Sales |
Source Origins (5):
| Source | Meaning | Set By |
|---|---|---|
| manual | User-created via REST API or import | REST endpoints, import |
| mcp_discovery | Auto-discovered from MCP schemas | Discovery module (V2) |
| llm_inferred | Created by LLM during conversation | TDA_KnowledgeGraph tool |
| constructor_structural | Inferred during KG generation (Phase 3 gap-fill) | kg_generate() pipeline |
| constructor_fk_inferred | FK relationships inferred from naming conventions, shared columns, or column properties (Phase 3.5) | kg_generate() pipeline |
Location: components/builtin/knowledge_graph/graph_store.py
The GraphStore class is the single access point for all graph operations. It is instantiated per request with a specific (profile_id, user_uuid) scope:
store = GraphStore(profile_id="profile-123", user_uuid="user-456")

The class manages two synchronized layers:
- SQLite: All CRUD operations go directly to the database
- NetworkX: Lazy-loaded DiGraph cached in self._graph, invalidated by _invalidate_cache() on every write
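The lazy-load and invalidate cycle of the two layers can be sketched as follows. This is a minimal illustration, not the project's GraphStore: the class name TinyGraphStore, the single edges table, and the loads counter are hypothetical, and a plain dict of sets stands in for networkx.DiGraph so the example needs only the standard library.

```python
import sqlite3
from collections import defaultdict


class TinyGraphStore:
    """Illustrative stand-in for GraphStore; not the project's actual class."""

    def __init__(self, conn, profile_id, user_uuid):
        self.conn = conn
        self.scope = (profile_id, user_uuid)
        self._graph = None  # in-memory layer, built lazily
        self.loads = 0      # counts rebuilds, for demonstration only

    def _invalidate_cache(self):
        self._graph = None

    def add_edge(self, source_id, target_id):
        # Writes go straight to SQLite, then drop the cached graph.
        self.conn.execute(
            "INSERT INTO edges (profile_id, user_uuid, src, dst) VALUES (?, ?, ?, ?)",
            (*self.scope, source_id, target_id),
        )
        self.conn.commit()
        self._invalidate_cache()

    def _load_graph(self):
        # Rebuild from SQLite only when the cache is empty.
        if self._graph is None:
            graph = defaultdict(set)
            rows = self.conn.execute(
                "SELECT src, dst FROM edges WHERE profile_id = ? AND user_uuid = ?",
                self.scope,
            )
            for src, dst in rows:
                graph[src].add(dst)
            self._graph = dict(graph)
            self.loads += 1
        return self._graph

    def successors(self, node_id):
        return sorted(self._load_graph().get(node_id, ()))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE edges (profile_id TEXT, user_uuid TEXT, src INTEGER, dst INTEGER)"
)
store = TinyGraphStore(conn, "profile-123", "user-456")
store.add_edge(1, 2)
store.add_edge(1, 3)
first = store.successors(1)   # triggers the first load
second = store.successors(1)  # served from the cache
store.add_edge(1, 4)          # invalidates the cache
third = store.successors(1)   # triggers a reload
```

Two reads in a row cost one load, while a write between reads forces a second load, which is why batching mutations before running algorithms keeps reload costs down.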
| Method | Signature | Behavior |
|---|---|---|
| add_entity() | (name, entity_type, properties?, source?, source_detail?) → int | Upsert via INSERT ON CONFLICT UPDATE. Returns entity ID. |
| get_entity() | (entity_id) → dict? | Lookup by ID within profile scope. |
| get_entity_by_name() | (name, entity_type?) → dict? | Case-insensitive name lookup. Optional type filter. |
| search_entities() | (query_text, limit?, entity_type?) → list[dict] | Substring match on name and properties_json. |
| list_entities() | (entity_type?, limit?) → list[dict] | List all entities, optionally filtered by type. Default limit: 200. |
| update_entity() | (entity_id, properties) → bool | Merge new properties into existing properties_json. |
| delete_entity() | (entity_id) → bool | Delete entity and all relationships (CASCADE). |
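The upsert behavior of add_entity() can be illustrated with SQLite's ON CONFLICT clause. The sketch below is an assumption-laden illustration rather than the project's code: the trimmed table definition, the standalone add_entity function, and the choice of which columns are overwritten on conflict are all hypothetical. It requires SQLite 3.24 or newer for upsert support.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed-down table; the real schema has more columns and indexes.
conn.execute(
    """
    CREATE TABLE kg_entities (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        profile_id TEXT NOT NULL,
        user_uuid TEXT NOT NULL,
        name TEXT NOT NULL,
        entity_type TEXT NOT NULL,
        properties_json TEXT,
        source TEXT NOT NULL DEFAULT 'manual',
        UNIQUE (profile_id, user_uuid, name, entity_type)
    )
    """
)


def add_entity(conn, profile_id, user_uuid, name, entity_type,
               properties=None, source="manual"):
    """Insert or update an entity and return its ID (illustrative only)."""
    conn.execute(
        """
        INSERT INTO kg_entities
            (profile_id, user_uuid, name, entity_type, properties_json, source)
        VALUES (?, ?, ?, ?, ?, ?)
        ON CONFLICT (profile_id, user_uuid, name, entity_type)
        DO UPDATE SET properties_json = excluded.properties_json,
                      source = excluded.source
        """,
        (profile_id, user_uuid, name, entity_type,
         json.dumps(properties or {}), source),
    )
    conn.commit()
    # Look the row up again so the same ID is returned for inserts and updates.
    row = conn.execute(
        "SELECT id FROM kg_entities "
        "WHERE profile_id = ? AND user_uuid = ? AND name = ? AND entity_type = ?",
        (profile_id, user_uuid, name, entity_type),
    ).fetchone()
    return row[0]


first_id = add_entity(conn, "profile-123", "user-456", "orders", "table",
                      {"description": "v1"})
second_id = add_entity(conn, "profile-123", "user-456", "orders", "table",
                       {"description": "v2"})
count = conn.execute("SELECT COUNT(*) FROM kg_entities").fetchone()[0]
stored = json.loads(
    conn.execute("SELECT properties_json FROM kg_entities").fetchone()[0]
)
```

All values are bound as parameters, and every statement carries the (profile_id, user_uuid) scope, so a second call with the same name and type updates the existing row instead of creating a duplicate.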
| Method | Signature | Behavior |
|---|---|---|
| add_relationship() | (source_entity_id, target_entity_id, relationship_type, cardinality?, metadata?, source?) → int | Upsert via INSERT ON CONFLICT UPDATE. Returns relationship ID. |
| get_relationships() | (entity_id, direction?) → list[dict] | Get edges from/to entity. Direction: outgoing, incoming, or both. Deduplicates for self-referential edges. |
| list_relationships() | (entity_id?, relationship_type?, limit?) → list[dict] | List with optional filters. Default limit: 200. |
| delete_relationship() | (relationship_id) → bool | Delete single edge. |
| Method | Signature | Behavior |
|---|---|---|
| import_bulk() | (entities, relationships) → {entities_added, relationships_added} | Batch import. Entities resolved by name for relationship linking. Logs warnings on failures. |
| clear_graph() | () → {entities_deleted, relationships_deleted} | Delete all data for the profile/user scope. |
Import format:
{
  "entities": [
    {"name": "orders", "entity_type": "table", "properties": {"description": "Customer orders"}}
  ],
  "relationships": [
    {"source_name": "orders", "source_type": "table",
     "target_name": "customers", "target_type": "table",
     "relationship_type": "foreign_key", "cardinality": "N:1"}
  ]
}

All graph algorithm methods lazy-load the full graph from SQLite into a networkx.DiGraph on first call. The graph is cached in self._graph and invalidated (set to None) by any write operation.
| Method | Signature | Algorithm | Use Case |
|---|---|---|---|
| extract_subgraph_adaptive() | (seed_entity_ids, query_entity_ids?, max_nodes?) → {entities, relationships} | Three-phase entity-type-aware extraction (see below) | Context enrichment, visualization |
| extract_subgraph() | (entity_ids, depth?, max_nodes?) → {entities, relationships} | Legacy BFS traversal (both directions) from seed entities | Deprecated; use extract_subgraph_adaptive() |
| get_full_graph() | (max_nodes?) → {entities, relationships} | Full graph export (capped) | Full visualization |
| find_shortest_path() | (source_id, target_id) → list[int]? | Undirected shortest path | "How are these tables related?" |
| get_connected_entities() | (entity_id) → {descendants, ancestors} | Directed reachability | Impact analysis |
| get_entity_importance() | () → {entity_id: float} | Degree centrality | Node sizing in visualization |
| detect_cycles() | () → list[list[int]] | Simple cycle detection (capped at 20) | Taxonomy validation |
| get_stats() | () → dict | Entity/relationship type breakdowns, connected components, density, cycle detection | Dashboard, card metadata |
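To illustrate what an undirected shortest path over directed edges means in practice, here is a small standard-library sketch. It is not the GraphStore implementation (which uses NetworkX); the function name shortest_path_undirected and the edge-list input are hypothetical.

```python
from collections import deque


def shortest_path_undirected(edges, source_id, target_id):
    """Return the shortest path as a list of node IDs, or None if unreachable."""
    # Treat every directed edge as traversable in both directions.
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, set()).add(dst)
        adjacency.setdefault(dst, set()).add(src)
    if source_id not in adjacency or target_id not in adjacency:
        return None

    previous = {source_id: None}
    queue = deque([source_id])
    while queue:
        node = queue.popleft()
        if node == target_id:
            # Walk the predecessor chain back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(adjacency[node]):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# orders(1) -> customers(2) <- invoices(3) -> payments(4); 5 -> 6 is disconnected.
edges = [(1, 2), (3, 2), (3, 4), (5, 6)]
path = shortest_path_undirected(edges, 1, 4)
missing = shortest_path_undirected(edges, 1, 6)
```

Ignoring edge direction matters for the "how are these tables related?" use case: node 1 reaches node 4 only by walking the edge 3 → 2 backwards.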
The extract_subgraph_adaptive() method replaces the fixed-depth BFS with a scalable, entity-type-aware algorithm designed for schemas with thousands of entities. It uses three phases:
Phase 1: Structural Discovery (unbounded)
├── Phase 1a: FK-Chain Traversal
│ BFS from seeds through table/foreign_key nodes ONLY.
│ NO depth limit → handles 3, 4, 5, N-way JOIN chains.
│ Skips database hub (would explode), columns (detail), semantic nodes.
│
├── Phase 1b: Joinable Table Discovery
│ For each discovered table → collect its column names.
│ Find OTHER tables sharing those column names → join candidates.
│ Iterate up to 3 rounds for transitive joins:
│ Round 1: orders(customer_id) → discovers customers(customer_id)
│ Round 2: customers(region_id) → discovers regions(region_id)
│ Round 3: regions(country_id) → discovers countries(country_id)
│
└── Phase 1c: Database Context
For each discovered table → add its parent database entity.
Context only — don't expand from database nodes.
Phase 2: Column Expansion (budget-aware)
column_budget = max_nodes - structural_count
Sort tables: query-matched first, then by distance from seeds.
Add columns per-table until budget exhausted.
Phase 3: Semantic Enrichment (capped at 50)
Add business_concept, metric, taxonomy, domain nodes
connected to discovered structural entities.
Key design choices:
- Entity-type awareness: Only table and foreign_key nodes are expanded during structural discovery. Database hub nodes (which connect to all tables) are explicitly excluded to prevent explosion.
- No fixed depth limit: FK chains are traversed without bound, ensuring 3, 4, 5, or N-way JOINs are always discovered.
- Iterative joinable-table discovery: Even without explicit FK edges, tables are connected through shared column names (e.g., customer_id in both orders and customers). Up to 3 iterative rounds discover transitive join paths.
- Budget-aware column expansion: Columns are added only after structural discovery, prioritizing query-matched tables and limiting to the remaining node budget.
- Default max_nodes=500 (up from 50): Supports schemas with hundreds of tables while keeping output manageable.
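The first two design choices, type-restricted traversal and budget-aware column expansion, can be sketched over plain dictionaries. This is a simplified illustration of Phase 1a and Phase 2 only, under stated assumptions: the function name extract_sketch, the entity_types and neighbors inputs, and the distance-only table ordering are hypothetical, and Phases 1b, 1c, and 3 are omitted.

```python
from collections import deque


def extract_sketch(seed_ids, entity_types, neighbors, max_nodes=500):
    """Return (structural_ids, column_ids) for a type-aware extraction."""
    expandable = {"table", "foreign_key"}
    distance = {}
    queue = deque()
    for seed in seed_ids:
        if entity_types.get(seed) in expandable and seed not in distance:
            distance[seed] = 0
            queue.append(seed)

    # Phase 1a: no depth limit, but only table/foreign_key nodes are followed,
    # so database hubs and columns never widen the frontier.
    while queue:
        node = queue.popleft()
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in distance and entity_types.get(nxt) in expandable:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    structural = sorted(distance, key=lambda n: (distance[n], n))

    # Phase 2: spend whatever node budget is left on columns, nearest tables first.
    budget = max(max_nodes - len(structural), 0)
    columns = []
    for table in structural:
        if entity_types[table] != "table":
            continue
        for nxt in sorted(neighbors.get(table, ())):
            if budget == 0:
                break
            if entity_types.get(nxt) == "column" and nxt not in columns:
                columns.append(nxt)
                budget -= 1
    return structural, columns


entity_types = {
    1: "database", 2: "table", 3: "table", 4: "table", 5: "table",
    10: "column", 11: "column", 12: "column",
}
# Symmetric adjacency: the database (1) touches every table; 2-3-4 form a join chain.
neighbors = {
    1: {2, 3, 4, 5}, 2: {1, 3, 10, 11}, 3: {1, 2, 4, 12},
    4: {1, 3}, 5: {1}, 10: {2}, 11: {2}, 12: {3},
}
structural, columns = extract_sketch([2], entity_types, neighbors, max_nodes=5)
```

In this example, table 5 is connected to the seed only through the database hub, so it is never reached, and the budget of five nodes leaves room for just two columns of the seed table.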
Scalability:
| Schema size | Typical result |
|---|---|
| 10 tables | All tables + all columns (~100 nodes) |
| 100 tables | ~30-50 relevant tables + columns for top priority (~300 nodes) |
| 1000 tables | ~30-100 relevant tables + columns for top ~50 (~500 nodes) |
All phases are O(V + E) combined.
Static method (no instance needed):
GraphStore.list_all_graphs(user_uuid) → list[dict]

Returns one entry per profile_id that has at least one entity, with summary statistics (counts, type breakdowns, timestamps). Used by the management UI to show all knowledge graphs across profiles.
Location: components/builtin/knowledge_graph/handler.py
The KnowledgeGraphHandler serves dual purposes:
- Tool handler: Processes TDA_KnowledgeGraph tool calls from the LLM
- Context enrichment: Provides get_context_enrichment() for planner guardrail injection
The TDA_KnowledgeGraph tool accepts an action parameter routing to five handlers:
| Action | Arguments | Returns | Purpose |
|---|---|---|---|
| query | query_text?, entity_type? | HTML results or stats | Search entities, show graph statistics |
| visualize | entity_name?, depth?, title? | D3 force-graph spec | Interactive subgraph visualization |
| add_entity | entity_name, entity_type, properties? | Confirmation HTML | Add entity (source: llm_inferred) |
| add_relationship | entity_name, target_entity, relationship_type, entity_type?, target_entity_type? | Confirmation HTML | Create edge between entities |
| get_context | query_text? | Pre-formatted enrichment text | Debug/test context enrichment output |
All actions return ComponentRenderPayload with either html (inline display) or spec (D3 visualization data).
Context enrichment is the component's core intelligence feature. It is called automatically before every strategic planning call:

async def get_context_enrichment(query, profile_id, user_uuid) -> str

Pipeline:
User Query: "Show me orders with high revenue customers"
│
▼
┌───────────────────────────────┐
│ 1. Tokenize query │
│ → ["orders", "high", │
│ "revenue", "customers"]│
└───────────────┬───────────────┘
│
▼
┌───────────────────────────────┐
│ 2. Search entities │
│ Full query + individual │
│ keywords (skip stop words)│
│ → match: orders (table), │
│ revenue (metric), │
│ customers (table) │
└───────────────┬───────────────┘
│
▼
┌───────────────────────────────┐
│ 3. Adaptive extraction │
│ max_nodes=500 │
│ → FK chains (unbounded), │
│ joinable tables, │
│ budget-aware columns │
└───────────────┬───────────────┘
│
▼
┌───────────────────────────────┐
│ 4. Format for prompt │
│ Structured text with │
│ entity types, properties, │
│ and relationships │
└───────────────┬───────────────┘
│
▼
Injected into planner's
knowledge_context_str
The search combines two strategies for maximum recall:
- Direct substring search: The full query string is matched against entity names and properties_json
- Tokenized keyword search: Individual words (3+ chars, stop words excluded) are each searched independently
Results are deduplicated by entity ID and capped at 15 matches.
Stop words excluded: the, show, get, list, find, all, from, with, for, and, that, this, what, how, many, much, are, was, were, has, have, been, can, will, please, help, give, tell, about, into, each
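The two search strategies and the deduplication step can be sketched as below. The stop-word list is taken from the text above; everything else (the function names extract_keywords and search_with_recall, the regular expression used for tokenizing, and the fake_search callback) is a hypothetical illustration rather than the handler's actual code.

```python
import re

STOP_WORDS = {
    "the", "show", "get", "list", "find", "all", "from", "with", "for", "and",
    "that", "this", "what", "how", "many", "much", "are", "was", "were", "has",
    "have", "been", "can", "will", "please", "help", "give", "tell", "about",
    "into", "each",
}


def extract_keywords(query, min_length=3):
    """Lowercase, split into words, drop short words, stop words, and repeats."""
    tokens = re.findall(r"[a-z0-9_]+", query.lower())
    seen = set()
    keywords = []
    for token in tokens:
        if len(token) >= min_length and token not in STOP_WORDS and token not in seen:
            seen.add(token)
            keywords.append(token)
    return keywords


def search_with_recall(query, search_fn, max_matches=15):
    """Search the full query first, then each keyword; deduplicate by entity ID."""
    matches = {}
    for text in [query, *extract_keywords(query)]:
        for entity in search_fn(text):
            matches.setdefault(entity["id"], entity)
            if len(matches) >= max_matches:
                return list(matches.values())
    return list(matches.values())


ENTITIES = [
    {"id": 1, "name": "orders"},
    {"id": 2, "name": "customers"},
    {"id": 3, "name": "revenue"},
]


def fake_search(text):
    # Stand-in for a substring search against entity names.
    return [e for e in ENTITIES if text.lower() in e["name"]]


query = "Show me orders with high revenue customers"
keywords = extract_keywords(query)
found_ids = [e["id"] for e in search_with_recall(query, fake_search)]
```

The full query matches nothing here, which is the case the keyword pass exists for: each surviving keyword is searched on its own and the results are merged.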
The enrichment text follows a structured format injected into the planner's knowledge context:
--- KNOWLEDGE GRAPH CONTEXT ---
The following known entities and relationships may inform your planning:
TABLE ENTITIES:
- orders (Customer purchase records; type: table)
- customers (Customer master data; type: table)
COLUMN ENTITIES:
- customer_id (FK to customers; type: integer)
- order_date (Purchase timestamp; type: date)
- amount (Order total; business: Revenue metric source)
METRIC ENTITIES:
- Revenue (Total order amounts; business: Primary KPI)
KNOWN RELATIONSHIPS:
- SalesDB --[contains]--> orders
- orders --[contains]--> customer_id
- orders --[foreign_key [N:1]]--> customers — Via customer_id
- Revenue --[measures]--> orders.amount
--- END KNOWLEDGE GRAPH CONTEXT ---
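A simplified formatter for this layout might look like the sketch below. It is an illustration only: the function name format_context and the input dictionaries are hypothetical, and it renders just the description and cardinality, not the full set of properties shown in the example above.

```python
from collections import defaultdict


def format_context(entities, relationships):
    """Render entities grouped by type, followed by relationships."""
    lines = [
        "--- KNOWLEDGE GRAPH CONTEXT ---",
        "The following known entities and relationships may inform your planning:",
    ]
    by_type = defaultdict(list)
    for entity in entities:
        by_type[entity["entity_type"]].append(entity)
    for entity_type, group in by_type.items():
        lines.append(f"{entity_type.upper()} ENTITIES:")
        for entity in group:
            description = (entity.get("properties") or {}).get("description")
            suffix = f" ({description})" if description else ""
            lines.append(f"- {entity['name']}{suffix}")
    if relationships:
        lines.append("KNOWN RELATIONSHIPS:")
        for rel in relationships:
            label = rel["relationship_type"]
            if rel.get("cardinality"):
                label += f" [{rel['cardinality']}]"
            lines.append(f"- {rel['source_name']} --[{label}]--> {rel['target_name']}")
    lines.append("--- END KNOWLEDGE GRAPH CONTEXT ---")
    return "\n".join(lines)


entities = [
    {"name": "orders", "entity_type": "table",
     "properties": {"description": "Customer purchase records"}},
    {"name": "customers", "entity_type": "table", "properties": {}},
    {"name": "Revenue", "entity_type": "metric",
     "properties": {"description": "Total order amounts"}},
]
relationships = [
    {"source_name": "orders", "target_name": "customers",
     "relationship_type": "foreign_key", "cardinality": "N:1"},
    {"source_name": "Revenue", "target_name": "orders",
     "relationship_type": "measures"},
]
context_lines = format_context(entities, relationships).splitlines()
```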
Renderer: components/builtin/knowledge_graph/renderer.js (D3 v7 via CDN)
The visualization follows the Canvas component's progressive display pattern:
| Mode | Entry Point | Size | Purpose |
|---|---|---|---|
| A. Inline Compact | renderKGInlineCompact() | 200px height in chat | Mini preview with simplified force layout |
| B. Split Panel | openKGSplitPanel() | 50% viewport, side-by-side with chat | Full interactive graph with toolbar |
| C. Fullscreen | toggleKGFullscreen() | 100% viewport | Maximum graph exploration space |
Transition flow:
Chat message → [Inline Compact]
│
├── "Open in Graph →" button
▼
[Split Panel]
│
├── Fullscreen button (⊞)
▼
[Fullscreen]
│
├── Exit fullscreen
▼
[Split Panel]
│
├── Close button (×)
▼
[Panel hidden]
Mutual exclusion: Opening the KG split panel automatically closes the Canvas split panel (and vice versa), dispatching kg-requesting-split / canvas-requesting-split events for cleanup.
| Feature | Description | Implementation |
|---|---|---|
| Force-directed layout | Spring-charge physics for organic node positioning | d3.forceSimulation with link, charge, center, collision forces |
| Zoom & pan | Mouse wheel zoom (0.1x–4x), drag to pan | d3.zoom() on SVG container |
| Node drag | Reposition individual nodes | d3.drag() with simulation restart |
| Hover glow | Entity-type-colored glow filter on hover | SVG feGaussianBlur + feFlood filter per type |
| Click focus | Click node to highlight connected subgraph, dim unrelated | Connected-set computation, opacity animation |
| Edge gradients | Edges colored as gradient from source to target type | SVG linearGradient per link |
| Directional arrows | Arrowhead markers on edges | SVG marker with refX offset |
| Search | Toolbar input filters nodes by name (live) | Opacity toggle on input event |
| Entity type filters | Colored pills toggle entity type visibility | display:none on filtered nodes + connected edges |
| Zoom to fit | Button resets viewport | d3.zoom.transform transition |
| Export PNG | Button exports 2x resolution PNG | SVG → Image → Canvas → Blob download |
| Tooltips | Hover shows entity properties, type, business meaning | HTML overlay positioned at cursor |
| Legend | Color-coded entity type legend overlay | Bottom-right glass panel |
| Animated entry | Nodes fade in sequentially, edges draw progressively | CSS transitions + stroke-dashoffset animation |
| Center entity pulse | Pulsing ring animation on the queried entity | CSS @keyframes kg-pulse |
| Node importance sizing | Higher-centrality nodes rendered larger | Degree centrality → circle radius |
The renderer uses CSS custom properties for theme-dependent colors (text, labels, links), ensuring compatibility across all three themes (legacy dark, modern, light):
/* SVG elements use CSS fill/stroke to evaluate variables live */
.kg-node-name { fill: var(--text-primary, #e5e7eb); }
.kg-edge-label { fill: var(--text-muted, #6b7280); opacity: 0.5; }
.kg-mini-label { fill: var(--text-muted, #9ca3af); }
.kg-mini-link { stroke: var(--border-primary, #4b5563); }
/* Light theme adjustments */
[data-theme="light"] .kg-edge-label { opacity: 0.7; }
[data-theme="light"] .kg-mini-link { stroke-opacity: 0.6; }

Entity type colors are provided by the backend handler (ENTITY_TYPE_COLORS) and passed through the D3 spec. These are constant across themes:
| Entity Type | Color | Hex |
|---|---|---|
| database | Blue | #3b82f6 |
| table | Green | #22c55e |
| column | Lime | #a3e635 |
| foreign_key | Amber | #f59e0b |
| business_concept | Violet | #8b5cf6 |
| taxonomy | Pink | #ec4899 |
| metric | Cyan | #06b6d4 |
| domain | Orange | #f97316 |
Knowledge graphs are managed in two UI locations:
1. Resource Panel (Sidebar)
Location: knowledgeGraphPanelHandler.js → loadKnowledgeGraphsPanel()
- Compact cards using CSS variables (--card-bg, --border-primary)
- Active indicator for the profile whose KG matches the current session's profile
- Export button (downloads .json file)
- Delete button (styled confirmation modal)
- IFOC badge showing profile type with color coding
2. Intelligence Performance Page (Third Tab)
Location: knowledgeGraphPanelHandler.js → loadKnowledgeGraphsIntelligenceTab()
- Full glass-panel cards with card-btn action system
- Entity/relationship count display
- Entity type pills showing type breakdown
- Import button with file picker (.json)
- Export and Delete buttons
- Promote button (placeholder for future cross-profile migration)
- Tab counter showing number of knowledge graphs
- iPadOS file picker workaround (removes accept attribute)
Cross-panel refresh: Deleting or importing a KG from either location refreshes both panels to maintain consistency.
All endpoints are JWT-authenticated and scoped to the current user. Where an endpoint takes a profile_id, it is passed as a query parameter on GET and DELETE requests and in the JSON body on POST and PUT requests.
| Endpoint | Method | Parameters | Response |
|---|---|---|---|
| /v1/knowledge-graph/entities | POST | Body: {profile_id, name, entity_type, properties?, source?} | {status, entity: {id, name, ...}} |
| /v1/knowledge-graph/entities | GET | Query: profile_id, entity_type?, search? | {status, entities: [...]} |
| /v1/knowledge-graph/entities/<id> | PUT | Body: {profile_id, properties} | {status, updated: bool} |
| /v1/knowledge-graph/entities/<id> | DELETE | Query: profile_id | {status, deleted: bool} |
| Endpoint | Method | Parameters | Response |
|---|---|---|---|
| /v1/knowledge-graph/relationships | POST | Body: {profile_id, source_entity_id, target_entity_id, relationship_type, cardinality?, metadata?} | {status, relationship: {id, ...}} |
| /v1/knowledge-graph/relationships | GET | Query: profile_id, entity_id?, relationship_type? | {status, relationships: [...]} |
| /v1/knowledge-graph/relationships/<id> | DELETE | Query: profile_id | {status, deleted: bool} |
| Endpoint | Method | Parameters | Response |
|---|---|---|---|
| /v1/knowledge-graph/subgraph | GET | Query: profile_id, entity_name, max_nodes? | {status, nodes, edges, node_count} |
| /v1/knowledge-graph/stats | GET | Query: profile_id | {status, stats: {total_entities, total_relationships, entity_types, ...}} |
| /v1/knowledge-graph/import | POST | Body: {profile_id, entities: [...], relationships: [...]} | {status, entities_added, relationships_added} |
| /v1/knowledge-graph/clear | DELETE | Query: profile_id | {status, entities_deleted, relationships_deleted} |
| /v1/knowledge-graph/context | GET | Query: profile_id, query | {status, context_text, entity_count} |
| /v1/knowledge-graph/generate | POST | Body: {profile_id, llm_config_id?} | {status, structural, semantic, phase3_relationships, phase3_5_fk_relationships, phase4_relationships, total} |
| /v1/knowledge-graph/discover | POST | Body: {profile_id, tools: [...]} | {status, entities_discovered, relationships_discovered} (V2 stub) |
| Endpoint | Method | Parameters | Response |
|---|---|---|---|
| /v1/knowledge-graph/list | GET | None (user-scoped) | {status, knowledge_graphs: [{profile_id, total_entities, entity_types, ...}]} |
| /v1/knowledge-graph/export | GET | Query: profile_id | JSON file download (Content-Disposition: attachment) |
Export file format:
{
  "knowledge_graph_export": true,
  "version": "1.0",
  "profile_id": "profile-123",
  "profile_tag": "@OPTIM",
  "exported_at": "2026-02-27T10:30:00+00:00",
  "stats": {
    "total_entities": 42,
    "total_relationships": 38
  },
  "entities": [...],
  "relationships": [...]
}

The knowledge graph injects context into the strategic planner via the component system's get_context_enrichment() pipeline:
planner.py:2609
│
▼
get_component_context_enrichment(query, profile_id, user_uuid)
│ (manager.py:93)
▼
ComponentManager.get_context_enrichment()
│ (manager.py:658)
│ Iterates active components with supports_context_enrichment=true
▼
KnowledgeGraphHandler.get_context_enrichment()
│ (handler.py:116)
│ 1. Search entities for query keywords
│ 2. Adaptive subgraph extraction (max_nodes=500)
│ → FK chains, joinable tables, budget-aware columns
│ 3. Format as structured text
▼
knowledge_context_str += enrichment
│ (planner.py:2618)
▼
Injected into WORKFLOW_META_PLANNING_PROMPT
The enrichment text is appended to the planner's knowledge_context_str, which already includes knowledge repository retrieval results. This means the LLM receives both RAG case examples and knowledge graph topology in a single context section.
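The aggregation step, in which the component manager asks each enrichment-capable component for text and the planner appends the result, can be sketched as follows. StubHandler and collect_enrichment are hypothetical names, and skipping a component that raises is an assumed policy, not something the source confirms.

```python
import asyncio


class StubHandler:
    """Hypothetical handler; only the attributes used below are modeled."""

    def __init__(self, name, supports_context_enrichment, text=""):
        self.name = name
        self.supports_context_enrichment = supports_context_enrichment
        self._text = text

    async def get_context_enrichment(self, query, profile_id, user_uuid):
        return self._text


async def collect_enrichment(handlers, query, profile_id, user_uuid):
    """Gather enrichment text from every component that opts in."""
    parts = []
    for handler in handlers:
        if not handler.supports_context_enrichment:
            continue
        try:
            text = await handler.get_context_enrichment(query, profile_id, user_uuid)
        except Exception:
            # Assumed policy: a failing component should not block planning.
            continue
        if text:
            parts.append(text)
    return "\n\n".join(parts)


handlers = [
    StubHandler("knowledge_graph", True, "--- KNOWLEDGE GRAPH CONTEXT ---"),
    StubHandler("canvas", False, "ignored"),
    StubHandler("empty_component", True, ""),
]
knowledge_context_str = "RAG case examples"
enrichment = asyncio.run(
    collect_enrichment(handlers, "orders by customer", "profile-123", "user-456")
)
if enrichment:
    knowledge_context_str += "\n\n" + enrichment
```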
The knowledge graph plugs into the standard component lifecycle:
| Integration Point | Mechanism | Notes |
|---|---|---|
| Tool definition | manifest.json → tool_definition | Injected into LLM context via get_tool_definitions() |
| Instructions | instructions.json → intensity-keyed text | Injected via get_instructions_text() |
| Fast-path execution | is_deterministic = True | Bypasses tactical LLM in phase_executor.py |
| LangChain tool | Auto-generated StructuredTool | For llm_only, rag_focused, genie profiles |
| Context enrichment | supports_context_enrichment: true | Called before strategic planning |
| Renderer discovery | frontend.renderer_file + CDN deps | D3 loaded on first use |
The knowledge graph is enabled per-profile via componentConfig:
{
  "componentConfig": {
    "knowledge_graph": {
      "enabled": true,
      "intensity": "medium"
    }
  }
}

Intensity levels:
| Level | LLM Behavior |
|---|---|
| none | KG disabled: no tool, no instructions, no enrichment |
| medium | Use graph context as guidance when available. Add entities/relationships when user explicitly requests. |
| heavy | Strictly validate against graph. Proactively suggest adding discovered entities. Always offer to visualize relationships. |
Default: Enabled for all profile types (tool_enabled, llm_only, rag_focused, genie) at medium intensity.
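One way to read the configuration and intensity table together is sketched below. The function resolve_kg_settings and the returned flag names are hypothetical; only the componentConfig shape, the three intensity levels, and the medium default come from the text above.

```python
DEFAULT_KG_CONFIG = {"enabled": True, "intensity": "medium"}
VALID_INTENSITIES = ("none", "medium", "heavy")


def resolve_kg_settings(profile):
    """Merge profile overrides onto the defaults and derive behavior flags."""
    overrides = profile.get("componentConfig", {}).get("knowledge_graph", {})
    config = {**DEFAULT_KG_CONFIG, **overrides}
    intensity = config["intensity"]
    if intensity not in VALID_INTENSITIES:
        raise ValueError(f"Unknown intensity: {intensity!r}")
    # "none" switches the component off entirely, same as enabled=false.
    active = bool(config["enabled"]) and intensity != "none"
    return {
        "tool_exposed": active,
        "context_enrichment": active,
        "strict_validation": active and intensity == "heavy",
    }


default_settings = resolve_kg_settings({})
heavy_settings = resolve_kg_settings(
    {"componentConfig": {"knowledge_graph": {"enabled": True, "intensity": "heavy"}}}
)
off_settings = resolve_kg_settings(
    {"componentConfig": {"knowledge_graph": {"intensity": "none"}}}
)
```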
Split panel DOM is defined in templates/index.html:
<aside id="kg-split-panel" style="display:none;">
<div class="kg-split-header">
<span id="kg-split-title" class="kg-split-title-text">Knowledge Graph</span>
<div class="kg-split-header-actions">
<button id="kg-split-fullscreen" class="kg-split-action-btn">...</button>
<button id="kg-split-close" class="kg-split-action-btn">...</button>
</div>
</div>
<div id="kg-split-content" class="kg-split-body"></div>
</aside>

The split panel persists across messages within a session: when a new KG visualization is rendered inline, clicking "Open in Graph" updates the existing panel rather than creating a new one. The panel state (open/closed, current spec) is stored in module-level variables in renderer.js.
Location: components/builtin/knowledge_graph/discovery.py
The MCPSchemaDiscovery class provides a placeholder for automatic knowledge graph population from MCP server tool schemas. Currently returns {entities_discovered: 0, relationships_discovered: 0}.
V2 Implementation Plan:
- Parse tool inputSchema definitions from MCP servers
- Extract database_name, object_name, column_name parameters → create entities
- Infer table → column containment relationships from tool groupings
- Detect foreign key hints from parameter descriptions
- Extract argument enum values → create taxonomy nodes
- Parse resource URIs for additional schema information
Trigger: POST /v1/knowledge-graph/discover with tool definitions in the body.
| Concern | Mechanism |
|---|---|
| Multi-user isolation | All queries scoped by (profile_id, user_uuid) composite key |
| Cross-user visibility | Impossible — no endpoint accepts user_uuid as input; derived from JWT |
| Cross-profile reads | list_all_graphs() only returns metadata (counts), not entity data |
| SQL injection | Parameterized queries throughout GraphStore |
| Bulk import validation | Entity types validated against ENTITY_TYPES enum; invalid types raise ValueError |
| JWT authentication | All REST endpoints decorated with @require_auth |
| Operation | Complexity | Notes |
|---|---|---|
| Entity CRUD | O(log n) per operation | SQLite B-tree index lookups on profile + user |
| Substring search | O(n) | Scans entities within profile scope; V2 will use embedding-based search |
| NetworkX load | O(E + V) | Full graph loaded from SQLite on first algorithm call |
| Adaptive subgraph extraction | O(V + E) | Three-phase: structural (unbounded FK chains), column expansion (budget-aware), semantic enrichment (capped) |
| Legacy subgraph BFS | O(V + E) | Bounded by depth and max_nodes parameters (deprecated) |
| Centrality | O(V + E) | Computed over full graph |
| Cycle detection | O(V + E) | Capped at 20 cycles |
| Context enrichment | ~5-20ms | Search + adaptive extraction + format for typical graphs (<500 entities) |
| FK inference (ingestion) | O(T² × C) | T=tables, C=avg columns per table; runs once during KG generation |
| D3 rendering | Browser-side | Force simulation self-terminates; mini graph stops after 3s |
NetworkX cache invalidation: Any write operation (add, update, delete, import, clear) sets self._graph = None. The next graph algorithm call re-loads the full graph from SQLite. For write-heavy workloads, consider batching mutations before querying algorithms.
| Feature | Status | Description |
|---|---|---|
| Entity/Relationship CRUD | Shipped | Full CRUD via REST API and LLM tool |
| D3 force-directed visualization | Shipped | 3-mode progressive display with interactive features |
| Planner context enrichment | Shipped | Automatic subgraph injection into strategic planning |
| Resource Panel management | Shipped | Browse, export, delete across profiles |
| Intelligence Performance tab | Shipped | Import, export, delete with IFOC badges |
| Bulk import/export | Shipped | JSON file format with profile metadata |
| Adaptive subgraph extraction | Shipped | Entity-type-aware, unbounded FK chains, budget-aware columns (replaces fixed-depth BFS) |
| FK edge inference (Phase 3.5) | Shipped | Deterministic FK detection via column properties, naming conventions, and shared column names |
| Promote (cross-profile copy) | Planned | Copy KG from one profile to another |
| MCP schema auto-discovery | V2 Stub | Parse MCP tool schemas to populate graph automatically |
| Embedding-based semantic search | Planned | Replace substring matching with vector similarity |
| Auto-enrichment during execution | Planned | Automatically capture discovered schema from tool results |
| Marketplace/AgentPack integration | Planned | Share knowledge graphs as part of agent pack bundles |
| Graph merge/conflict resolution | Planned | Merge two profile graphs with deduplication |
| File | Purpose | Key Classes/Functions |
|---|---|---|
| components/builtin/knowledge_graph/graph_store.py | Dual-layer graph engine | GraphStore, ENTITY_TYPES, RELATIONSHIP_TYPES |
| components/builtin/knowledge_graph/handler.py | Component handler + context enrichment | KnowledgeGraphHandler, ENTITY_TYPE_COLORS |
| components/builtin/knowledge_graph/manifest.json | Component metadata + tool definition | TDA_KnowledgeGraph tool spec |
| components/builtin/knowledge_graph/instructions.json | Intensity-keyed LLM instructions | none, medium, heavy |
| components/builtin/knowledge_graph/discovery.py | MCP schema discovery (V2 stub) | MCPSchemaDiscovery |
| schema/21_knowledge_graph.sql | Database schema | kg_entities, kg_relationships |
| src/trusted_data_agent/api/rest_routes.py | REST API endpoints (15) | kg_add_entity, kg_list_all, kg_export, etc. |
| src/trusted_data_agent/components/manager.py | Component manager orchestration | get_component_context_enrichment() |
| src/trusted_data_agent/components/base.py | Base handler class | BaseComponentHandler.get_context_enrichment() |
| File | Purpose | Key Functions |
|---|---|---|
| components/builtin/knowledge_graph/renderer.js | D3 visualization (3 modes) | renderKnowledgeGraph(), openKGSplitPanel(), closeKGSplitPanel() |
| static/js/handlers/knowledgeGraphPanelHandler.js | Management UI (2 locations) | loadKnowledgeGraphsPanel(), loadKnowledgeGraphsIntelligenceTab(), initializeImportHandler() |
| static/js/api.js | API client functions | loadKnowledgeGraphList(), exportKnowledgeGraph(), deleteKnowledgeGraph(), importKnowledgeGraph() |
| templates/index.html | Split panel DOM + Intelligence tab | #kg-split-panel, #knowledge-graphs-tab-content |
| File | Purpose | Lines |
|---|---|---|
| src/trusted_data_agent/agent/planner.py | Planner context injection | 2609–2621 |
| src/trusted_data_agent/agent/phase_executor.py | Tactical context enrichment | 2425–2466 |
| static/js/handlers/ragCollectionManagement.js | Intelligence tab wiring | initializeKnowledgeGraphIntelligenceHandlers() |