Commit 6d7cd38

Merge branch 'ftr/media-player-v2' into test-deployment
2 parents 00f6e32 + a508759 commit 6d7cd38

135 files changed: +20548 -1809 lines changed

.dockerignore

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+# ABOUTME: Excludes files from the frontend Docker build context (context: . in CI).
+# ABOUTME: The backend Dockerfile uses context: ./backend, so this file does not affect it.
+
+# Backend (has its own Docker context at ./backend)
+backend/
+
+# Infrastructure and docs
+terraform/
+DEVELOPMENT_DOCS/
+DOCS/
+.planning/
+
+# VCS and IDE
+.git/
+.github/
+.claude/
+.kiro/
+.vscode/
+
+# Secrets (must never enter Docker context)
+.env
+.env.*
+!.env.example
+
+# Dependencies (reinstalled inside Docker)
+node_modules/
+frontend/node_modules/
+frontend/.next/
+
+# Misc
+.DS_Store
+LICENSE
+README.md
+GUIDE.md
+CLAUDE.md
+Makefile
+lefthook.yml
+docker-compose.yml
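
The secrets rules in this `.dockerignore` (`.env`, `.env.*`, then `!.env.example`) rely on exception lines: a later `!` pattern re-includes a path that an earlier pattern excluded, so the last matching line decides. The sketch below illustrates only that last-match-wins idea. It assumes simplified matching with Python's `fnmatch` on root-level names and is not Docker's actual matcher; `is_excluded` and `SECRETS_RULES` are hypothetical names.

```python
from fnmatch import fnmatchcase


def is_excluded(path: str, patterns: list[str]) -> bool:
    """Return True if `path` would be left out of the build context.

    Simplified model: patterns are tested in order and the last one
    that matches decides; a leading "!" marks an exception.
    """
    excluded = False
    for raw in patterns:
        pattern = raw.strip()
        if not pattern or pattern.startswith("#"):
            continue  # skip blank lines and comments
        negated = pattern.startswith("!")
        if negated:
            pattern = pattern[1:]
        if fnmatchcase(path, pattern.rstrip("/")):
            excluded = not negated
    return excluded


# The three secrets rules from the file above, in the same order.
SECRETS_RULES = [".env", ".env.*", "!.env.example"]
```

Under this model `.env` and `.env.local` are excluded while `.env.example` stays in the context. Reversing the order of the last two rules would exclude `.env.example` again, which is why the exception line comes last.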

.github/workflows/deploy.yml

Lines changed: 4 additions & 19 deletions
@@ -14,8 +14,6 @@ env:
   FRONTEND_SERVICE: holmes-frontend
   # Stabilize any generation that may depend on hash iteration order.
   PYTHONHASHSEED: "0"
-  # Fast deploy mode for test-deployment branch (parallel builds, less defensive)
-  FAST_DEPLOY: ${{ github.ref_name == 'test-deployment' }}

 jobs:
   # Job 1: Type generation check
@@ -106,7 +104,7 @@ jobs:
         uses: docker/setup-buildx-action@v3

       - name: Build and push backend image
-        uses: docker/build-push-action@v5
+        uses: docker/build-push-action@v6
         with:
           context: ./backend
           push: true
@@ -125,30 +123,17 @@ jobs:
           region: ${{ env.REGION }}
           image: ${{ env.REGION }}-docker.pkg.dev/${{ vars.GCP_PROJECT_ID }}/holmes/backend:${{ github.sha }}

-  # Job 3: Frontend lint and build
-  # On test-deployment: runs in parallel with backend (fast mode)
-  # On main/development: waits for backend to complete first (defensive mode)
+  # Job 3: Frontend lint and build (runs in parallel with backend)
   frontend:
     runs-on: ubuntu-latest
     needs: types
     permissions:
       contents: read
       id-token: write
-      checks: read

     steps:
       - uses: actions/checkout@v4

-      # Defensive mode: wait for backend job to complete before proceeding
-      - name: Wait for backend deployment (defensive mode)
-        if: github.ref_name == 'main' || github.ref_name == 'development'
-        uses: lewagon/wait-on-check-action@v1.3.4
-        with:
-          ref: ${{ github.sha }}
-          check-name: backend
-          repo-token: ${{ secrets.GITHUB_TOKEN }}
-          wait-interval: 10
-
       - uses: oven-sh/setup-bun@v2
         with:
           bun-version: latest
@@ -165,6 +150,7 @@ jobs:
           bun install
           cd frontend
           bun run lint
+          bunx prettier --check src

       - name: Authenticate to Google Cloud
         uses: google-github-actions/auth@v2
@@ -206,7 +192,7 @@ jobs:
         uses: docker/setup-buildx-action@v3

       - name: Build and push frontend image
-        uses: docker/build-push-action@v5
+        uses: docker/build-push-action@v6
         with:
           context: .
           file: frontend/Dockerfile
@@ -218,7 +204,6 @@ jobs:
             NEXT_PUBLIC_API_URL=${{ steps.backend-url.outputs.url }}
             NEXT_PUBLIC_VIDEO_URL=https://storage.googleapis.com/${{ vars.GCP_PROJECT_ID }}-media/video.mp4
             NEXT_PUBLIC_APP_URL=${{ steps.frontend-url.outputs.url }}
-            BETTER_AUTH_URL=${{ steps.frontend-url.outputs.url }}
           cache-from: |
             type=registry,ref=${{ env.REGION }}-docker.pkg.dev/${{ vars.GCP_PROJECT_ID }}/holmes/frontend:cache-${{ github.ref_name }}
             type=registry,ref=${{ env.REGION }}-docker.pkg.dev/${{ vars.GCP_PROJECT_ID }}/holmes/frontend:cache-main

.planning/ROADMAP.md

Lines changed: 76 additions & 29 deletions
@@ -28,7 +28,7 @@
 | 4 | Core Agent System | ADK setup, Triage Agent, Orchestrator, Research/Discovery stubs | REQ-AGENT-001/002/007/007a/007b/007e | ✅ COMPLETE |
 | 4.1 | Agent Decision Tree Revamp (INSERTED) | Replace D3 Command Center with @xyflow/react + dagre decision tree | REQ-VIS-001 (visual quality) | ✅ COMPLETE |
 | 5 | Agent Flow | Real-time visualization, SSE streaming, HITL dialogs | REQ-VIS-001/001a/002, REQ-INF-004 | ✅ COMPLETE |
-| 6 | Domain Agents | Financial, Legal, Strategy, Evidence agents, Entity taxonomy, Hypothesis evaluation | REQ-AGENT-003/004/005/006/007c/007d/007h, REQ-HYPO-002/003 | ⏳ NOT_STARTED |
+| 6 | Domain Agents | Financial, Legal, Strategy, Evidence agents, Entity taxonomy, Hypothesis evaluation | REQ-AGENT-003/004/005/006/007c/007d/007h, REQ-HYPO-002/003 | ✅ COMPLETE |
 | 7 | Synthesis & Knowledge Graph | Synthesis Agent, KG Agent, Hypothesis system, Task generation, 5-layer KG | REQ-AGENT-008/009, REQ-VIS-003, REQ-HYPO-001/004/005/006, REQ-TASK-001/002 | 🟡 FRONTEND_DONE |
 | 8 | Intelligence Layer & Geospatial | Contradictions, Gaps, Geospatial Agent, Map View, Earth Engine | REQ-WOW-*, REQ-VIS-005/006, REQ-GEO-* | ⏳ NOT_STARTED |
 | 9 | Chat Interface & Research | Chat UI, Research/Discovery (Chat + Orchestrator-triggered), Hypothesis View, Context caching | REQ-CHAT-*, REQ-RESEARCH-*, REQ-HYPO-007/008 | 🟡 FRONTEND_DONE |
@@ -37,7 +37,7 @@
 | 12 | Demo Preparation | Demo case showcasing all integration features | Demo readiness, REQ-RESEARCH-004, REQ-AGENT-007i | ⏳ NOT_STARTED |

 > **Status Legend:** ✅ COMPLETE | 🟡 FRONTEND_DONE (backend pending) | ⏳ NOT_STARTED | ⏳ PLANNED
-> **Note:** Phase 5 complete (2026-02-05, full SSE pipeline + HITL). Phases 7, 9, 10 have frontend UI implemented by Yatharth (2026-02-02). Backend integration remains for those phases.
+> **Note:** Phase 6 complete (2026-02-06, 35 commits: 5 plans + refactoring + routing HITL + production hardening + live-testing bugfixes). Phases 7, 9, 10 have frontend UI implemented by Yatharth (2026-02-02). Backend integration remains for those phases.

 **Post-MVP:**
 | Phase | Name | Focus | Requirements |
@@ -464,14 +464,28 @@ The Command Center frontend was built in three stages:

 **Goal:** Implement all four domain analysis agents with proper thinking configuration.

+**Status:** ✅ COMPLETE (2026-02-06) — 5 plans (14 commits) + 21 post-plan commits (35 total)
+
+**Verification:** `.planning/phases/06-domain-agents/06-VERIFICATION.md` — 10/10 must-haves verified + post-plan addendum
+
 **Requirements:** REQ-AGENT-003, REQ-AGENT-004, REQ-AGENT-005, REQ-AGENT-006, REQ-AGENT-007b, REQ-AGENT-007c, REQ-AGENT-007d, REQ-AGENT-007h, REQ-AGENT-002 (complete), REQ-HYPO-002, REQ-HYPO-003

+**Plans:** 5 plans in 3 waves
+
+Plans:
+- [x] 06-01-PLAN.md — Domain output schemas, factory extension, infrastructure updates
+- [x] 06-02-PLAN.md — Domain agent prompts (Financial, Legal, Evidence, Strategy)
+- [x] 06-03-PLAN.md — Financial, Legal, Evidence agent modules + parallel runner
+- [x] 06-04-PLAN.md — Strategy agent module (sequential, receives domain summaries)
+- [x] 06-05-PLAN.md — Pipeline wiring, SSE events, HITL confirmation integration
+
+
 **Deliverables:**
-- Financial Analysis Agent (`thinking_level="medium"`, `media_resolution="high"`)
+- Financial Analysis Agent (`thinking_level="high"`, `media_resolution="high"`)
 - **Full entity taxonomy for financial domain** (monetary_amount, account, transaction, asset)
 - Legal Analysis Agent (`thinking_level="high"`, `media_resolution="high"`)
 - **Full entity taxonomy for legal domain** (statute, case_citation, contract, legal_term, court)
-- Strategy Analysis Agent (`thinking_level="medium"`)
+- Strategy Analysis Agent (`thinking_level="high"`, `media_resolution="medium"`)
 - Evidence Analysis Agent (`thinking_level="high"`, `media_resolution="high"`)
 - Authenticity analysis (manipulation detection, metadata consistency)
 - Chain of custody documentation
@@ -480,38 +494,69 @@ The Command Center frontend was built in three stages:
 - **Full entity taxonomy for evidence domain** (communication, alias, vehicle, property, timestamp)
 - **Hypothesis evaluation in all domain agent prompts**
 - Agents evaluate findings against existing hypotheses
-- Output includes hypothesis_evaluations and new_hypotheses
-- Parallel execution via ADK ParallelAgent
-- ResilientAgentWrapper for each domain agent (Pro → Flash fallback)
-- Domain-specific tool definitions
-- Video/audio processing with VideoMetadata for timestamps
-- Structured output schemas per agent
+- Output includes hypothesis_evaluations
+- Parallel execution via asyncio.gather (not ADK ParallelAgent)
+- Inline Pro-to-Flash fallback for each domain agent
+- Video/audio processing via Gemini File API
+- Structured output schemas per agent (Pydantic models)
 - Span-level citation extraction
 - Agent output aggregation for Synthesis
 - **HITL E2E verification** (deferred from Phase 5): Domain agents trigger confirmations for sensitive operations
+- **DomainAgentRunner Template Method base class** (post-plan refactoring)
+- All 4 domain agents migrated to subclasses (~800 lines of duplication eliminated)
+- `extract_structured_json` generic parser replaces per-agent parse functions
+- **Per-agent routing HITL system** (post-plan feature)
+- Routing confidence scoring with per-agent-type thresholds
+- Batch confirmation modal with per-agent rejection
+- Strategy agent standalone execution with HITL
+- **Production hardening** (post-plan)
+- State snapshot refresh resilience (lastResult preservation)
+- Exception handling in domain agent runner for SSE error emission
+- Orchestrator execution committed to DB before domain agent launch
+- JSON thinking trace normalization for Gemini multimodal output
+- **Pipeline bugfixes from live testing** (post-plan)
+- compute_agent_tasks covered-pairs tracking for per-file multi-agent routing
+- Strategy gated on orchestrator routing decision
+- Routing decisions flattened to one card per (file, agent) pair
+- Thought parts excluded from JSON parsing

 **Technical Notes:**
-- Each agent has unique output_key to avoid race conditions
 - All agents receive file content directly (Gemini multimodal)
 - Citation format: `{file_id}#{locator}` where locator is page/timestamp/region
-- Domain agents run in parallel after Orchestrator routing
+- Domain agents run in parallel via asyncio.gather after Orchestrator routing
 - Use `media_resolution="high"` for dense document processing
-- Video segments: use `VideoMetadata(start_offset, end_offset)`
-- Audio: request speaker diarization in prompts
-- ResilientAgentWrapper catches failures and falls back to Flash model
+- Video/audio forced through Gemini File API regardless of size
+- Audio: request speaker diarization in prompts (best-effort)
+- Inline Pro-to-Flash fallback pattern (not separate ResilientAgentWrapper class)
+- **DomainAgentRunner** Template Method base class: subclasses override agent_type, output_type, _create_agent
+- **extract_structured_json** generic parser: filters thought parts, handles code fences, validates via Pydantic
+- **format_thinking_traces**: normalizes JSON-structured thinking (common with multimodal) to readable text
+- **compute_agent_tasks** uses covered_pairs set[tuple[str, str]] to track (file_id, agent_type) coverage
+- **Per-agent routing HITL**: ROUTING_CONFIDENCE_THRESHOLDS per agent type, batch confirmation modal
+- **Strategy gating**: only runs when explicitly requested by orchestrator (parallel_agents/sequential_agents/routing_decisions)
 - **Domain agent prompts include: "Evaluate findings against existing hypotheses"**
-
-**Exit Criteria:**
-- All four domain agents process files
-- Parallel execution verified
-- Thinking traces captured for all agents
-- Video/audio processed with timestamp extraction
-- Graceful degradation works (fallback to Flash)
-- Structured findings with citations output
-- **Hypothesis evaluations included in agent output**
-- **Domain-specific entity taxonomy extracted**
-- Outputs aggregated for next phase
-- **HITL confirmation flow verified E2E** (agent triggers → modal appears → user responds → agent continues)
+- **Key architecture files:**
+- `backend/app/agents/domain_agent_runner.py` — DomainAgentRunner Template Method base class
+- `backend/app/agents/domain_runner.py` — compute_agent_tasks, run_domain_agents_parallel, build_strategy_context
+- `backend/app/agents/parsing.py` — extract_structured_json, extract_response_texts, format_thinking_traces
+- `backend/app/api/agents.py` — Pipeline wiring, strategy gating, routing HITL, SSE emission
+- `backend/app/api/sse.py` — State snapshots, thinking trace normalization, routing decision flattening
+
+**Exit Criteria:** ✓ ALL MET (10/10 + post-plan hardening)
+- ✅ All four domain agents process files (migrated to DomainAgentRunner base class)
+- ✅ Parallel execution via asyncio.gather with independent DB sessions
+- ✅ Thinking traces captured for all agents (JSON normalized for multimodal)
+- ✅ Video/audio forced through File API for reliable processing
+- ✅ Graceful degradation works (inline Pro-to-Flash fallback)
+- ✅ Structured findings with span-level citations output
+- ✅ Hypothesis evaluations included in agent output
+- ✅ Domain-specific entity taxonomy extracted
+- ✅ Outputs aggregated for next phase (build_strategy_context + domain_results dict)
+- ✅ HITL confirmation flow verified E2E (agent triggers → modal appears → user responds → agent continues)
+- ✅ Per-agent routing HITL with batch confirmation (post-plan)
+- ✅ Strategy agent gated on orchestrator routing decision (post-plan)
+- ✅ Routing decisions display all target agents per file (post-plan)
+- ✅ State snapshot refresh resilience with lastResult preservation (post-plan)

 ---

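
The Phase 6 notes in this hunk replace ADK ParallelAgent and the ResilientAgentWrapper class with `asyncio.gather` plus an inline Pro-to-Flash fallback. The implementation in `backend/app/agents/domain_runner.py` is not part of the visible diff, so what follows is only a minimal sketch of that pattern. Every name in it (`call_model`, `ModelError`, `run_agent_with_fallback`, `run_domain_agents`, the `"pro"` and `"flash"` labels) is an assumption made for illustration.

```python
import asyncio


class ModelError(Exception):
    """Stand-in for whatever error the real model client raises."""


async def call_model(model: str, agent_type: str, file_id: str) -> dict:
    # Hypothetical model call. The "pro" tier is made to fail for the
    # legal agent so that the fallback path is exercised in the demo.
    await asyncio.sleep(0)
    if model == "pro" and agent_type == "legal":
        raise ModelError("pro tier unavailable")
    return {"agent": agent_type, "file": file_id, "model": model}


async def run_agent_with_fallback(agent_type: str, file_id: str) -> dict:
    # Inline fallback: try the Pro tier first, retry once on Flash.
    try:
        return await call_model("pro", agent_type, file_id)
    except ModelError:
        return await call_model("flash", agent_type, file_id)


async def run_domain_agents(agent_types: list, file_id: str) -> dict:
    # return_exceptions=True hands failures back as exception objects in
    # the result list instead of raising the first one to the caller.
    results = await asyncio.gather(
        *(run_agent_with_fallback(a, file_id) for a in agent_types),
        return_exceptions=True,
    )
    return dict(zip(agent_types, results))


demo = asyncio.run(run_domain_agents(["financial", "legal", "evidence"], "file-1"))
```

In this sketch the legal agent ends up on the Flash tier while the other two stay on Pro. The Strategy agent is left out because the plan list describes it as sequential, running after the domain summaries exist.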
@@ -974,8 +1019,8 @@ For 2 developers working simultaneously:

 ---

-*Roadmap Version: 2.2*
-*Updated: 2026-02-05 (Phase 5 complete)*
+*Roadmap Version: 2.4*
+*Updated: 2026-02-06 (Phase 6 complete — 35 commits including post-plan hardening)*
 *Phase 1 planned: 2026-01-20*
 *Phase 1.1 planned: 2026-01-23*
 *Phase 1.1 complete: 2026-01-24*
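
The Phase 6 technical notes in this file say that `compute_agent_tasks` tracks coverage with a `covered_pairs` set of `(file_id, agent_type)` tuples, and that routing decisions are flattened to one card per (file, agent) pair. The real function is not shown in this diff. The sketch below is one plausible reading of that description; the input shape (a list of dicts with `file_id` and `agents` keys) and the `_sketch` name are assumptions.

```python
def compute_agent_tasks_sketch(routing_decisions: list) -> list:
    """Flatten routing decisions into unique (file_id, agent_type) tasks.

    Hypothetical input shape: [{"file_id": str, "agents": [str, ...]}, ...]
    """
    covered_pairs = set()  # (file_id, agent_type) pairs already scheduled
    tasks = []
    for decision in routing_decisions:
        for agent_type in decision["agents"]:
            pair = (decision["file_id"], agent_type)
            if pair in covered_pairs:
                continue  # this agent is already scheduled for this file
            covered_pairs.add(pair)
            tasks.append(pair)
    return tasks


example_decisions = [
    {"file_id": "f1", "agents": ["financial", "legal"]},
    {"file_id": "f1", "agents": ["legal", "evidence"]},
    {"file_id": "f2", "agents": ["financial"]},
]
example_tasks = compute_agent_tasks_sketch(example_decisions)
```

Keying on the pair rather than on the file alone lets one file be routed to several agents while still preventing the same agent from running twice on it.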
@@ -990,3 +1035,5 @@ For 2 developers working simultaneously:
 *Phase 4.1 complete: 2026-02-04 (all 4 plans done, 18 commits)*
 *Phase 5 planned: 2026-02-04 (4 plans in 3 waves)*
 *Phase 5 complete: 2026-02-05 (all 4 plans + 15 post-plan fixes, 26 commits total)*
+*Phase 6 planned: 2026-02-05 (5 plans in 3 waves)*
+*Phase 6 complete: 2026-02-06 (5 plans + 21 post-plan commits = 35 total, 10/10 verified + hardening)*
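
The roadmap describes `extract_structured_json` as a generic parser that filters thought parts, handles code fences, and validates via Pydantic. `backend/app/agents/parsing.py` is not visible in this diff, so the following is a sketch of those three steps only. A plain dataclass stands in for the Pydantic model to keep the example free of third-party imports, and the `Part` and `Finding` shapes, the `_sketch` function name, and the locator syntax in the usage example are all assumptions.

```python
import json
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Part:
    text: str
    thought: bool = False  # assumed flag marking a thinking-trace part


@dataclass
class Finding:
    title: str
    citation: str  # roadmap format: {file_id}#{locator}


FENCE = "`" * 3  # built programmatically so this example nests cleanly
FENCE_RE = re.compile(
    r"^" + FENCE + r"(?:json)?\s*\n(.*?)\n" + FENCE + r"\s*$", re.DOTALL
)


def extract_structured_json_sketch(parts: List[Part]) -> Optional[Finding]:
    # 1. Drop thought parts so reasoning text is never parsed as JSON.
    text = "".join(p.text for p in parts if not p.thought).strip()
    # 2. Unwrap a Markdown code fence if the model added one.
    match = FENCE_RE.match(text)
    if match:
        text = match.group(1)
    # 3. Parse, then check the shape (the dataclass replaces Pydantic here).
    try:
        data = json.loads(text)
        return Finding(title=str(data["title"]), citation=str(data["citation"]))
    except (json.JSONDecodeError, KeyError, TypeError):
        return None


example_parts = [
    Part('{"scratch": "reasoning that must be ignored"}', thought=True),
    Part(FENCE + 'json\n{"title": "Wire transfer", "citation": "doc1#page=3"}\n' + FENCE),
]
example_finding = extract_structured_json_sketch(example_parts)
```

Returning `None` on malformed output is a choice made for this sketch; the actual parser may raise or log instead.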
