Commit 917da91

Copilot and tikazyq committed
Update design docs with finalized TS+Go architecture and Go collector design
Co-authored-by: tikazyq <[email protected]>
1 parent 30de655 commit 917da91

4 files changed: 1023 additions & 31 deletions

docs/design/ai-agent-observability-design.md

Lines changed: 82 additions & 31 deletions
Original file line number | Diff line number | Diff line change
@@ -56,66 +56,117 @@ AI coding agents are becoming ubiquitous in software development, but organizati
5656

5757
## Architecture Overview
5858

59+
**Architecture Decision**: **TypeScript + Go Hybrid** (finalized based on [performance analysis](./ai-agent-observability-performance-analysis.md))
60+
61+
**Rationale**:
62+
- **TypeScript**: Fast MVP development, MCP ecosystem, web UI (2 months to market)
63+
- **Go**: High-performance backend services (50-120K events/sec), efficient resource usage
64+
- **Benefits**: Best of both worlds - rapid iteration + production scalability
65+
- **Cost**: $335K (6 months) for the hybrid vs $278K for TS-only; the hybrid delivers 5-10x better performance
66+
5967
### High-Level Architecture
6068

6169
```
6270
┌─────────────────────────────────────────────────────────────────┐
63-
│ AI Coding Agents │
64-
│ (Copilot, Claude, Cursor, Gemini, Cline, Aider, etc.) │
65-
└────────────────────────┬────────────────────────────────────────┘
66-
67-
│ MCP Protocol / Agent SDKs
68-
69-
┌────────────────────────▼────────────────────────────────────────┐
70-
│ Agent Activity Collection Layer │
71-
│ • Event capture • Log aggregation • Real-time streaming │
72-
└────────────────────────┬────────────────────────────────────────┘
73-
74-
75-
┌─────────────────────────────────────────────────────────────────┐
76-
│ Processing & Analysis Engine │
77-
│ • Event parsing • Metric calculation • Pattern detection │
78-
└────────────────────────┬────────────────────────────────────────┘
79-
80-
71+
│ Developer Machine (Client) │
72+
│ ┌────────────────────────────────────────────────────────────┐ │
73+
│ │ AI Coding Agents (Copilot, Claude, Cursor, etc.) │ │
74+
│ └────────────────────┬───────────────────────────────────────┘ │
75+
│ │ Logs │
76+
│ ┌────────────────────▼───────────────────────────────────────┐ │
77+
│ │ Go Collector (~10-20MB binary) │ │
78+
│ │ • Log watcher • Event parser • Local buffer (SQLite) │ │
79+
│ │ • Batching (100 events/5s) • Offline support │ │
80+
│ └────────────────────┬───────────────────────────────────────┘ │
81+
└───────────────────────┼─────────────────────────────────────────┘
82+
│ HTTP/gRPC
83+
8184
┌─────────────────────────────────────────────────────────────────┐
82-
│ Storage & Indexing │
83-
│ • Time-series events • Metrics aggregation • Full-text search │
84-
└────────────────────────┬────────────────────────────────────────┘
85-
86-
87-
┌─────────────────────────────────────────────────────────────────┐
88-
│ Visualization & Analytics Layer │
89-
│ • Dashboards • Timeline views • Reports • Alerts │
85+
│ TypeScript API Gateway Layer │
86+
│ • Next.js API routes • MCP Server • Auth & session mgmt │
87+
│ • API orchestration (thin layer: 5-20ms latency) │
88+
└────────────┬───────────────────────────┬────────────────────────┘
89+
│ gRPC/REST │ gRPC/REST
90+
┌────────────▼────────────┐ ┌──────────▼──────────────────────┐
91+
│ Go Event Processor │ │ TypeScript Services │
92+
│ • 50-120K events/sec │ │ • User management │
93+
│ • Adapter registry │ │ • Project management │
94+
│ • Transformation │ │ • Devlog CRUD │
95+
│ • Batching & buffering │ │ • Business logic │
96+
└────────────┬────────────┘ └──────────┬──────────────────────┘
97+
│ │
98+
┌────────────▼───────────────────────────▼────────────────────────┐
99+
│ Go Real-time Stream Engine │
100+
│ • WebSocket server • Event broadcasting • Session monitoring │
101+
└────────────┬────────────────────────────────────────────────────┘
102+
103+
┌────────────▼────────────────────────────────────────────────────┐
104+
│ Go Analytics Engine │
105+
│ • Metrics aggregation • Pattern detection • Quality analysis │
106+
└────────────┬────────────────────────────────────────────────────┘
107+
108+
┌────────────▼────────────────────────────────────────────────────┐
109+
│ PostgreSQL + TimescaleDB │
110+
│ • agent_events (hypertable) • agent_sessions • Aggregations │
90111
└─────────────────────────────────────────────────────────────────┘
91112
```
92113

93114
### Component Architecture
94115

95116
```
96117
packages/
97-
├── core/ # Enhanced core with agent observability
118+
├── collector-go/ # NEW: Go client-side collector
119+
│ ├── cmd/collector/ # Main collector binary
120+
│ ├── internal/
121+
│ │ ├── adapters/ # Agent-specific log parsers
122+
│ │ ├── buffer/ # SQLite offline buffer
123+
│ │ ├── config/ # Configuration management
124+
│ │ └── watcher/ # File system watcher
125+
│ └── pkg/client/ # Backend HTTP/gRPC client
126+
127+
├── services-go/ # NEW: Go backend services
128+
│ ├── event-processor/ # Event processing service (50-120K/sec)
129+
│ ├── stream-engine/ # Real-time WebSocket streaming
130+
│ ├── analytics-engine/ # Metrics aggregation & pattern detection
131+
│ └── shared/ # Shared Go libraries & utilities
132+
133+
├── core/ # Enhanced core with agent observability (TypeScript)
98134
│ ├── agent-events/ # NEW: Agent event types and schemas
99135
│ ├── agent-collection/ # NEW: Event collection and ingestion
100136
│ ├── agent-analytics/ # NEW: Metrics and analysis engine
101137
│ └── services/ # Existing services + new agent services
102138
103-
├── mcp/ # MCP server with observability tools
139+
├── mcp/ # MCP server with observability tools (TypeScript)
104140
│ ├── tools/ # Existing + new agent monitoring tools
105-
│ └── collectors/ # NEW: Event collectors for different agents
141+
│ └── collectors/ # NEW: Collector control tools
106142
107-
├── ai/ # AI analysis for agent behavior
143+
├── ai/ # AI analysis for agent behavior (TypeScript)
108144
│ ├── pattern-detection/ # NEW: Identify patterns in agent behavior
109145
│ ├── quality-analysis/ # NEW: Code quality assessment
110146
│ └── recommendation-engine/ # NEW: Suggest improvements
111147
112-
└── web/ # Enhanced UI for observability
148+
└── web/ # Enhanced UI for observability (TypeScript/Next.js)
113149
├── dashboards/ # NEW: Agent activity dashboards
114150
├── timelines/ # NEW: Visual agent action timelines
115151
├── analytics/ # NEW: Performance and quality analytics
116152
└── reports/ # NEW: Custom reporting interface
117153
```
118154

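The tree above places agent-specific log parsers under `collector-go/internal/adapters/`, and the event processor lists an "adapter registry". One way such a registry could look is sketched below. All type names, the `ErrNoMatch` sentinel, and the bracketed line prefixes are assumptions for illustration; the design docs do not specify the adapter interface or any agent's real log format.

```go
package main

import (
	"errors"
	"strings"
)

// ParsedEvent is a hypothetical normalized event produced by an adapter.
type ParsedEvent struct {
	Agent   string
	Message string
}

// Adapter is an assumed interface for an agent-specific log parser.
type Adapter interface {
	Name() string
	Parse(line string) (ParsedEvent, error)
}

// ErrNoMatch signals that an adapter does not recognize a line.
var ErrNoMatch = errors.New("line not recognized by adapter")

// prefixAdapter recognizes lines by a fixed prefix. This is a placeholder:
// real agents will need format-specific parsing.
type prefixAdapter struct {
	name   string
	prefix string
}

func (a prefixAdapter) Name() string { return a.name }

func (a prefixAdapter) Parse(line string) (ParsedEvent, error) {
	if !strings.HasPrefix(line, a.prefix) {
		return ParsedEvent{}, ErrNoMatch
	}
	msg := strings.TrimSpace(strings.TrimPrefix(line, a.prefix))
	return ParsedEvent{Agent: a.name, Message: msg}, nil
}

// genericAdapter accepts any line; it plays the "generic (fallback)" role.
type genericAdapter struct{}

func (genericAdapter) Name() string { return "generic" }

func (genericAdapter) Parse(line string) (ParsedEvent, error) {
	return ParsedEvent{Agent: "generic", Message: strings.TrimSpace(line)}, nil
}

// Registry tries each adapter in order and uses the fallback last.
type Registry struct {
	adapters []Adapter
	fallback Adapter
}

func NewRegistry(fallback Adapter, adapters ...Adapter) *Registry {
	return &Registry{adapters: adapters, fallback: fallback}
}

// Parse returns the first successful parse. Errors other than ErrNoMatch
// are treated as real failures and returned immediately.
func (r *Registry) Parse(line string) (ParsedEvent, error) {
	for _, a := range r.adapters {
		ev, err := a.Parse(line)
		if err == nil {
			return ev, nil
		}
		if !errors.Is(err, ErrNoMatch) {
			return ParsedEvent{}, err
		}
	}
	return r.fallback.Parse(line)
}
```

Keeping the fallback separate from the ordered list means a new agent can be supported by adding one adapter without touching the generic path.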
155+
### Technology Stack Summary
156+
157+
| Component | Language | Rationale |
158+
|-----------|----------|-----------|
159+
| **Client Collector** | Go | Small binary (~10-20MB), cross-platform, efficient |
160+
| **Event Processing** | Go | High throughput (50-120K events/sec), low latency |
161+
| **Real-time Streaming** | Go | Efficient WebSocket handling, 50K+ connections |
162+
| **Analytics Engine** | Go | Fast aggregations, pattern detection performance |
163+
| **API Gateway** | TypeScript | MCP integration, rapid development, Next.js |
164+
| **Business Logic** | TypeScript | Fast iteration, existing codebase integration |
165+
| **Web UI** | TypeScript/Next.js | React ecosystem, server components |
166+
| **Database** | PostgreSQL + TimescaleDB | Time-series optimization, mature ecosystem |
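
The "Real-time Streaming" row relies on event broadcasting to many connected clients. The sketch below shows only the in-process fan-out step, using the standard library; the `Hub` type and its drop-on-full policy are assumptions, and the WebSocket transport that would sit on top of it is not shown because the design docs do not name a WebSocket library.

```go
package main

import "sync"

// Hub fans out messages to subscribers (hypothetical type).
type Hub struct {
	mu   sync.Mutex
	subs map[int]chan string
	next int
}

func NewHub() *Hub {
	return &Hub{subs: make(map[int]chan string)}
}

// Subscribe registers a subscriber with a buffered channel and returns
// its id together with the receive side of the channel.
func (h *Hub) Subscribe(buffer int) (int, <-chan string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	id := h.next
	h.next++
	ch := make(chan string, buffer)
	h.subs[id] = ch
	return id, ch
}

// Unsubscribe removes a subscriber and closes its channel.
func (h *Hub) Unsubscribe(id int) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if ch, ok := h.subs[id]; ok {
		delete(h.subs, id)
		close(ch)
	}
}

// Broadcast delivers msg to every subscriber whose buffer has room and
// returns the number of deliveries. Slow subscribers are skipped so that
// one stalled client cannot block the others (an assumed policy).
func (h *Hub) Broadcast(msg string) int {
	h.mu.Lock()
	defer h.mu.Unlock()
	delivered := 0
	for _, ch := range h.subs {
		select {
		case ch <- msg:
			delivered++
		default:
			// Buffer full: drop the message for this subscriber.
		}
	}
	return delivered
}
```

Each WebSocket connection would own one subscription and forward channel messages to its client. Whether to drop, disconnect, or apply backpressure for slow clients is a design choice the docs leave open.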
167+
169+
119170
## Core Features
120171
121172
### Phase 1: Agent Activity Collection & Storage (Foundation)

docs/design/ai-agent-observability-implementation-checklist.md

Lines changed: 61 additions & 0 deletions
@@ -4,6 +4,67 @@
44

55
This document provides a detailed, actionable checklist for implementing the AI Agent Observability features described in the [design document](./ai-agent-observability-design.md).
66

7+
**Architecture Decision**: TypeScript + Go Hybrid (finalized)
8+
- **TypeScript**: Web UI, MCP Server, API Gateway
9+
- **Go**: Client-side collector, Event processing, Real-time streaming, Analytics
10+
- See [Performance Analysis](./ai-agent-observability-performance-analysis.md) for detailed rationale
11+
12+
## Phase 0: Go Collector Setup (Week 0 - Parallel Track)
13+
14+
**Note**: This can be developed in parallel with Phase 1 TypeScript work.
15+
16+
### Go Collector Development
17+
18+
- [ ] **Project Setup**
19+
  - [ ] Create `packages/collector-go/` directory
20+
  - [ ] Initialize Go module: `go mod init github.com/codervisor/devlog/collector`
21+
  - [ ] Set up Go project structure (cmd/, internal/, pkg/)
22+
  - [ ] Configure cross-compilation (darwin, linux, windows)
23+
  - [ ] Set up GitHub Actions for building binaries
24+
25+
- [ ] **Core Collector Implementation**
26+
  - [ ] Implement log file watcher (fsnotify)
27+
  - [ ] Create agent-specific log parsers (adapter pattern)
28+
    - [ ] GitHub Copilot adapter
29+
    - [ ] Claude Code adapter
30+
    - [ ] Cursor adapter
31+
    - [ ] Generic adapter (fallback)
32+
  - [ ] Implement local SQLite buffer for offline support
33+
  - [ ] Add event batching logic (100 events or 5s interval)
34+
  - [ ] Implement HTTP/gRPC client for backend communication
35+
  - [ ] Add retry logic with exponential backoff
36+
  - [ ] Implement graceful shutdown
37+
38+
- [ ] **Configuration & Discovery**
39+
  - [ ] Auto-detect agent log locations by OS
40+
  - [ ] Load configuration from `~/.devlog/collector.json`
41+
  - [ ] Support environment variables for config
42+
  - [ ] Implement agent log discovery heuristics
43+
  - [ ] Add config validation
44+
45+
- [ ] **Distribution & Installation**
46+
  - [ ] Create npm package wrapper (`@codervisor/devlog-collector`)
47+
  - [ ] Bundle platform-specific binaries in npm package
48+
  - [ ] Create install script (post-install hook)
49+
  - [ ] Add scripts for auto-start on system boot
50+
    - [ ] macOS: launchd plist
51+
    - [ ] Linux: systemd service
52+
    - [ ] Windows: Windows Service
53+
  - [ ] Create uninstall script
54+
55+
- [ ] **Testing**
56+
  - [ ] Unit tests for log parsers
57+
  - [ ] Integration tests with mock backend
58+
  - [ ] Test cross-platform compilation
59+
  - [ ] Test offline buffering and recovery
60+
  - [ ] Load testing (simulate high-volume logs)
61+
62+
- [ ] **Documentation**
63+
  - [ ] Go collector architecture documentation
64+
  - [ ] Build and development guide
65+
  - [ ] Adapter development guide (for new agents)
66+
  - [ ] Troubleshooting guide
67+
768
## Phase 1: Foundation (Weeks 1-4)
869

970
### Week 1: Core Data Models & Schema

docs/design/ai-agent-observability-quick-reference.md

Lines changed: 5 additions & 0 deletions
@@ -4,6 +4,11 @@
44

55
This quick reference provides a high-level summary of the AI Agent Observability features being added to the devlog project. For detailed information, see the [full design document](./ai-agent-observability-design.md).
66

7+
**Architecture**: TypeScript + Go Hybrid
8+
- **TypeScript**: Web UI, MCP Server, API Gateway, Business Logic
9+
- **Go**: Client collector (~10-20MB binary), Event processing (50-120K events/sec), Streaming, Analytics
10+
- See [Performance Analysis](./ai-agent-observability-performance-analysis.md) for rationale
11+
712
## Core Concepts
813

914
### What is AI Agent Observability?
