Commit 2b1a9bf

Author: Marvin Zhang
Add Go collector roadmap and project overview; reprioritize Go collector in checklist
- Add GO_COLLECTOR_ROADMAP.md: 20-day, day-by-day implementation plan for the Go collector (phases, tasks, deps, success criteria, risk mitigation, distribution)
- Add README.md for AI Agent Observability: project overview, architecture, progress, critical path, next actions, and key docs
- Update ai-agent-observability-implementation-checklist.md: change Phase 0 to "Days 1-20" priority, link the new roadmap, mark current progress/state changes, and update MCP/UI/tooling items to reflect Go-first focus
1 parent 8b064c1 commit 2b1a9bf

3 files changed: +612 -73 lines changed

Lines changed: 276 additions & 0 deletions
@@ -0,0 +1,276 @@
# Go Collector Implementation Roadmap

**Priority**: HIGH - Foundation for production data collection
**Target**: Lightweight binary (~10-20MB) that runs on developer machines
**Status**: Not Started (0%)

## Phase 0: Project Setup (Days 1-2)

### Day 1: Go Project Structure
- [ ] Create `packages/collector-go/` directory
- [ ] Initialize Go module: `go mod init github.com/codervisor/devlog/collector`
- [ ] Set up project structure:
  ```
  packages/collector-go/
  ├── cmd/
  │   └── collector/
  │       └── main.go           # Entry point
  ├── internal/
  │   ├── adapters/             # Agent-specific parsers
  │   ├── buffer/               # SQLite offline storage
  │   ├── config/               # Configuration management
  │   ├── watcher/              # File system watching
  │   └── client/               # Backend HTTP/gRPC client
  ├── pkg/
  │   └── types/                # Public types/interfaces
  ├── go.mod
  ├── go.sum
  └── README.md
  ```
- [ ] Add initial dependencies:
  - `github.com/fsnotify/fsnotify` (file watching)
  - `github.com/mattn/go-sqlite3` (local buffer)
  - `github.com/sirupsen/logrus` (logging)
- [ ] Create basic `main.go` with CLI structure

### Day 2: Development Tooling
- [ ] Set up cross-compilation script (darwin/linux/windows)
- [ ] Create Makefile for common tasks (build, test, clean)
- [ ] Add `.gitignore` for Go binaries
- [ ] Set up GitHub Actions workflow for building binaries
- [ ] Create initial README with build instructions

## Phase 1: Core Infrastructure (Days 3-7)

### Day 3: Configuration System
- [ ] Create `internal/config/config.go`
- [ ] Define config structure (matches design doc)
- [ ] Implement config loading from `~/.devlog/collector.json`
- [ ] Add environment variable expansion support
- [ ] Implement config validation
- [ ] Add default values
- [ ] Write unit tests
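The Day 3 tasks could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Config` fields and JSON keys are hypothetical (the real structure is defined in the design doc), and the defaults of 100 events and `5s` are borrowed from the Day 14 batching task. Environment expansion uses the standard library's `os.ExpandEnv`.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// Config is a hypothetical shape; the real fields come from the design doc.
type Config struct {
	BackendURL    string `json:"backendUrl"`
	APIKey        string `json:"apiKey"`
	BatchSize     int    `json:"batchSize"`
	FlushInterval string `json:"flushInterval"`
}

// parseConfig starts from defaults, expands $VAR and ${VAR} references in the
// raw JSON, decodes it, and validates the result.
// Caveat: expansion happens on the raw text, so a literal "$" in the file is
// also treated as a variable reference.
func parseConfig(raw []byte) (Config, error) {
	cfg := Config{BatchSize: 100, FlushInterval: "5s"} // defaults
	expanded := os.ExpandEnv(string(raw))
	if err := json.Unmarshal([]byte(expanded), &cfg); err != nil {
		return Config{}, fmt.Errorf("parse config: %w", err)
	}
	if cfg.BackendURL == "" {
		return Config{}, errors.New("config: backendUrl is required")
	}
	if cfg.BatchSize <= 0 {
		return Config{}, errors.New("config: batchSize must be positive")
	}
	if _, err := time.ParseDuration(cfg.FlushInterval); err != nil {
		return Config{}, fmt.Errorf("config: flushInterval: %w", err)
	}
	return cfg, nil
}

// loadConfig reads ~/.devlog/collector.json, the path named in the roadmap.
func loadConfig() (Config, error) {
	home, err := os.UserHomeDir()
	if err != nil {
		return Config{}, err
	}
	raw, err := os.ReadFile(filepath.Join(home, ".devlog", "collector.json"))
	if err != nil {
		return Config{}, err
	}
	return parseConfig(raw)
}

func main() {
	cfg, err := loadConfig()
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println("backend:", cfg.BackendURL, "batch size:", cfg.BatchSize)
}
```

Keeping `parseConfig` separate from file access makes the "Write unit tests" item straightforward, since tests can pass raw bytes directly.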

### Day 4: Log Discovery
- [ ] Create `internal/watcher/discovery.go`
- [ ] Implement OS-specific log path detection:
  - [ ] GitHub Copilot paths (darwin/linux/windows)
  - [ ] Claude Code paths
  - [ ] Cursor paths
- [ ] Add glob pattern matching for version wildcards
- [ ] Implement path expansion (home dir, env vars)
- [ ] Write tests for each OS (mock filesystem)
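A minimal sketch of the path expansion and glob matching steps, using only the standard library (`filepath.Glob`, `os.ExpandEnv`). The pattern in `main` is a placeholder, not a real agent log location; the actual Copilot, Claude Code, and Cursor paths still have to be researched per OS. The function names are illustrative.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// expandPath resolves a leading "~" against home and then expands any
// $VAR or ${VAR} references. Only the "~/" form is handled here.
func expandPath(p, home string) string {
	if p == "~" || strings.HasPrefix(p, "~/") {
		p = filepath.Join(home, p[1:])
	}
	return os.ExpandEnv(p)
}

// discover expands each pattern and returns every path that currently exists.
// Wildcards such as "*" cover version-specific directory names.
func discover(patterns []string, home string) ([]string, error) {
	var found []string
	for _, pat := range patterns {
		matches, err := filepath.Glob(expandPath(pat, home))
		if err != nil {
			return nil, fmt.Errorf("bad pattern %q: %w", pat, err)
		}
		found = append(found, matches...)
	}
	return found, nil
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("cannot resolve home directory:", err)
		return
	}
	// Placeholder pattern for illustration only.
	patterns := []string{"~/.example-agent/logs/*/session-*.log"}
	paths, err := discover(patterns, home)
	if err != nil {
		fmt.Println("discovery error:", err)
		return
	}
	fmt.Printf("found %d log file(s)\n", len(paths))
}
```

Passing `home` as a parameter, rather than looking it up inside `expandPath`, is one way to keep the per-OS tests independent of the machine they run on.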

### Day 5: File Watching
- [ ] Create `internal/watcher/watcher.go`
- [ ] Implement LogWatcher using fsnotify
- [ ] Add file change detection (write events)
- [ ] Handle file rotation
- [ ] Add graceful error handling
- [ ] Implement event buffering channel
- [ ] Write integration tests
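One piece of the watcher that can be shown without fsnotify is what happens after a write event: read only the bytes appended since the last read, and start over if the file shrank. The sketch below uses only the standard library and assumes a write notification would trigger `readNew`; `tailState` and `demo` are illustrative names. Treating "file is shorter than the saved offset" as rotation is a simplification. A rotated file that is already larger than the old offset would need an identity check such as `os.SameFile`.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// tailState remembers how far into a log file we have already read.
type tailState struct {
	path   string
	offset int64
}

// readNew returns the bytes appended since the previous call. If the file is
// now shorter than the saved offset, it is assumed to have been rotated or
// truncated and reading restarts from the beginning.
func (t *tailState) readNew() ([]byte, error) {
	f, err := os.Open(t.path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	info, err := f.Stat()
	if err != nil {
		return nil, err
	}
	if info.Size() < t.offset {
		t.offset = 0
	}
	if _, err := f.Seek(t.offset, io.SeekStart); err != nil {
		return nil, err
	}
	data, err := io.ReadAll(f)
	if err != nil {
		return nil, err
	}
	t.offset += int64(len(data))
	return data, nil
}

func appendLine(path, line string) error {
	f, err := os.OpenFile(path, os.O_APPEND|os.O_WRONLY|os.O_CREATE, 0o600)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = f.WriteString(line)
	return err
}

// demo writes, appends, then replaces a temp log file, reading after each step.
func demo() ([]string, error) {
	dir, err := os.MkdirTemp("", "tail-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "agent.log")
	t := &tailState{path: path}
	steps := []func() error{
		func() error { return appendLine(path, "one\n") },
		func() error { return appendLine(path, "two\n") },
		// Simulated rotation: the file is replaced by a shorter one.
		func() error { return os.WriteFile(path, []byte("x\n"), 0o600) },
	}
	var reads []string
	for _, step := range steps {
		if err := step(); err != nil {
			return nil, err
		}
		data, err := t.readNew()
		if err != nil {
			return nil, err
		}
		reads = append(reads, string(data))
	}
	return reads, nil
}

func main() {
	reads, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", reads) // prints ["one\n" "two\n" "x\n"]
}
```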

### Days 6-7: Local Buffer (SQLite)
- [ ] Create `internal/buffer/buffer.go`
- [ ] Define SQLite schema (events table, metadata table)
- [ ] Implement Buffer initialization
- [ ] Add `Store(event)` method
- [ ] Add `GetUnsent(limit)` method
- [ ] Add `MarkSent(eventIDs)` method
- [ ] Implement size limit enforcement (cleanup old events)
- [ ] Add deduplication logic
- [ ] Write comprehensive tests
- [ ] Test offline mode behavior
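The intended semantics of `Store`, `GetUnsent`, and `MarkSent` can be pinned down before any SQLite code exists. The sketch below is an in-memory stand-in, not the planned SQLite implementation: the `Event` fields, the use of an ID as the deduplication key, and the count-based size limit are all assumptions made for illustration.

```go
package main

import "fmt"

// Event is a hypothetical minimal event; ID doubles as the dedup key.
type Event struct {
	ID      string
	Payload string
}

type record struct {
	event Event
	sent  bool
}

// memBuffer is an in-memory stand-in for the SQLite-backed buffer.
type memBuffer struct {
	maxEvents int
	order     []string // event IDs, oldest first
	records   map[string]*record
}

func newMemBuffer(maxEvents int) *memBuffer {
	return &memBuffer{maxEvents: maxEvents, records: make(map[string]*record)}
}

// Store adds an event unless its ID is already buffered (deduplication), then
// evicts the oldest events while the buffer is over its size limit.
// It reports whether the event was newly stored.
func (b *memBuffer) Store(e Event) bool {
	if _, dup := b.records[e.ID]; dup {
		return false
	}
	b.records[e.ID] = &record{event: e}
	b.order = append(b.order, e.ID)
	for len(b.order) > b.maxEvents {
		oldest := b.order[0]
		b.order = b.order[1:]
		delete(b.records, oldest)
	}
	return true
}

// GetUnsent returns up to limit unsent events, oldest first.
func (b *memBuffer) GetUnsent(limit int) []Event {
	var out []Event
	for _, id := range b.order {
		if len(out) >= limit {
			break
		}
		if r := b.records[id]; !r.sent {
			out = append(out, r.event)
		}
	}
	return out
}

// MarkSent flags the given events as delivered; unknown IDs are ignored.
func (b *memBuffer) MarkSent(ids []string) {
	for _, id := range ids {
		if r, ok := b.records[id]; ok {
			r.sent = true
		}
	}
}

func main() {
	buf := newMemBuffer(2)
	buf.Store(Event{ID: "a", Payload: "first"})
	buf.Store(Event{ID: "a", Payload: "duplicate"}) // ignored
	buf.Store(Event{ID: "b", Payload: "second"})
	buf.Store(Event{ID: "c", Payload: "third"}) // evicts "a"
	buf.MarkSent([]string{"b"})
	fmt.Println(buf.GetUnsent(10)) // prints [{c third}]
}
```

A SQLite version could get the same deduplication behavior from a primary key on the event ID combined with `INSERT OR IGNORE`, though that choice is not specified by the roadmap.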

## Phase 2: Adapter System (Days 8-12)

### Day 8: Base Adapter Infrastructure
- [ ] Create `internal/adapters/adapter.go` (interface definition)
- [ ] Create `internal/adapters/registry.go`
- [ ] Implement adapter registration
- [ ] Implement auto-detection logic (`CanHandle()`)
- [ ] Add adapter selection/routing
- [ ] Define standard event types (matches TypeScript types)
- [ ] Write base adapter tests
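As a rough sketch of the adapter interface and registry, assuming the three method names that appear in this roadmap (`AgentID()`, `CanHandle()`, `ParseEvent()`). The signatures, the `Event` fields, and the prefix-based `prefixAdapter` are hypothetical and exist only to make the example runnable.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Event is a placeholder for the standard event type.
type Event struct {
	AgentID string
	Type    string
	Raw     string
}

// Adapter uses the method names from the roadmap; signatures are assumed.
type Adapter interface {
	AgentID() string
	CanHandle(line string) bool
	ParseEvent(line string) (Event, error)
}

// Registry holds adapters in registration order.
type Registry struct {
	adapters []Adapter
}

func (r *Registry) Register(a Adapter) {
	r.adapters = append(r.adapters, a)
}

// Select returns the first registered adapter whose CanHandle accepts the line.
func (r *Registry) Select(line string) (Adapter, error) {
	for _, a := range r.adapters {
		if a.CanHandle(line) {
			return a, nil
		}
	}
	return nil, errors.New("no adapter can handle this input")
}

// prefixAdapter is a toy adapter that recognizes lines by a fixed prefix.
type prefixAdapter struct {
	id     string
	prefix string
}

func (a prefixAdapter) AgentID() string { return a.id }

func (a prefixAdapter) CanHandle(line string) bool {
	return strings.HasPrefix(line, a.prefix)
}

func (a prefixAdapter) ParseEvent(line string) (Event, error) {
	raw := strings.TrimSpace(strings.TrimPrefix(line, a.prefix))
	return Event{AgentID: a.id, Type: "unknown", Raw: raw}, nil
}

func main() {
	reg := &Registry{}
	reg.Register(prefixAdapter{id: "alpha", prefix: "alpha:"})
	reg.Register(prefixAdapter{id: "beta", prefix: "beta:"})

	a, err := reg.Select("beta: hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	ev, _ := a.ParseEvent("beta: hello")
	fmt.Println(ev.AgentID, ev.Raw) // prints beta hello
}
```

First-match selection makes registration order significant, so a fallback adapter like the planned generic one would be registered last.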

### Day 9: GitHub Copilot Adapter
- [ ] Create `internal/adapters/copilot.go`
- [ ] Research Copilot log format (collect samples)
- [ ] Implement `AgentID()` method
- [ ] Implement `CanHandle()` for Copilot detection
- [ ] Implement `ParseEvent()` with JSON parsing
- [ ] Map Copilot events to standard types:
  - completions → llm_response
  - edits → file_write
  - errors → error_encountered
- [ ] Handle Copilot-specific metadata
- [ ] Write tests with real log samples
- [ ] Document Copilot log format
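The event mapping listed above can be expressed as a small lookup table. The standard type names (`llm_response`, `file_write`, `error_encountered`) come from this roadmap; the source-side keys are placeholders, since the actual Copilot log format is still to be researched.

```go
package main

import "fmt"

// Standard event type names taken from the roadmap's mapping list.
const (
	TypeLLMResponse      = "llm_response"
	TypeFileWrite        = "file_write"
	TypeErrorEncountered = "error_encountered"
)

// copilotKinds maps a Copilot event category to a standard event type.
// The keys are placeholders; the real category names depend on the log
// format, which has not been documented yet.
var copilotKinds = map[string]string{
	"completion": TypeLLMResponse,
	"edit":       TypeFileWrite,
	"error":      TypeErrorEncountered,
}

// mapCopilotKind reports the standard type for a category, and whether the
// category is known at all so callers can log unsupported input.
func mapCopilotKind(kind string) (string, bool) {
	t, ok := copilotKinds[kind]
	return t, ok
}

func main() {
	for _, kind := range []string{"completion", "edit", "error", "telemetry"} {
		if t, ok := mapCopilotKind(kind); ok {
			fmt.Println(kind, "->", t)
		} else {
			fmt.Println(kind, "-> unsupported")
		}
	}
}
```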

### Day 10: Claude Code Adapter
- [ ] Create `internal/adapters/claude.go`
- [ ] Research Claude Code log format
- [ ] Implement adapter methods
- [ ] Map Claude events to standard types
- [ ] Handle tool_use events
- [ ] Write tests with Claude log samples
- [ ] Document Claude log format

### Days 11-12: Generic Adapter + Testing
- [ ] Create `internal/adapters/generic.go` (fallback)
- [ ] Implement best-effort parsing for unknown formats
- [ ] Integration test with all adapters
- [ ] Test adapter registry with multiple agents
- [ ] Create adapter development guide
- [ ] Add logging for unsupported log formats

## Phase 3: Backend Communication (Days 13-16)

### Day 13: HTTP Client
- [ ] Create `internal/client/client.go`
- [ ] Implement BackendClient struct
- [ ] Add connection pooling
- [ ] Add TLS/HTTPS support
- [ ] Implement authentication (Bearer token)
- [ ] Add request timeout configuration
- [ ] Write client unit tests
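A possible starting point for `BackendClient`, using only `net/http`. The `/events` path, the host name, and the constructor signature are assumptions for illustration. Reusing one `http.Client` lets the default transport keep connections alive between requests, which covers basic connection pooling, and an `https://` base URL gets TLS from the default transport.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"time"
)

// BackendClient wraps a single reusable http.Client.
type BackendClient struct {
	baseURL string
	token   string
	http    *http.Client
}

func NewBackendClient(baseURL, token string, timeout time.Duration) *BackendClient {
	return &BackendClient{
		baseURL: baseURL,
		token:   token,
		http:    &http.Client{Timeout: timeout}, // per-request timeout
	}
}

// newRequest builds an authenticated POST. "/events" is a hypothetical path.
func (c *BackendClient) newRequest(body []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, c.baseURL+"/events", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+c.token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// Send posts the body and treats any status of 300 or above as a failure.
func (c *BackendClient) Send(body []byte) error {
	req, err := c.newRequest(body)
	if err != nil {
		return err
	}
	resp, err := c.http.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("backend returned %s", resp.Status)
	}
	return nil
}

func main() {
	c := NewBackendClient("https://backend.example.invalid", "test-token", 10*time.Second)
	req, err := c.newRequest([]byte(`[]`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Only the request is built here; nothing is sent over the network.
	fmt.Println(req.Method, req.URL, req.Header.Get("Authorization"))
}
```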

### Day 14: Batch Manager
- [ ] Create `internal/client/batch.go`
- [ ] Implement BatchManager
- [ ] Add event batching logic (100 events or 5s interval)
- [ ] Implement gzip compression
- [ ] Add batch size optimization
- [ ] Handle batch failures gracefully
- [ ] Write batching tests
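The "100 events or 5s interval" rule and the gzip step might look something like the sketch below. It is not the planned `BatchManager`: `runBatcher`, `compressBatch`, and the single-field `Event` are illustrative names, and JSON is assumed as the wire encoding.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
	"time"
)

// Event is a placeholder payload.
type Event struct {
	Type string `json:"type"`
}

// runBatcher reads events until in is closed. It flushes when maxSize events
// are pending, when interval elapses with at least one pending event, and
// once more on shutdown so nothing is dropped.
func runBatcher(in <-chan Event, maxSize int, interval time.Duration, flush func([]Event)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	batch := make([]Event, 0, maxSize)
	emit := func() {
		if len(batch) == 0 {
			return
		}
		flush(batch)
		batch = make([]Event, 0, maxSize) // fresh slice; flush may keep the old one
	}
	for {
		select {
		case e, ok := <-in:
			if !ok {
				emit()
				return
			}
			batch = append(batch, e)
			if len(batch) >= maxSize {
				emit()
			}
		case <-ticker.C:
			emit()
		}
	}
}

// compressBatch encodes a batch as JSON and gzips the result.
func compressBatch(batch []Event) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if err := json.NewEncoder(zw).Encode(batch); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil { // Close flushes the gzip footer
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	in := make(chan Event)
	done := make(chan struct{})
	go func() {
		defer close(done)
		runBatcher(in, 100, 5*time.Second, func(b []Event) {
			body, err := compressBatch(b)
			if err != nil {
				fmt.Println("compress error:", err)
				return
			}
			fmt.Printf("flushing %d events (%d bytes gzipped)\n", len(b), len(body))
		})
	}()
	for i := 0; i < 250; i++ {
		in <- Event{Type: "llm_response"}
	}
	close(in)
	<-done
}
```

With 250 events sent back to back, this produces two full batches of 100 and a final batch of 50 when the channel closes.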

### Day 15: Retry Logic
- [ ] Implement exponential backoff
- [ ] Add max retry limit
- [ ] Handle network failures
- [ ] Implement circuit breaker pattern
- [ ] Add retry statistics/metrics
- [ ] Test with unreliable network simulation
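A minimal sketch of exponential backoff with a max retry limit. The function names and parameters are illustrative. Jitter, the circuit breaker, and retry metrics from the list above are deliberately left out to keep the core loop visible.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// backoffDelay returns base doubled attempt times, capped at max.
func backoffDelay(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= max {
			return max
		}
	}
	if d > max {
		return max
	}
	return d
}

// retry calls op up to maxAttempts times, waiting with exponential backoff
// between failures. It stops early if ctx is cancelled while waiting.
func retry(ctx context.Context, maxAttempts int, base, max time.Duration, op func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err = op(); err == nil {
			return nil
		}
		if attempt == maxAttempts-1 {
			break // no point sleeping after the final attempt
		}
		select {
		case <-time.After(backoffDelay(attempt, base, max)):
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	err := retry(context.Background(), 5, 10*time.Millisecond, time.Second, func() error {
		calls++
		if calls < 3 {
			return errors.New("simulated network failure")
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err) // prints calls: 3 err: <nil>
}
```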

### Day 16: End-to-End Integration
- [ ] Wire all components together in `main.go`
- [ ] Implement graceful shutdown (SIGINT/SIGTERM)
- [ ] Add startup validation
- [ ] Test complete flow: watch → parse → buffer → send
- [ ] Test offline → online transition
- [ ] Performance profiling
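Graceful shutdown on SIGINT/SIGTERM can be sketched with `signal.NotifyContext` from the standard library. The three stage names mirror the watch, parse, and send steps of the pipeline, but the stage bodies are placeholders. The two-second timeout exists only so the example terminates without a signal; a real collector would omit it.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"sync"
	"syscall"
	"time"
)

// run starts placeholder pipeline stages, blocks until done is closed, and
// waits for every stage to finish. It returns the number of stages stopped.
func run(done <-chan struct{}) int {
	stages := []string{"watcher", "parser", "sender"}
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		stopped int
	)
	for _, name := range stages {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			<-done // placeholder for the stage's real work loop
			mu.Lock()
			stopped++
			mu.Unlock()
			fmt.Println(name, "stopped")
		}(name)
	}
	wg.Wait()
	return stopped
}

func main() {
	// Cancel the context on Ctrl+C (SIGINT) or SIGTERM.
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()

	// Demo only: also stop after two seconds so this sketch ends by itself.
	ctx, cancel := context.WithTimeout(ctx, 2*time.Second)
	defer cancel()

	fmt.Println("collector running; press Ctrl+C to stop")
	n := run(ctx.Done())
	fmt.Println("shutdown complete,", n, "stages stopped")
}
```

Waiting on the `WaitGroup` before returning is what gives each stage a chance to flush pending work, for example writing unsent events to the local buffer.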

## Phase 4: Distribution (Days 17-20)

### Day 17: Build System
- [ ] Finalize cross-compilation script (set up on Day 2)
- [ ] Build for all platforms:
  - darwin/amd64
  - darwin/arm64
  - linux/amd64
  - linux/arm64
  - windows/amd64
- [ ] Optimize binary size (strip symbols, UPX compression)
- [ ] Test binaries on each platform
- [ ] Measure binary sizes

### Day 18: NPM Package
- [ ] Create `packages/collector-npm/` directory
- [ ] Create `package.json` for `@codervisor/devlog-collector`
- [ ] Add post-install script
- [ ] Bundle platform-specific binaries
- [ ] Create platform detection logic
- [ ] Test npm install on all platforms
- [ ] Publish to npm (test registry first)

### Day 19: Auto-start Configuration
- [ ] Create macOS launchd plist template
- [ ] Create Linux systemd service template
- [ ] Create Windows service installer (optional)
- [ ] Add install script for auto-start setup
- [ ] Add uninstall script
- [ ] Test auto-start on each platform
- [ ] Document manual setup steps

### Day 20: Documentation
- [ ] Write comprehensive README
- [ ] Add installation guide
- [ ] Document configuration options
- [ ] Add troubleshooting section
- [ ] Create architecture diagram
- [ ] Document performance characteristics
- [ ] Add contribution guide for new adapters

## Testing Strategy

### Unit Tests
- [ ] All adapters (with real log samples)
- [ ] Buffer operations
- [ ] Config loading and validation
- [ ] Event parsing and transformation

### Integration Tests
- [ ] Full pipeline: watch → parse → buffer → send
- [ ] Multi-agent concurrent collection
- [ ] Offline mode and recovery
- [ ] Error handling and retry

### Performance Tests
- [ ] Measure event processing throughput
- [ ] Test with high-volume log generation
- [ ] Memory usage profiling
- [ ] CPU usage monitoring
- [ ] Battery impact assessment (macOS)

### Platform Tests
- [ ] macOS (Intel + Apple Silicon)
- [ ] Linux (Ubuntu, Fedora)
- [ ] Windows 10/11

## Success Criteria

- [ ] Binary size < 20MB (uncompressed)
- [ ] Memory usage < 50MB (typical)
- [ ] CPU usage < 1% (idle), < 5% (active)
- [ ] Event processing > 1K events/sec
- [ ] Startup time < 1 second
- [ ] Works offline, syncs when online
- [ ] Handles log rotation gracefully
- [ ] Cross-platform compatibility verified
- [ ] NPM package installable and functional

## Risk Mitigation

### Technical Risks
- **Log format changes**: Adapters may break with agent updates
  - Mitigation: Version detection, graceful fallbacks, monitoring

- **Platform-specific issues**: File paths, permissions vary by OS
  - Mitigation: Extensive testing, clear error messages

- **Performance impact**: Collector shouldn't slow down development
  - Mitigation: Benchmarking, resource limits, efficient algorithms

### Operational Risks
- **User adoption**: Developers may resist installing collectors
  - Mitigation: Easy install (npm), clear value proposition, minimal footprint

- **Privacy concerns**: Developers may worry about data collection
  - Mitigation: Clear documentation, opt-in, local-first design, data controls

## Timeline Summary

- **Days 1-2**: Setup (10%)
- **Days 3-7**: Core Infrastructure (25%)
- **Days 8-12**: Adapters (25%)
- **Days 13-16**: Backend Communication (20%)
- **Days 17-20**: Distribution (20%)

**Total: ~20 days (4 weeks)** for production-ready collector

## Next Actions

1. Start with Day 1 setup
2. Get basic skeleton compiling and running
3. Implement one adapter (Copilot) end-to-end as proof of concept
4. Iterate based on learnings
