|
| 1 | +--- |
| 2 | +status: planned |
| 3 | +created: '2025-12-05' |
| 4 | +tags: |
| 5 | + - architecture |
| 6 | + - collector |
| 7 | + - sync |
| 8 | + - storage |
| 9 | +priority: medium |
| 10 | +created_at: '2025-12-05T05:44:09.643Z' |
| 11 | +depends_on: |
| 12 | + - 016-automatic-historical-sync |
| 13 | +updated_at: '2025-12-05T05:44:09.651Z' |
| 14 | +--- |
| 15 | + |
| 16 | +# Local-First Collector Architecture |
| 17 | + |
| 18 | +> **Status**: 🗓️ Planned · **Priority**: Medium · **Created**: 2025-12-05 · **Depends on**: 016-automatic-historical-sync |
| 19 | + |
| 20 | +## Problem Statement |
| 21 | + |
| 22 | +The current architecture sends events directly to the remote and buffers them locally only when a send fails. This creates four problems: |
| 23 | + |
| 24 | +1. **Privacy**: All data goes to the remote by default, so users have no control over what is shared |
| 25 | +2. **Offline**: Collection fails when the network is unavailable |
| 26 | +3. **Single destination**: Can't route work projects to a team server and personal projects to a private one |
| 27 | +4. **No local querying**: Can't explore data before sharing it |
| 28 | + |
| 29 | +## Design |
| 30 | + |
| 31 | +### Core Principle: Collect Local, Export Selective |
| 32 | + |
| 33 | +Inspired by Fluent Bit, Vector, and OpenTelemetry patterns: |
| 34 | + |
| 35 | +``` |
| 36 | +┌─────────────────────────────────────────────────────────────┐ |
| 37 | +│ DEVLOG COLLECTOR │ |
| 38 | +├─────────────────────────────────────────────────────────────┤ |
| 39 | +│ LAYER 1: COLLECT (always on, all workspaces) │ |
| 40 | +│ Agent Logs → Parser → Local SQLite │ |
| 41 | +│ • Works offline │ |
| 42 | +│ • All data captured │ |
| 43 | +│ • No remote dependency │ |
| 44 | +├─────────────────────────────────────────────────────────────┤ |
| 45 | +│ LAYER 2: EXPORT (selective, multi-destination) │ |
| 46 | +│ Local SQLite → Remote(s) based on routing rules │ |
| 47 | +│ • Workspace pattern matching │ |
| 48 | +│ • Multiple remotes │ |
| 49 | +│ • Auto-sync or manual │ |
| 50 | +└─────────────────────────────────────────────────────────────┘ |
| 51 | +``` |
| 52 | + |
| 53 | +### Data Flow |
| 54 | + |
| 55 | +``` |
| 56 | +Agent Logs Local Store Remote(s) |
| 57 | + │ │ │ |
| 58 | + ▼ ▼ ▼ |
| 59 | +┌─────────┐ ┌──────────┐ ┌─────────┐ |
| 60 | +│ Copilot │──┐ │ │ ┌─────▶│ Team │ |
| 61 | +│ Logs │ │ │ SQLite │ │ │ Server │ |
| 62 | +└─────────┘ │ Parse │ events │ Export └─────────┘ |
| 63 | + ├──────────▶│ .db │─────┤ |
| 64 | +┌─────────┐ │ │ │ │ ┌─────────┐ |
| 65 | +│ Claude │──┤ │ Cursors │ └─────▶│Personal │ |
| 66 | +│ Logs │ │ │ .db │ │ Server │ |
| 67 | +└─────────┘ │ └──────────┘ └─────────┘ |
| 68 | + │ │ |
| 69 | +┌─────────┐ │ │ Query locally |
| 70 | +│ Cursor │──┘ ▼ |
| 71 | +│ Logs │ ┌──────────┐ |
| 72 | +└─────────┘ │ devlog │ |
| 73 | + │ query │ |
| 74 | + └──────────┘ |
| 75 | +``` |
| 76 | + |
| 77 | +### Configuration Schema |
| 78 | + |
| 79 | +```yaml |
| 80 | +# ~/.devlog/config.yaml |
| 81 | + |
| 82 | +collect: |
| 83 | + enabled: true |
| 84 | + agents: [copilot, claude, cursor] |
| 85 | + |
| 86 | +storage: |
| 87 | + path: ~/.devlog/data/ |
| 88 | + events_db: events.db |
| 89 | + cursors_db: cursors.db |
| 90 | + retention_days: 365 |
| 91 | + |
| 92 | +export: |
| 93 | + remotes: |
| 94 | + # Team remote - auto-sync work projects |
| 95 | + team: |
| 96 | + url: https://devlog.company.com |
| 97 | + api_key: ${DEVLOG_TEAM_API_KEY} |
| 98 | + auto_sync: true |
| 99 | + sync_interval: 30s |
| 100 | + workspaces: |
| 101 | + include: |
| 102 | + - "~/work/**" |
| 103 | + - "~/company/**" |
| 104 | + exclude: |
| 105 | + - "**/personal/**" |
| 106 | + - "**/scratch/**" |
| 107 | + |
| 108 | + # Personal remote - manual export only |
| 109 | + personal: |
| 110 | + url: https://my-devlog.io |
| 111 | + api_key: ${DEVLOG_PERSONAL_API_KEY} |
| 112 | + auto_sync: false |
| 113 | + workspaces: |
| 114 | + include: |
| 115 | + - "~/personal/**" |
| 116 | + - "~/side-projects/**" |
| 117 | + |
| 118 | +# Default: collect everything, export nothing (opt-in) |
| 119 | +# Or: collect everything, export to default remote (opt-out) |
| 120 | +export_default: none # or "team" |
| 121 | +``` |
| 122 | + |
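The `include`/`exclude` lists above imply a per-remote matching step. The following is a minimal sketch of one way that step could work, not the planned implementation. It assumes that `**` matches zero or more path segments, that `exclude` takes precedence over `include`, and that a workspace matching no `include` pattern is not exported. The names `WorkspaceRules`, `Allows`, `globMatch`, and `expandHome` are hypothetical.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// WorkspaceRules mirrors the include/exclude lists of one remote (hypothetical type).
type WorkspaceRules struct {
	Include []string
	Exclude []string
}

// expandHome replaces a leading "~" with the given home directory.
func expandHome(p, home string) string {
	if p == "~" {
		return home
	}
	if strings.HasPrefix(p, "~/") {
		return home + p[1:]
	}
	return p
}

// splitPath breaks a slash-separated path into its segments.
func splitPath(p string) []string {
	return strings.Split(strings.Trim(p, "/"), "/")
}

// matchSegments matches path segments against pattern segments.
// A "**" pattern segment matches zero or more path segments; every
// other segment is matched with path.Match (so "*" and "?" work).
func matchSegments(pattern, name []string) bool {
	if len(pattern) == 0 {
		return len(name) == 0
	}
	if pattern[0] == "**" {
		for i := 0; i <= len(name); i++ {
			if matchSegments(pattern[1:], name[i:]) {
				return true
			}
		}
		return false
	}
	if len(name) == 0 {
		return false
	}
	ok, err := path.Match(pattern[0], name[0])
	if err != nil || !ok {
		return false
	}
	return matchSegments(pattern[1:], name[1:])
}

// globMatch reports whether workspace matches a "**"-aware pattern.
func globMatch(pattern, workspace string) bool {
	return matchSegments(splitPath(pattern), splitPath(workspace))
}

// Allows reports whether events from workspace may go to this remote.
// Assumption: exclude wins over include, and no include match means no export.
func (r WorkspaceRules) Allows(workspace, home string) bool {
	for _, p := range r.Exclude {
		if globMatch(expandHome(p, home), workspace) {
			return false
		}
	}
	for _, p := range r.Include {
		if globMatch(expandHome(p, home), workspace) {
			return true
		}
	}
	return false
}

func main() {
	team := WorkspaceRules{
		Include: []string{"~/work/**", "~/company/**"},
		Exclude: []string{"**/personal/**", "**/scratch/**"},
	}
	home := "/home/dev" // example home directory
	for _, ws := range []string{
		"/home/dev/work/api",
		"/home/dev/work/scratch/tmp",
		"/home/dev/personal/blog",
	} {
		fmt.Printf("%-28s export=%v\n", ws, team.Allows(ws, home))
	}
}
```

Go's standard `path.Match` does not understand `**`, which is why the sketch splits paths into segments and handles `**` itself. A real implementation might use a glob library instead.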
| 123 | +### Two Cursor Types |
| 124 | + |
| 125 | +```go |
| 126 | +// Collection cursor: "What have I parsed from log files?" |
| 127 | +type CollectCursor struct { |
| 128 | + AgentName string // "github-copilot" |
| 129 | + SourcePath string // "/path/to/chatSessions/abc.json" |
| 130 | + LastByteOffset int64 // Resume position in file |
| 131 | + LastEventTime time.Time // Latest event timestamp seen |
| 132 | +} |
| 133 | + |
| 134 | +// Export cursor: "What have I sent to remote X?" |
| 135 | +type ExportCursor struct { |
| 136 | + RemoteName string // "team" |
| 137 | + LastEventID string // Last event ID sent |
| 138 | + LastExportTime time.Time // When last export happened |
| 139 | + Status string // "synced", "pending", "error" |
| 140 | +} |
| 141 | +``` |
| 142 | + |
| 143 | +This separation allows: |
| 144 | +- Collection runs independently of export |
| 145 | +- Add new remote later → backfill just that remote |
| 146 | +- Remote down → collection continues, export retries |
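A minimal sketch of how a per-remote export cursor could drive backfill for a newly added remote. It assumes that events are kept in insertion order, that an empty `LastEventID` means nothing has been exported yet, and that an unknown ID falls back to a full re-export. The `Event` type and `pendingFor` function are hypothetical, and an in-memory slice stands in for the SQLite store.

```go
package main

import "fmt"

// Event is a simplified stand-in for a stored event (hypothetical type).
type Event struct {
	ID        string
	Workspace string
}

// pendingFor returns the events stored after lastEventID, in insertion order.
// An empty lastEventID means the remote has received nothing yet.
func pendingFor(events []Event, lastEventID string) []Event {
	if lastEventID == "" {
		return events
	}
	for i, e := range events {
		if e.ID == lastEventID {
			return events[i+1:]
		}
	}
	// Assumption: an unknown cursor (for example, a pruned event)
	// falls back to exporting everything again.
	return events
}

func main() {
	events := []Event{
		{ID: "e1", Workspace: "/home/dev/work/api"},
		{ID: "e2", Workspace: "/home/dev/work/api"},
		{ID: "e3", Workspace: "/home/dev/work/web"},
	}
	// Remote name -> LastEventID. "personal" was added later and has no cursor.
	cursors := map[string]string{"team": "e2"}
	for _, remote := range []string{"team", "personal"} {
		pending := pendingFor(events, cursors[remote])
		fmt.Printf("%s: %d pending event(s)\n", remote, len(pending))
	}
}
```

Because each remote has its own cursor, the new `personal` remote is backfilled from the start while `team` only receives events it has not yet seen.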
| 147 | + |
| 148 | +### CLI Commands |
| 149 | + |
| 150 | +```bash |
| 151 | +# Start collector (collection always runs) |
| 152 | +devlog start |
| 153 | + |
| 154 | +# Check local data |
| 155 | +devlog query --workspace ~/work/myproject --last 7d |
| 156 | +devlog stats --local |
| 157 | + |
| 158 | +# Export commands |
| 159 | +devlog export --remote team # Manual export to team |
| 160 | +devlog export --remote personal --all # Export all matching events |
| 161 | +devlog export --status # Show export status per remote |
| 162 | + |
| 163 | +# Remote management |
| 164 | +devlog remote add personal https://my-devlog.io |
| 165 | +devlog remote list |
| 166 | +devlog remote test team # Test connectivity |
| 167 | +``` |
| 168 | + |
| 169 | +## Plan |
| 170 | + |
| 171 | +### Phase 1: Refactor Storage Layer |
| 172 | + |
| 173 | +- [ ] Create `internal/storage/` package |
| 174 | +- [ ] Move buffer.go logic to storage layer |
| 175 | +- [ ] Add events table with workspace metadata |
| 176 | +- [ ] Add collect_cursors table |
| 177 | +- [ ] Add export_cursors table (per-remote) |
| 178 | + |
| 179 | +### Phase 2: Decouple Collection from Export |
| 180 | + |
| 181 | +- [ ] Collection writes to local SQLite only |
| 182 | +- [ ] Remove direct client.SendEvent from collection path |
| 183 | +- [ ] Add workspace path extraction to events |
| 184 | +- [ ] Update BackfillManager to use new storage |
| 185 | + |
| 186 | +### Phase 3: Export Manager |
| 187 | + |
| 188 | +- [ ] Create `internal/export/` package |
| 189 | +- [ ] Implement ExportManager with per-remote cursors |
| 190 | +- [ ] Add workspace pattern matching (glob) |
| 191 | +- [ ] Add background export goroutine |
| 192 | +- [ ] Implement retry with exponential backoff |
| 193 | + |
| 194 | +### Phase 4: Multi-Remote Configuration |
| 195 | + |
| 196 | +- [ ] Extend config schema for multiple remotes |
| 197 | +- [ ] Add remote management CLI commands |
| 198 | +- [ ] Add `devlog export` CLI commands |
| 199 | +- [ ] Add export status/progress reporting |
| 200 | + |
| 201 | +### Phase 5: Local Query Support |
| 202 | + |
| 203 | +- [ ] Add `devlog query` command |
| 204 | +- [ ] Add `devlog stats --local` command |
| 205 | +- [ ] Simple filtering by workspace, time range, event type |
| 206 | + |
| 207 | +## Test |
| 208 | + |
| 209 | +- [ ] Collection works with no network (airplane mode) |
| 210 | +- [ ] Events stored locally with correct workspace metadata |
| 211 | +- [ ] Export sends only matching workspaces per remote |
| 212 | +- [ ] Add remote later → can backfill historical data |
| 213 | +- [ ] Remote down → collection continues, export retries |
| 214 | +- [ ] Multiple remotes receive correct filtered data |
| 215 | +- [ ] `devlog query` returns local data correctly |
| 216 | + |
| 217 | +## Notes |
| 218 | + |
| 219 | +### Industry Patterns Applied |
| 220 | + |
| 221 | +| Pattern | Source | How We Apply | |
| 222 | +|---------|--------|--------------| |
| 223 | +| Memory + disk buffer | Fluent Bit | SQLite as durable store | |
| 224 | +| Fan-out to sinks | OTel, Vector | Multiple remotes | |
| 225 | +| Tag-based routing | Fluent Bit | Workspace pattern matching | |
| 226 | +| Cursor/checkpoint | All | Per-source + per-destination cursors | |
| 227 | +| Backpressure handling | Fluent Bit | Local buffer absorbs spikes | |
| 228 | + |
| 229 | +### Migration Path |
| 230 | + |
| 231 | +From current architecture: |
| 232 | +1. Spec 016 adds auto-sync (keep current single-remote) |
| 233 | +2. This spec adds local-first + multi-remote |
| 234 | +3. Existing users: seamless upgrade (local DB created automatically) |
| 235 | +4. New users: opt-in to remotes (privacy by default) |
| 236 | + |
| 237 | +### Open Questions |
| 238 | + |
| 239 | +1. **Default behavior**: Collect all + export none (privacy) vs export to default remote? |
| 240 | +2. **Query language**: Simple filters or full SQL access to local DB? |
| 241 | +3. **Storage limits**: Auto-prune after N days, or let user manage? |
| 242 | +4. **Encryption**: Encrypt local SQLite at rest? |