Commit 5f21940

specs(collector,copilot): add spec for Copilot collector array value support
Document failure case where response[].value can be an array, propose using json.RawMessage and content/value handling, include parsing strategy (string/array), implementation plan, and test/validation checklist to ensure backward-compatible parsing of new Copilot formats.
1 parent 9936a24 commit 5f21940

1 file changed (+164, -0 lines)
  • specs/20251102/002-copilot-collector-array-value-support

@@ -0,0 +1,164 @@
---
status: in-progress
created: 2025-11-02T00:00:00.000Z
tags:
- collector
- copilot
- bug
priority: high
---

# Copilot Collector Array Value Support

> **Status**: 🔨 In progress · **Priority**: High · **Created**: 2025-11-02 · **Tags**: collector, copilot, bug

**Status**: 🔨 In progress
**Created**: 2025-11-02
**Priority**: High

---

## Overview

The Copilot collector fails to parse recent chat session files in which the `response[].value` field is an array instead of a string. This currently affects 1 of 63 files (~1.6%), but it reflects a compatibility gap with newer Copilot chat formats, so the share of affected files is expected to grow.

## Problem Statement / Current State

**Current Status:**

- ✅ Successfully imports 62/63 chat files (~98.4% success rate)
- ❌ 1 file fails: `571316aa-c122-405c-aac7-b02ea42d15e0.json` (Oct 28, 2024 - recent file)
- ❌ Error: `json: cannot unmarshal array into Go struct field CopilotResponseItem.requests.response.value of type string`

**Impact:**

- ~30 events from recent sessions cannot be imported
- Newer Copilot chat format is not supported
- More files will be affected as the new format becomes standard

**Root Cause:**
The Go struct declares `Value` as `string`, but newer Copilot responses include items whose `value` is an array, so `encoding/json` rejects the whole file.

**Example from failing file:**

```json
{
  "response": [
    {
      "kind": "thinking",
      "value": "string content..."
    },
    {
      "kind": "progressTaskSerialized",
      "content": {
        "value": ["array", "of", "items"]
      }
    }
  ]
}
```

## Objectives

1. **Parse all Copilot chat formats** - Support both string and array value types
2. **Zero data loss** - Successfully import events from all chat files
3. **Backward compatibility** - Existing files continue to work
4. **Graceful degradation** - Handle unknown formats without crashing

## Design

### Current Implementation

```go
// packages/collector/internal/adapters/copilot_adapter.go:78
type CopilotResponseItem struct {
	Value string `json:"value,omitempty"` // ❌ Only supports string
}
```

### Proposed Solution: Use json.RawMessage

```go
type CopilotResponseItem struct {
	Kind    *string         `json:"kind"`
	Value   json.RawMessage `json:"value,omitempty"`   // ✅ Flexible
	Content *CopilotContent `json:"content,omitempty"` // ✅ New field
	// ... other existing fields
}

type CopilotContent struct {
	Value json.RawMessage `json:"value,omitempty"`
}
```
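
As a quick check of the proposed shape, the sketch below decodes the example payload into the structs above, trimmed to the three fields shown. The `session` wrapper, the `sample` constant, and `decodeSession` are hypothetical helpers for illustration; the real adapter has more fields and a different entry point.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type CopilotContent struct {
	Value json.RawMessage `json:"value,omitempty"`
}

type CopilotResponseItem struct {
	Kind    *string         `json:"kind"`
	Value   json.RawMessage `json:"value,omitempty"`
	Content *CopilotContent `json:"content,omitempty"`
}

// session is a hypothetical wrapper holding only the response list.
type session struct {
	Response []CopilotResponseItem `json:"response"`
}

// sample reproduces the two item shapes from the failing file's example.
const sample = `{
  "response": [
    {"kind": "thinking", "value": "string content..."},
    {"kind": "progressTaskSerialized", "content": {"value": ["array","of","items"]}}
  ]
}`

// decodeSession decodes a payload; RawMessage fields keep their raw JSON bytes.
func decodeSession(data []byte) (session, error) {
	var s session
	err := json.Unmarshal(data, &s)
	return s, err
}

func main() {
	s, err := decodeSession([]byte(sample))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for _, item := range s.Response {
		kind := "<nil>"
		if item.Kind != nil {
			kind = *item.Kind
		}
		fmt.Printf("kind=%s value=%s", kind, item.Value)
		if item.Content != nil {
			fmt.Printf(" content.value=%s", item.Content.Value)
		}
		fmt.Println()
	}
}
```

Decoding no longer depends on the type of `value`; interpretation of the raw bytes is deferred to a later step.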

**Parsing logic:**

```go
// Decode item.Value, which may be a JSON string or an array of strings.
var strValue string
if len(item.Value) > 0 { // field may be absent (e.g. only content.value is set)
	if err := json.Unmarshal(item.Value, &strValue); err != nil {
		// Not a string: try an array of strings
		var arrValue []string
		if err := json.Unmarshal(item.Value, &arrValue); err == nil {
			// Join array or process elements
			strValue = strings.Join(arrValue, "\n")
		}
		// Otherwise the shape is unknown: strValue stays empty and the
		// caller should log a warning rather than fail the whole file.
	}
}
```

## Implementation Plan

### Phase 1: Quick Fix 🚀 **Immediate** (2-4 hours)

- [ ] Add error handling to skip unparseable items gracefully
- [ ] Log warnings with file path and item kind for investigation
- [ ] Test with failing file - verify no crashes
- [ ] Release to unblock data collection
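
One way to realize the Phase 1 steps is to decode each response item independently, so a single bad item no longer fails the whole file. This is a sketch under assumptions: `legacyItem`, `kindOnly`, and `decodeItemsLenient` are hypothetical names, the input is assumed to be the `response` array on its own, and the standard `log` package stands in for whatever logger the collector uses.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// legacyItem mirrors the current string-only shape (hypothetical name).
type legacyItem struct {
	Kind  string `json:"kind"`
	Value string `json:"value,omitempty"`
}

// kindOnly extracts just the kind so a warning can name the skipped item.
type kindOnly struct {
	Kind string `json:"kind"`
}

// decodeItemsLenient decodes response items one by one. Items that do not
// fit legacyItem are logged with the file path and kind, then skipped.
func decodeItemsLenient(path string, data []byte) ([]legacyItem, int, error) {
	var raws []json.RawMessage
	if err := json.Unmarshal(data, &raws); err != nil {
		return nil, 0, fmt.Errorf("%s: response is not an array: %w", path, err)
	}
	var items []legacyItem
	skipped := 0
	for i, raw := range raws {
		var item legacyItem
		if err := json.Unmarshal(raw, &item); err != nil {
			var k kindOnly
			_ = json.Unmarshal(raw, &k) // best effort; kind may stay empty
			log.Printf("WARN %s: skipping response[%d] (kind=%q): %v", path, i, k.Kind, err)
			skipped++
			continue
		}
		items = append(items, item)
	}
	return items, skipped, nil
}

func main() {
	data := []byte(`[
		{"kind":"thinking","value":"first"},
		{"kind":"progressTaskSerialized","value":["array","of","items"]},
		{"kind":"thinking","value":"second"}
	]`)
	items, skipped, err := decodeItemsLenient("example.json", data)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("imported=%d skipped=%d\n", len(items), skipped)
}
```

This trades completeness for availability: skipped items are still lost until Phase 2 lands, but the remaining events in the file are imported.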

### Phase 2: Full Support 📋 **Follow-up** (1-2 days)

- [ ] Update `CopilotResponseItem` struct to use `json.RawMessage` for `Value`
- [ ] Add `Content` field with nested value support
- [ ] Implement parsing logic to handle string/array/nested variants
- [ ] Update event extraction logic to handle array values appropriately
- [ ] Add test fixtures for all format variations
- [ ] Add unit tests for parsing logic
- [ ] Integration test with problematic file
- [ ] Update documentation
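
The string, array, and nested variants could be exercised with a small fixture table like the one below. This is an illustrative sketch: `responseItem`, `rawText`, and `itemText` are hypothetical names, and preferring the top-level `value` over `content.value` is an assumed precedence that the spec does not define.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

type content struct {
	Value json.RawMessage `json:"value,omitempty"`
}

// responseItem is a trimmed, hypothetical stand-in for CopilotResponseItem.
type responseItem struct {
	Value   json.RawMessage `json:"value,omitempty"`
	Content *content        `json:"content,omitempty"`
}

// rawText decodes raw as a string or an array of strings.
// The boolean is false when raw is absent or has another shape.
func rawText(raw json.RawMessage) (string, bool) {
	if len(raw) == 0 {
		return "", false
	}
	var s string
	if json.Unmarshal(raw, &s) == nil {
		return s, true
	}
	var arr []string
	if json.Unmarshal(raw, &arr) == nil {
		return strings.Join(arr, "\n"), true
	}
	return "", false
}

// itemText returns the item's text, trying the top-level value first
// and falling back to content.value (assumed precedence).
func itemText(data []byte) (string, error) {
	var item responseItem
	if err := json.Unmarshal(data, &item); err != nil {
		return "", err
	}
	if text, ok := rawText(item.Value); ok {
		return text, nil
	}
	if item.Content != nil {
		if text, ok := rawText(item.Content.Value); ok {
			return text, nil
		}
	}
	return "", nil
}

func main() {
	fixtures := []struct{ name, input, want string }{
		{"string", `{"value":"hello"}`, "hello"},
		{"array", `{"value":["a","b"]}`, "a\nb"},
		{"nested", `{"content":{"value":["x","y"]}}`, "x\ny"},
	}
	for _, f := range fixtures {
		got, err := itemText([]byte(f.input))
		fmt.Printf("%s: match=%t err=%v\n", f.name, got == f.want, err)
	}
}
```

The same table can be moved into `copilot_adapter_test.go` as a table-driven test once the real parsing function exists.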

## Success Criteria

- [ ] All 63 chat files parse successfully (0 errors)
- [ ] Events extracted from previously failing file
- [ ] Backward compatible - existing 62 files still work
- [ ] Tests cover string, array, and nested value formats
- [ ] Zero parsing errors in production backfill

## Timeline

**Estimated Effort**:

- Phase 1: 2-4 hours
- Phase 2: 1-2 days

## References

### Files to Modify

- `packages/collector/internal/adapters/copilot_adapter.go` - Struct definition and parsing logic
- `packages/collector/internal/adapters/copilot_adapter_test.go` - Add test cases (to be created)

### Test Files

- Failed file: `/Users/marvzhang/Library/Application Support/Code - Insiders/User/workspaceStorage/5987bb38e8bfe2022dbffb3d3bdd5fd7/chatSessions/571316aa-c122-405c-aac7-b02ea42d15e0.json`
- Working files: Any of the other 62 successfully parsed files

### Related Issues

- Backfill output showing: `Processed: 2960, Skipped: 0, Errors: 1`
- Current stats: 2,930/2,960 events imported (99% success)
