Skip to content

Commit 27c4130

Browse files
committed
docs(viewer): add timeline visualizer and eval integration design
1 parent bfafcba commit 27c4130

File tree

1 file changed

+169
-0
lines changed

1 file changed

+169
-0
lines changed

docs/viewer_eval_integration.md

Lines changed: 169 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,169 @@
1+
# Viewer Enhancements: Timeline, Evaluation & Transcript Integration
2+
3+
## Timeline Visualizer / Scrubber
4+
5+
### Concept
6+
A horizontal timeline bar that visually represents the entire capture duration with:
7+
- **Transcript segments** as colored regions (synced with audio)
8+
- **Action markers** at specific timestamps (clicks, types, etc.)
9+
- **Current position indicator** (playhead)
10+
- **Segment boundaries** showing where transcript segments start/end
11+
12+
### Visual Design
13+
```
14+
┌─────────────────────────────────────────────────────────────────┐
15+
│ ▼ ▼ ▼ ▼ │
16+
│ [ Segment 1 ][ Segment 2 ][ Seg 3 ][ Segment 4 ] │
17+
│ ● ● ● ● ● ● ● │
18+
│ ↑playhead │
19+
└─────────────────────────────────────────────────────────────────┘
20+
▼ = segment boundaries
21+
● = action markers (clicks/types)
22+
```
23+
24+
### Interactions
25+
- **Click anywhere** → seek to that time (both steps and audio)
26+
- **Hover segment** → show transcript text tooltip
27+
- **Click segment** → highlight corresponding transcript text
28+
- **Action markers** → different colors by action type (click=red, type=green, scroll=purple)
29+
30+
### Data Sources
31+
- `transcript.json` → segment boundaries and text
32+
- `baseData` → action timestamps and types
33+
- `audio.mp3` duration → timeline scale
34+
35+
### Implementation
36+
```javascript
37+
function renderTimeline() {
38+
const totalDuration = audioElement.duration || baseData[baseData.length-1].time;
39+
40+
// Render transcript segments as background regions
41+
transcriptSegments.forEach(seg => {
42+
const left = (seg.start / totalDuration) * 100;
43+
const width = ((seg.end - seg.start) / totalDuration) * 100;
44+
// Create segment div with tooltip
45+
});
46+
47+
// Render action markers
48+
baseData.forEach(step => {
49+
const left = (step.time / totalDuration) * 100;
50+
// Create marker div with action type color
51+
});
52+
}
53+
```
54+
55+
---
56+
57+
## Current State
58+
59+
### Training Dashboard
60+
- **Evaluation Samples gallery**: Grid of screenshots with H (human) and AI (predicted) click markers
61+
- **Filters**: Epoch dropdown, correctness filter (All/Correct/Incorrect)
62+
- **Per-sample info**: Distance metric, coordinates, raw model output
63+
- **Timing**: Samples evaluated at end of each epoch during training
64+
65+
### Viewer Tab
66+
- **Full step playback**: All capture steps in sequence
67+
- **Checkpoint selector**: Switch between prediction sets (None, Epoch 1, 2, 3...)
68+
- **Per-step comparison**: Human vs AI action boxes, match indicator
69+
- **Click overlays**: H/AI markers on screenshot (toggleable)
70+
71+
## Gap Analysis
72+
73+
The training tab shows a **subset** of steps (those evaluated during training), while the viewer shows **all** steps. Users can't easily:
74+
1. See which steps were evaluated during training
75+
2. Jump to evaluated steps in the viewer
76+
3. Understand per-step accuracy over training epochs
77+
78+
## Integration Options
79+
80+
### Option A: Evaluation Badges in Step List
81+
Add visual badges to the viewer's step list indicating:
82+
- Whether the step was evaluated
83+
- Correctness status (green checkmark / red X)
84+
- Which epochs it was evaluated at
85+
86+
**Pros**: Non-intrusive, works with existing UI
87+
**Cons**: Doesn't show evaluation progression over epochs
88+
89+
### Option B: Evaluation Filter Mode
90+
Add a filter toggle to show only evaluated steps:
91+
- "Show All" / "Show Evaluated Only" toggle
92+
- When filtered, step list only shows evaluated steps
93+
- Step numbers preserved for context
94+
95+
**Pros**: Focuses attention on evaluated steps
96+
**Cons**: Loses context of surrounding steps
97+
98+
### Option C: Epoch Comparison View (Recommended)
99+
Extend the checkpoint dropdown to show per-step accuracy:
100+
- When checkpoint selected, show accuracy badge next to each step
101+
- Details panel shows prediction progression: Epoch 1 → 2 → 3
102+
- Can see how model improved on specific steps over training
103+
104+
**Implementation:**
105+
```
106+
Step 7 [click] ✓ E1 ✗ E2 ✓ E3 <- badges showing correctness at each epoch
107+
```
108+
109+
### Option D: Side-by-Side Epoch Comparison
110+
New view mode showing same step across multiple epochs:
111+
- Split view: Epoch 1 | Epoch 2 | Epoch 3
112+
- See prediction drift/improvement visually
113+
- Useful for debugging model behavior
114+
115+
## Data Requirements
116+
117+
The viewer already has access to `predictionsByCheckpoint` which contains predictions organized by epoch. To show evaluation status, we need:
118+
119+
1. **evaluations** from training_log.json (already available)
120+
2. **Mapping** from evaluation sample_idx to step index
121+
3. **Per-epoch correctness** status
122+
123+
## Recommended Implementation
124+
125+
**Phase 1: Evaluation badges**
126+
- Add `eval-badge` class to step items that were evaluated
127+
- Show ✓/✗ based on correctness
128+
- Tooltip shows distance and epoch
129+
130+
**Phase 2: Details panel enhancement**
131+
- When step was evaluated, show evaluation history
132+
- "Evaluated at: Epoch 1 (✗ 12.3px), Epoch 3 (✓ 4.1px)"
133+
- Show improvement trend
134+
135+
**Phase 3: Gallery view toggle**
136+
- Button to switch between "Playback" and "Evaluation Gallery" views
137+
- Gallery view shows only evaluated steps in grid layout
138+
- Matches training dashboard eval panel visual style
139+
140+
## Files to Modify
141+
142+
- `openadapt_ml/training/trainer.py`: `_generate_unified_viewer_from_extracted_data()`
143+
- Add evaluation data to JS
144+
- Add badges to step list HTML
145+
- Enhance details panel
146+
147+
- `openadapt_ml/cloud/local.py`: `regenerate_viewer()`
148+
- Pass evaluation data from training_log.json to viewer generation
149+
150+
---
151+
152+
## Implementation Priority
153+
154+
1. **Timeline Visualizer** (High) - Core navigation improvement
155+
- Transcript segments as colored regions
156+
- Action markers by type
157+
- Click-to-seek with audio sync
158+
159+
2. **Evaluation Badges** (Medium) - Training/viewer connection
160+
- ✓/✗ badges on evaluated steps
161+
- Tooltip with distance metric
162+
163+
3. **Details Panel Enhancement** (Medium) - Deeper insights
164+
- Evaluation history across epochs
165+
- Improvement trend visualization
166+
167+
4. **Gallery View Toggle** (Low) - Alternative view mode
168+
- Switch between playback and grid views
169+
- Matches training dashboard style

0 commit comments

Comments
 (0)