# Viewer Enhancements: Timeline, Evaluation & Transcript Integration

## Timeline Visualizer / Scrubber

### Concept
A horizontal timeline bar that visually represents the entire capture duration with:
- **Transcript segments** as colored regions (synced with audio)
- **Action markers** at specific timestamps (clicks, types, etc.)
- **Current position indicator** (playhead)
- **Segment boundaries** showing where transcript segments start/end

### Visual Design
```
┌─────────────────────────────────────────────────────────────────┐
│  ▼            ▼               ▼          ▼                      │
│ [ Segment 1 ][ Segment 2 ][ Seg 3 ][ Segment 4 ]                │
│   ●    ●   ●      ●     ●        ●      ●                       │
│        ↑ playhead                                               │
└─────────────────────────────────────────────────────────────────┘
  ▼ = segment boundaries
  ● = action markers (clicks/types)
```

### Interactions
- **Click anywhere** → seek to that time (both steps and audio)
- **Hover segment** → show transcript text tooltip
- **Click segment** → highlight the corresponding transcript text
- **Action markers** → different colors by action type (click = red, type = green, scroll = purple)
| 29 | + |
| 30 | +### Data Sources |
| 31 | +- `transcript.json` → segment boundaries and text |
| 32 | +- `baseData` → action timestamps and types |
| 33 | +- `audio.mp3` duration → timeline scale |
| 34 | + |
| 35 | +### Implementation |
| 36 | +```javascript |
| 37 | +function renderTimeline() { |
| 38 | + const totalDuration = audioElement.duration || baseData[baseData.length-1].time; |
| 39 | + |
| 40 | + // Render transcript segments as background regions |
| 41 | + transcriptSegments.forEach(seg => { |
| 42 | + const left = (seg.start / totalDuration) * 100; |
| 43 | + const width = ((seg.end - seg.start) / totalDuration) * 100; |
| 44 | + // Create segment div with tooltip |
| 45 | + }); |
| 46 | + |
| 47 | + // Render action markers |
| 48 | + baseData.forEach(step => { |
| 49 | + const left = (step.time / totalDuration) * 100; |
| 50 | + // Create marker div with action type color |
| 51 | + }); |
| 52 | +} |
| 53 | +``` |
| 54 | + |
| 55 | +--- |
| 56 | + |
| 57 | +## Current State |
| 58 | + |
| 59 | +### Training Dashboard |
| 60 | +- **Evaluation Samples gallery**: Grid of screenshots with H (human) and AI (predicted) click markers |
| 61 | +- **Filters**: Epoch dropdown, correctness filter (All/Correct/Incorrect) |
| 62 | +- **Per-sample info**: Distance metric, coordinates, raw model output |
| 63 | +- **Timing**: Samples evaluated at end of each epoch during training |
| 64 | + |
| 65 | +### Viewer Tab |
| 66 | +- **Full step playback**: All capture steps in sequence |
| 67 | +- **Checkpoint selector**: Switch between prediction sets (None, Epoch 1, 2, 3...) |
| 68 | +- **Per-step comparison**: Human vs AI action boxes, match indicator |
| 69 | +- **Click overlays**: H/AI markers on screenshot (toggleable) |
| 70 | + |
| 71 | +## Gap Analysis |
| 72 | + |
| 73 | +The training tab shows a **subset** of steps (those evaluated during training), while the viewer shows **all** steps. Users can't easily: |
| 74 | +1. See which steps were evaluated during training |
| 75 | +2. Jump to evaluated steps in the viewer |
| 76 | +3. Understand per-step accuracy over training epochs |
| 77 | + |
| 78 | +## Integration Options |
| 79 | + |
| 80 | +### Option A: Evaluation Badges in Step List |
| 81 | +Add visual badges to the viewer's step list indicating: |
| 82 | +- Whether the step was evaluated |
| 83 | +- Correctness status (green checkmark / red X) |
| 84 | +- Which epochs it was evaluated at |
| 85 | + |
| 86 | +**Pros**: Non-intrusive, works with existing UI |
| 87 | +**Cons**: Doesn't show evaluation progression over epochs |
| 88 | + |
| 89 | +### Option B: Evaluation Filter Mode |
| 90 | +Add a filter toggle to show only evaluated steps: |
| 91 | +- "Show All" / "Show Evaluated Only" toggle |
| 92 | +- When filtered, step list only shows evaluated steps |
| 93 | +- Step numbers preserved for context |
| 94 | + |
| 95 | +**Pros**: Focuses attention on evaluated steps |
| 96 | +**Cons**: Loses context of surrounding steps |
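
Preserving the original step numbers under Option B comes down to carrying each step's index through the filter rather than renumbering. A minimal sketch, with illustrative names:

```javascript
// Return only the evaluated steps, keeping each step's original index
// so the step list can still display "Step 7", "Step 12", etc.
function filterEvaluatedSteps(steps, evaluatedIndices) {
  return steps
    .map((step, idx) => ({ step, idx }))
    .filter(({ idx }) => evaluatedIndices.has(idx));
}
```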
| 97 | + |
| 98 | +### Option C: Epoch Comparison View (Recommended) |
| 99 | +Extend the checkpoint dropdown to show per-step accuracy: |
| 100 | +- When checkpoint selected, show accuracy badge next to each step |
| 101 | +- Details panel shows prediction progression: Epoch 1 → 2 → 3 |
| 102 | +- Can see how model improved on specific steps over training |
| 103 | + |
| 104 | +**Implementation:** |
| 105 | +``` |
| 106 | +Step 7 [click] ✓ E1 ✗ E2 ✓ E3 <- badges showing correctness at each epoch |
| 107 | +``` |
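
The badge row above can be derived per step from the evaluation records. The record shape `{ epoch, step_idx, correct }` used here is an assumption for illustration; the actual training_log.json schema may differ:

```javascript
// Build the "✓ E1  ✗ E2  ✓ E3" badge string for one step from evaluation
// records, sorted by epoch. Record fields are assumed, not confirmed.
function epochBadges(evaluations, stepIdx) {
  return evaluations
    .filter(e => e.step_idx === stepIdx)
    .sort((a, b) => a.epoch - b.epoch)
    .map(e => `${e.correct ? "✓" : "✗"} E${e.epoch}`)
    .join("  ");
}
```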
| 108 | + |
| 109 | +### Option D: Side-by-Side Epoch Comparison |
| 110 | +New view mode showing same step across multiple epochs: |
| 111 | +- Split view: Epoch 1 | Epoch 2 | Epoch 3 |
| 112 | +- See prediction drift/improvement visually |
| 113 | +- Useful for debugging model behavior |
| 114 | + |
| 115 | +## Data Requirements |
| 116 | + |
| 117 | +The viewer already has access to `predictionsByCheckpoint` which contains predictions organized by epoch. To show evaluation status, we need: |
| 118 | + |
| 119 | +1. **evaluations** from training_log.json (already available) |
| 120 | +2. **Mapping** from evaluation sample_idx to step index |
| 121 | +3. **Per-epoch correctness** status |
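
Requirement 2 is the only piece that needs new logic. One way to sketch it, under the hypothetical assumption that evaluation samples are drawn in order from the capture's action-bearing steps (the real mapping depends on how the sampler actually selects steps):

```javascript
// Build map[sample_idx] -> step index, assuming sample i corresponds to the
// i-th step that carries an action. This ordering is an assumption, not the
// confirmed sampler behavior.
function buildSampleToStepMap(steps) {
  const map = [];
  steps.forEach((step, stepIdx) => {
    if (step.action) map.push(stepIdx);
  });
  return map;
}
```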
| 122 | + |
| 123 | +## Recommended Implementation |
| 124 | + |
| 125 | +**Phase 1: Evaluation badges** |
| 126 | +- Add `eval-badge` class to step items that were evaluated |
| 127 | +- Show ✓/✗ based on correctness |
| 128 | +- Tooltip shows distance and epoch |
| 129 | + |
| 130 | +**Phase 2: Details panel enhancement** |
| 131 | +- When step was evaluated, show evaluation history |
| 132 | +- "Evaluated at: Epoch 1 (✗ 12.3px), Epoch 3 (✓ 4.1px)" |
| 133 | +- Show improvement trend |
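
The Phase 2 history line can be formatted from a step's evaluation records. As above, the record shape `{ epoch, correct, distance }` is an assumption for illustration:

```javascript
// Format the details-panel history string, e.g.
// "Evaluated at: Epoch 1 (✗ 12.3px), Epoch 3 (✓ 4.1px)".
function evaluationHistory(records) {
  const parts = [...records] // copy before sorting to avoid mutating input
    .sort((a, b) => a.epoch - b.epoch)
    .map(r => `Epoch ${r.epoch} (${r.correct ? "✓" : "✗"} ${r.distance.toFixed(1)}px)`);
  return `Evaluated at: ${parts.join(", ")}`;
}
```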
| 134 | + |
| 135 | +**Phase 3: Gallery view toggle** |
| 136 | +- Button to switch between "Playback" and "Evaluation Gallery" views |
| 137 | +- Gallery view shows only evaluated steps in grid layout |
| 138 | +- Matches training dashboard eval panel visual style |
| 139 | + |
| 140 | +## Files to Modify |
| 141 | + |
| 142 | +- `openadapt_ml/training/trainer.py`: `_generate_unified_viewer_from_extracted_data()` |
| 143 | + - Add evaluation data to JS |
| 144 | + - Add badges to step list HTML |
| 145 | + - Enhance details panel |
| 146 | + |
| 147 | +- `openadapt_ml/cloud/local.py`: `regenerate_viewer()` |
| 148 | + - Pass evaluation data from training_log.json to viewer generation |
| 149 | + |
| 150 | +--- |
| 151 | + |
| 152 | +## Implementation Priority |
| 153 | + |
| 154 | +1. **Timeline Visualizer** (High) - Core navigation improvement |
| 155 | + - Transcript segments as colored regions |
| 156 | + - Action markers by type |
| 157 | + - Click-to-seek with audio sync |
| 158 | + |
| 159 | +2. **Evaluation Badges** (Medium) - Training/viewer connection |
| 160 | + - ✓/✗ badges on evaluated steps |
| 161 | + - Tooltip with distance metric |
| 162 | + |
| 163 | +3. **Details Panel Enhancement** (Medium) - Deeper insights |
| 164 | + - Evaluation history across epochs |
| 165 | + - Improvement trend visualization |
| 166 | + |
| 167 | +4. **Gallery View Toggle** (Low) - Alternative view mode |
| 168 | + - Switch between playback and grid views |
| 169 | + - Matches training dashboard style |