diff --git a/CLAUDE.md b/CLAUDE.md index 63eef82..73bd34d 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -12,6 +12,7 @@ When the user makes a request, identify which agent should handle it, read its S |---|---|---| | Create a dashboard, build a dashboard, set up analytics for a feature | **Dashboard Builder** | `agents/dashboard-builder/SKILL.md` | | Create a Linear issue, track work in Linear, file a bug/task | **Linear Issue Manager** | `agents/linear/SKILL.md` | +| Find events, discover events, document events, validate event tracking | **Events Agent** | `agents/events-agent/SKILL.md` | If the request doesn't clearly match an agent, ask the user which they need. diff --git a/README.md b/README.md index 422c3a6..1e7ec63 100644 --- a/README.md +++ b/README.md @@ -180,10 +180,36 @@ Creates feature dashboards in Hex Threads with a 4-phase workflow: See `agents/dashboard-builder/SKILL.md` for full workflow. +### Events Agent + +**Status:** NEW | **Trigger:** "Find events for {feature}", "Document events", "Validate event tracking" + +Discovers, documents, and validates events across three sources: +- **Event Registry** — `shared/event-registry.yaml` (documented events) +- **Codebase** — GitHub/local search for tracking code +- **Production** — `ltxstudio_user_all_actions` table (actual events firing) + +**Key Features:** +- ✓ Multi-source event discovery (registry, code, production) +- ✓ Event schema inspection and property analysis +- ✓ Validation reports (registry vs production comparison) +- ✓ New event detection (events firing in last 7 days not seen before) +- ✓ Event usage statistics (users, volume, recency) +- ✓ Event registry documentation formatting + +**Common Tasks:** +- Find all events for a feature +- Document new events in registry format +- Validate existing event tracking +- Detect deprecated or missing events +- Inspect event properties and schemas + +See `agents/events-agent/SKILL.md` for full workflow. 
+ --- | Agent | Status | What it does | |-------|--------|-------------| -| Dashboard Builder | WIP | Creates feature dashboards in Hex Threads | -| Linear Issue Manager | Ready | Creates and manages Linear issues for analytics work across PA teams | | Dashboard Builder | WIP | Creates feature dashboards in Hex Threads with automated data quality validation | +| Linear Issue Manager | Ready | Creates and manages Linear issues for analytics work across PA teams | +| Events Agent | NEW | Discovers, documents, and validates events from registry, code, and production data | diff --git a/agents/events-agent/SKILL.md b/agents/events-agent/SKILL.md new file mode 100644 index 0000000..71c3481 --- /dev/null +++ b/agents/events-agent/SKILL.md @@ -0,0 +1,57 @@ +--- +name: events-agent +description: Routes event-related requests to specialized sub-agents for mapping events from designs/features or validating event tracking in production. +tags: [events, router] +--- + +# Events Agent + +## When to use + +- "Map events for {feature}" +- "Find events in design" +- "Validate event tracking" +- "Check if events are firing" +- "Document events for {feature}" + +## What it does + +Routes event-related requests to specialized sub-agents: + +| Task Type | Sub-Agent | File | +|-----------|-----------|------| +| Map events from designs/features | **Map Events** | `agents/events-agent/map-events/SKILL.md` | +| Validate event tracking in production | **Validate Events** | `agents/events-agent/validate-events/SKILL.md` | + +## How to route + +1. **Identify task type** from user request +2. **Read the appropriate sub-agent SKILL.md** +3. **Follow that sub-agent's workflow** + +## Examples + +| User says... | Route to... 
| +|-------------|-------------| +| "Map events for camera angle feature" | Map Events | +| "Find events in Figma design" | Map Events | +| "Validate timeline events are firing" | Validate Events | +| "Check if generate_video event is tracked correctly" | Validate Events | +| "Document events for new feature" | Map Events | + +## Reference files + +All event agents use: + +| File | Read when | +|------|-----------| +| `shared/event-registry.yaml` | Before documenting or validating events | +| `shared/bq-schema.md` | Before querying production data | +| `shared/product-context.md` | Before investigating features | + +## Rules + +- DO identify the task type before routing +- DO read the sub-agent's SKILL.md completely before executing +- DO NOT mix mapping and validation in one workflow — route separately +- DO check event registry first before mapping or validating diff --git a/agents/events-agent/SKILL.md.backup b/agents/events-agent/SKILL.md.backup new file mode 100644 index 0000000..a27bc69 --- /dev/null +++ b/agents/events-agent/SKILL.md.backup @@ -0,0 +1,367 @@ +--- +name: events-agent +description: Discovers, documents, and validates events in the LTX codebase. Helps find event names, understand event schemas, and create event documentation. +tags: [events, discovery, documentation, validation] +--- + +# Events Agent + +## When to use + +- "Find events for {feature}" +- "What events does {feature} fire?" +- "Document events for {feature}" +- "Validate event tracking for {feature}" +- "What's the schema for {event_name}?" +- "Search for event tracking code" +- "Add events to registry" + +## What it does + +Helps discover, document, and validate events across three sources: +1. **Event Registry** (`shared/event-registry.yaml`) - Known documented events +2. **Codebase** (GitHub/local search) - Actual event tracking code +3. **BigQuery** (`ltxstudio_user_all_actions`) - Events firing in production + +## Steps + +### 1. 
Gather Requirements + +Ask the user: +- **Feature name**: Which feature are you investigating? +- **Task type**: Discovery (find events), Documentation (add to registry), or Validation (verify tracking)? +- **Scope**: Specific event names to search for, or full feature exploration? +- **Output format**: List, table, or full event registry entry? + +### 2. Read Shared Knowledge + +Before searching for events: +- **`shared/event-registry.yaml`** - Check if events are already documented +- **`shared/bq-schema.md`** - Understand `ltxstudio_user_all_actions` table schema (action_name, action_category, event_name_detailed columns) +- **`shared/product-context.md`** - Understand feature context and business logic + +Key learnings: +- Table: `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` +- Event columns: `action_name` (primary), `action_category` (groups), `event_name_detailed` (specific variants) +- Event registry structure: feature → events → {name, type, status, description} + +### 3. Search for Events + +Use all three sources to find events: + +#### A. Check Event Registry +```bash +grep -A 10 "{feature_name}" shared/event-registry.yaml +``` + +#### B. Search Codebase (if GitHub MCP available) +Look for: +- Event tracking calls: `track(`, `analytics.track(`, `sendEvent(` +- Event name constants: `EVENT_`, `ANALYTICS_` +- Feature-specific tracking files + +#### C. Query Production Data +```sql +SELECT + action_name, + action_category, + event_name_detailed, + COUNT(DISTINCT lt_id) AS unique_users, + COUNT(*) AS event_count, + MIN(date(action_ts)) AS first_seen, + MAX(date(action_ts)) AS last_seen +FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` +WHERE date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) + AND ( + action_name LIKE '%{feature}%' + OR action_category LIKE '%{feature}%' + ) +GROUP BY action_name, action_category, event_name_detailed +ORDER BY event_count DESC +``` + +### 4. 
Analyze Event Schema + +For each discovered event, determine: +- **Name**: Exact event name (from action_name or event_name_detailed) +- **Category**: Event grouping (from action_category) +- **Type**: usage, generation, navigation, rlhf, subscription, etc. +- **Status**: active, deprecated, planned +- **Properties**: What properties/columns are populated for this event +- **Usage pattern**: When does this event fire? (button click, page load, completion, etc.) + +Query to inspect event properties: +```sql +SELECT * +FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` +WHERE action_name = '{event_name}' + AND date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) +LIMIT 100 +``` + +### 5. Document Events + +**If adding to registry**, format as: +```yaml +{feature_name}: + description: "{Feature description}" + events: + - name: "{event_name}" + type: "{usage|generation|navigation|subscription}" + status: "{active|deprecated|planned}" + description: "{What this event tracks}" + properties: + - "{key_property_1}" + - "{key_property_2}" +``` + +**If creating a summary**, format as: +| Event Name | Category | Type | Usage | Last Seen | +|------------|----------|------|-------|-----------| +| ... | ... | ... | ... | ... | + +### 6. Validate Event Tracking + +**For validation tasks**, check: +1. **Registry vs Production** - Are documented events actually firing? +2. **Coverage** - Are there production events not in the registry? +3. **Naming consistency** - Do event names follow conventions? +4. **Volume sanity** - Are event counts reasonable for the user base? +5. **Recency** - When did each event last fire? 
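Checks 1 and 2 reduce to a set comparison between the registry and production. A minimal Python sketch of that comparison, assuming both event-name lists have already been loaded by the caller; `classify_events` and the sample names are illustrative and not part of the agent. The status strings are the same labels the validation query assigns.

```python
def classify_events(registry_events, production_events):
    """Label each event name by comparing registry and production sets.

    Hypothetical helper: both arguments are iterables of event names that
    the caller has already loaded from the registry and from production.
    """
    registry = set(registry_events)
    production = set(production_events)
    statuses = {}
    for name in sorted(registry | production):
        if name not in registry:
            statuses[name] = "NOT IN REGISTRY"  # firing but undocumented
        elif name not in production:
            statuses[name] = "NOT FIRING"  # documented but not observed
        else:
            statuses[name] = "OK"
    return statuses


# Illustrative inputs only; real names come from the registry and BigQuery.
report = classify_events(
    registry_events=["generate_image", "generate_video", "click_camera_angle"],
    production_events=["generate_image", "generate_video", "timeline_split_clip"],
)
for name, status in report.items():
    print(f"{name}: {status}")
```

Checks 3 to 5 (naming, volume, recency) need per-event statistics and are not covered by this sketch.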
+ +Validation query: +```sql +WITH registry_events AS ( + -- Manually list events from registry + SELECT event_name FROM UNNEST([ + 'generate_image', 'generate_video', 'click_camera_angle' + ]) AS event_name +), +production_events AS ( + SELECT DISTINCT action_name AS event_name + FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` + WHERE date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) +) +SELECT + COALESCE(r.event_name, p.event_name) AS event_name, + CASE + WHEN r.event_name IS NULL THEN 'NOT IN REGISTRY' + WHEN p.event_name IS NULL THEN 'NOT FIRING' + ELSE 'OK' + END AS status +FROM registry_events r +FULL OUTER JOIN production_events p ON r.event_name = p.event_name +ORDER BY status, event_name +``` + +### 7. Present Findings + +Format results with: +- **Summary**: Number of events found, sources checked +- **Event List**: Table or YAML format with all discovered events +- **Coverage Report** (if validation): Registry vs production comparison +- **Recommendations**: Missing documentation, deprecated events to remove, naming inconsistencies + +## Quick Reference Queries + +### Find All Events for a Feature +```sql +SELECT + action_name, + action_category, + COUNT(DISTINCT lt_id) AS users, + COUNT(*) AS events, + MAX(date(action_ts)) AS last_seen +FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` +WHERE date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) + AND action_name LIKE '%{feature}%' +GROUP BY action_name, action_category +ORDER BY events DESC +``` + +### Event Properties Inspection +```sql +SELECT + action_name, + action_category, + event_name_detailed, + model_gen_type, + process_name, + -- Add other relevant columns based on event type + COUNT(*) AS occurrences +FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` +WHERE action_name = '{event_name}' + AND date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) +GROUP BY 1, 2, 3, 4, 5 +LIMIT 100 +``` + +### Event Firing Frequency (Daily) +```sql 
+SELECT + date(action_ts) AS dt, + action_name, + COUNT(DISTINCT lt_id) AS unique_users, + COUNT(*) AS event_count +FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` +WHERE date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 14 DAY) + AND action_name IN ('event1', 'event2', 'event3') +GROUP BY dt, action_name +ORDER BY dt DESC, action_name +``` + +### New Events Detection (Last 7 Days) +```sql +WITH recent AS ( + SELECT DISTINCT action_name + FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` + WHERE date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) +), +historical AS ( + SELECT DISTINCT action_name + FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` + WHERE date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 60 DAY) + AND date(action_ts) < DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) +) +SELECT + r.action_name AS new_event, + COUNT(*) AS event_count, + COUNT(DISTINCT lt_id) AS unique_users, + MIN(date(action_ts)) AS first_seen +FROM recent r +LEFT JOIN historical h ON r.action_name = h.action_name +INNER JOIN `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` a + ON r.action_name = a.action_name + AND date(a.action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) +WHERE h.action_name IS NULL +GROUP BY r.action_name +ORDER BY event_count DESC +``` + +## Reference Files + +| File | Read when | +|------|-----------| +| `shared/event-registry.yaml` | Always - check existing documentation first | +| `shared/bq-schema.md` | Before writing SQL - understand table schema | +| `shared/product-context.md` | Before investigating features - understand business context | + +## Rules + +### Discovery + +- **DO** check all three sources (registry, code, production) for complete coverage +- **DO** use `action_name` as the primary event identifier +- **DO** check `event_name_detailed` for generation events (has specific variants like "generate_video_t2v") +- **DO** filter on `date(action_ts)` partition column for performance +- **DO** 
look back 30 days minimum to catch infrequent events +- **DO NOT** assume registry is complete - always validate against production + +### Documentation + +- **DO** follow event-registry.yaml structure exactly (name, type, status, description, properties) +- **DO** classify event types correctly: usage, generation, navigation, rlhf, subscription, token +- **DO** mark deprecated events with status: "deprecated" (not "inactive" or "removed") +- **DO** include business context in descriptions ("Fired when user clicks the Camera Angle button in the toolbar") +- **DO NOT** add events to registry without confirming they exist in production or code + +### Validation + +- **DO** report both "not in registry" AND "not firing" events +- **DO** check event recency - events not seen in 30+ days may be deprecated +- **DO** validate event volumes make sense (e.g., generation events should have reasonable counts for user base) +- **DO** flag naming inconsistencies (snake_case vs camelCase, inconsistent prefixes) +- **DO NOT** remove events from registry without user approval +- **DO NOT** assume low-volume events are broken - some features are used rarely + +### Query Best Practices + +- **DO** always filter on `date(action_ts)` partition column +- **DO** use `action_name LIKE '%pattern%'` for broad feature searches +- **DO** use exact `action_name = 'event'` for specific event analysis +- **DO** exclude today's incomplete data: `date(action_ts) < CURRENT_DATE()` +- **DO** use `COUNT(DISTINCT lt_id)` for unique user counts +- **DO NOT** filter `is_lt_team` (already excluded at table level) + +## Anti-Patterns + +| Anti-Pattern | Why | Do this instead | +|--------------|-----|-----------------| +| Only checking registry | Registry may be incomplete or outdated | Check all three sources (registry, code, production) | +| Searching without `date()` filter | Partition is TIMESTAMP, not DATE | Always wrap: `date(action_ts) >= ...` | +| Using `event_name` instead of `action_name` | 
`event_name` column doesn't exist in main table | Use `action_name` (primary) and `event_name_detailed` (for variants) | +| Not checking event recency | Event may be deprecated | Always check `MAX(date(action_ts))` | +| Adding undocumented events to registry | May add incorrect/test events | Validate event exists and understand its purpose first | +| Assuming all events have same properties | Different event types have different schemas | Inspect actual event rows to see populated columns | + +## Common Tasks + +### Task: "Find all events for camera angle feature" +1. Read `shared/event-registry.yaml` - check if camera_angle exists +2. Query production for `action_name LIKE '%camera_angle%'` +3. Search codebase for "camera_angle" tracking code +4. Present consolidated list with usage stats + +### Task: "Document new keyframe feature events" +1. Query production for recent events: `action_name LIKE '%keyframe%'` (last 30 days) +2. Inspect event rows to understand properties and usage +3. Search code to understand event firing logic +4. Format as YAML for event-registry.yaml +5. Present to user for approval before adding + +### Task: "Validate timeline events tracking" +1. List all timeline events from registry +2. Query production to check each event is firing +3. Query production to find any timeline events NOT in registry +4. Generate validation report: OK / NOT FIRING / NOT IN REGISTRY +5. Recommend additions/removals + +## Output Formats + +### Event Discovery Summary +``` +## Events Found: {feature_name} + +**Sources Checked:** +- Registry: {X events documented} +- Production: {Y events firing in last 30 days} +- Code: {Z tracking calls found} + +**Events:** + +| Event Name | Category | Users (30d) | Events (30d) | Last Seen | In Registry? | +|------------|----------|-------------|--------------|-----------|--------------| +| ... | ... | ... | ... | ... 
| ✓ / ✗ | +``` + +### Event Registry Entry +```yaml +{feature_name}: + description: "{Feature description}" + events: + - name: "{event_name}" + type: "{type}" + status: "active" + description: "{When this fires and what it tracks}" + properties: + - "{property_1}" + - "{property_2}" +``` + +### Validation Report +``` +## Event Tracking Validation: {feature_name} + +**Summary:** +- ✓ {X} events: OK (in registry, firing in production) +- ⚠️ {Y} events: NOT IN REGISTRY (firing but undocumented) +- ❌ {Z} events: NOT FIRING (documented but not seen in 30+ days) + +**Details:** + +| Event Name | Status | Users (30d) | Last Seen | Action | +|------------|--------|-------------|-----------|--------| +| ... | ✓ OK | 1,234 | 2026-03-03 | None | +| ... | ⚠️ NOT IN REGISTRY | 567 | 2026-03-03 | Add to registry | +| ... | ❌ NOT FIRING | 0 | 2026-01-15 | Mark deprecated or remove | +``` diff --git a/agents/events-agent/map-events/SKILL.md b/agents/events-agent/map-events/SKILL.md new file mode 100644 index 0000000..1c0189c --- /dev/null +++ b/agents/events-agent/map-events/SKILL.md @@ -0,0 +1,362 @@ +--- +name: map-events +description: Maps and documents events for features by discovering them from Figma designs, product requirements, codebase, and event registry. Routes to specialized mapping workflows including Figma MCP integration. +tags: [events, discovery, documentation, mapping] +--- + +# Map Events Agent + +## When to use + +- "Map events for {feature}" +- "Find events in {Figma design}" +- "Document events for new feature" +- "What events should we track for {feature}?" +- "Create event registry entry for {feature}" +- "Discover events from design" +- "Create V3 analytics spec from Figma" +- "Annotate Figma with events" + +## What it does + +Discovers and documents events for a feature by searching across: +1. **Figma** - Design files and user flows (via Figma MCP) +2. **Event Registry** (`shared/event-registry.yaml`) - Existing documented events +3. 
**Codebase** - Event tracking code +4. **Product Context** - Feature requirements and user flows + +## Output Options + +1. **Event Registry YAML** - Simple event list for shared/event-registry.yaml +2. **V3 Analytics Spec** - Comprehensive specification document with flows and payloads +3. **Figma Annotations** - Event documentation written directly to Figma as comments + +## Specialized Mapping Workflows + +### When to use map-events-skill + +**Use [map-events-skill](map-events-skill/SKILL.md) when:** +- ✅ User provides a Figma URL +- ✅ Need to create V3 analytics specification document +- ✅ Want to annotate Figma file with event documentation +- ✅ Need comprehensive FE→BE event linking documentation +- ✅ Working with V3 schema (ui_interacted, flow_started, process_started, be_flow_started) + +**Workflow:** Read Figma via MCP → Parse UI elements → Map to V3 events → Write annotations back to Figma → Generate spec/YAML + +**Output:** +- V3 specification document (TEMPLATE.md format) +- Figma file with event comments (linked to node IDs) +- Event registry YAML + +### When to use general map-events workflow + +**Use this general workflow when:** +- ✅ No Figma URL provided +- ✅ Mapping from product requirements or user stories +- ✅ Searching codebase for existing events +- ✅ Creating simple event registry entries +- ✅ Working with LTX Studio production schema (not V3) + +**Continue with steps below** ↓ + +## Steps (General Workflow) + +### 1. Gather Requirements + +Ask the user: +- **Feature name**: What feature are you mapping events for? +- **Sources available**: Figma URL, PRD document, code references? +- **Scope**: Full feature or specific user flow? +- **Output format**: Event registry YAML, table, or detailed list? + +### 2. 
Read Shared Knowledge + +Before mapping events: +- **`shared/event-registry.yaml`** - Check if feature or similar events already exist +- **`shared/product-context.md`** - Understand LTX products and event patterns +- **`shared/bq-schema.md`** - Understand event table structure (action_name, action_category, event_name_detailed) + +Key learnings: +- Event types: usage, generation, navigation, rlhf, subscription, token +- Naming conventions: lowercase_with_underscores, verb-first (click_, generate_, view_) +- Event categories group related events + +### 3. Discover Events from Sources + +#### A. Read Figma Design (if available) +If user provides Figma URL: +- Use Figma MCP to read design file +- Identify interactive elements: buttons, dropdowns, inputs, modals +- Map user flows: entry → actions → completion +- Note feature-specific UI elements + +Look for: +- **Buttons**: "Generate", "Download", "Save", "Cancel" → click events +- **Toggles/Checkboxes**: Settings changes → toggle/enable/disable events +- **Dropdowns**: Selection changes → select events +- **Inputs**: Text/number entry → input/change events +- **Modals**: Open/close → open_modal/close_modal events +- **Page Views**: Screen navigation → view events + +#### B. Check Event Registry +```bash +grep -A 20 "{feature_name}" shared/event-registry.yaml +``` + +Look for: +- Existing events for this feature +- Similar features with event patterns to follow +- Event naming conventions used + +#### C. Search Codebase (if GitHub MCP available) +Search for: +- Event tracking calls: `track(`, `analytics.track(`, `sendEvent(` +- Feature-specific files: `{feature_name}.tsx`, `{feature_name}_screen.dart` +- Event constants: `EVENT_`, `ANALYTICS_` + +#### D. Understand User Flow +Map the complete user journey: +1. **Entry**: How does user access the feature? (click, navigation) +2. **Interaction**: What actions can user take? (generate, edit, save) +3. **Completion**: How does flow end? (download, close, share) +4. 
**Errors**: What can go wrong? (error states, validation failures) + +### 4. Categorize Events + +For each discovered event, determine: + +**Event Name**: Verb-first, descriptive +- Examples: `click_camera_angle`, `generate_video_t2v`, `open_style_reference_modal` + +**Event Type**: +- `usage` - User interactions (clicks, views, inputs) +- `generation` - AI generation requests (image, video, audio) +- `navigation` - Page/screen changes +- `rlhf` - User feedback (thumbs up/down, ratings) +- `subscription` - Billing events (purchase, cancel, upgrade) +- `token` - Token consumption events + +**Event Category**: Grouping for related events +- Examples: `generations`, `timeline`, `actions`, `navigation`, `subscriptions` + +**Status**: +- `active` - Event is implemented and firing +- `planned` - Event is designed but not yet implemented +- `deprecated` - Event exists but should no longer be used + +### 5. Map Event Properties + +For each event, identify key properties that should be captured: + +**Common properties** (available in all events): +- `lt_id` - User identifier +- `action_ts` - Timestamp +- `griffin_tier_name_at_action` - User tier +- `platform`, `app_version`, `country_at_reg` + +**Feature-specific properties**: +- Generation events: `model_gen_type`, `process_name`, `duration`, `resolution` +- Timeline events: `clip_id`, `timeline_id`, `clip_duration` +- Navigation events: `from_screen`, `to_screen` +- Action events: Feature-specific parameters (angle_type, style_id, etc.) + +### 6. 
Create Event Registry Entry + +Format as YAML following event-registry.yaml structure: + +```yaml +{feature_name}: + description: "{Brief description of what this feature does}" + events: + - name: "{event_name}" + type: "{usage|generation|navigation|rlhf|subscription|token}" + status: "{active|planned|deprecated}" + description: "{When this event fires - e.g., 'Fired when user clicks the Camera Angle button in the toolbar'}" + category: "{event_category}" + properties: + - "{property_1}" + - "{property_2}" + + - name: "{event_name_2}" + type: "{type}" + status: "{status}" + description: "{description}" + category: "{category}" + properties: + - "{property_1}" +``` + +### 7. Present for Approval + +Show the user: +- **Summary**: Number of events mapped, sources checked +- **Event List**: Table with all events and their types +- **YAML Entry**: Complete event registry entry +- **Coverage**: User flow diagram showing which events fire when +- **Recommendations**: Missing events, naming improvements, similar patterns + +Format: +```` +## Events Mapped: {feature_name} + +**Sources Checked:** +- ✓ Figma design (if applicable) +- ✓ Event registry (existing patterns) +- ✓ Codebase (if applicable) +- ✓ User flow analysis + +**Events Discovered:** + +| Event Name | Type | Category | Status | Description | +|------------|------|----------|--------|-------------| +| click_{feature}_button | usage | actions | planned | User clicks the {feature} button | +| open_{feature}_modal | usage | navigation | planned | Opens the {feature} modal dialog | +| generate_{feature}_output | generation | generations | planned | Generates output for {feature} | + +**Event Registry Entry:** +```yaml +[YAML entry here] +``` + +**User Flow:** +1. User clicks {feature} button → `click_{feature}_button` +2. Modal opens → `open_{feature}_modal` +3. User configures settings → `update_{feature}_settings` +4. User generates → `generate_{feature}_output` +5. 
Generation completes → `{feature}_generation_complete` +6. User downloads → `download_{feature}_output` + +**Next Steps:** +- Add this entry to `shared/event-registry.yaml` +- Share with engineering for implementation +- Validate events after deployment +```` + +## Quick Reference + +### Event Naming Conventions + +✅ **Good Examples:** +- `click_camera_angle` (verb_object) +- `generate_video_t2v` (action_type_variant) +- `open_style_reference_modal` (action_object) +- `timeline_split_clip` (feature_action_object) + +❌ **Bad Examples:** +- `CameraAngle` (not lowercase) +- `angle` (no verb) +- `user_clicked_the_camera_angle_button` (too verbose) +- `ca_btn_click` (unclear abbreviations) + +### Event Type Guide + +| Type | When to Use | Examples | +|------|-------------|----------| +| `usage` | User interactions with UI | click_, view_, open_, close_, update_ | +| `generation` | AI generation requests | generate_image, generate_video | +| `navigation` | Page/screen changes | navigate_to_, view_page_, switch_tab_ | +| `rlhf` | User feedback | thumbs_up, thumbs_down, rate_output | +| `subscription` | Billing/subscription | purchase_plan, cancel_subscription, upgrade_tier | +| `token` | Token usage | consume_tokens, exceed_quota | + +## Reference Files + +| File | Read when | +|------|-----------| +| `shared/event-registry.yaml` | Always - check existing patterns first | +| `shared/product-context.md` | Before mapping features - understand business context | +| `shared/bq-schema.md` | Before defining properties - understand table schema | + +## Rules + +### Discovery + +- **DO** check all available sources (Figma, registry, code, product requirements) +- **DO** map the complete user flow, not just happy path +- **DO** include error states and edge cases +- **DO** follow existing naming conventions in the registry +- **DO** categorize events consistently with existing patterns +- **DO NOT** invent event names without understanding the feature +- **DO NOT** skip checking 
existing registry - avoid duplicates + +### Documentation + +- **DO** use lowercase_with_underscores for all event names +- **DO** start event names with verbs (click_, generate_, view_, open_) +- **DO** include clear descriptions explaining when events fire +- **DO** specify event type, category, and status +- **DO** list key properties that should be captured +- **DO** follow event-registry.yaml YAML structure exactly +- **DO NOT** use abbreviations or unclear terms +- **DO NOT** create overly verbose event names + +### Event Properties + +- **DO** list feature-specific properties that are meaningful for analysis +- **DO** consider what dimensions analysts will want to segment by +- **DO** check bq-schema.md for available columns +- **DO NOT** list every possible property - focus on key dimensions +- **DO NOT** include properties that are always the same value + +## Common Patterns + +### Button Click Events +```yaml +- name: "click_{button_name}" + type: "usage" + status: "planned" + description: "Fired when user clicks the {button name} button" + category: "actions" +``` + +### Modal Events +```yaml +- name: "open_{modal_name}_modal" + type: "usage" + status: "planned" + description: "Fired when {modal name} modal opens" + category: "navigation" + +- name: "close_{modal_name}_modal" + type: "usage" + status: "planned" + description: "Fired when {modal name} modal closes" + category: "navigation" +``` + +### Generation Events +```yaml +- name: "generate_{output_type}" + type: "generation" + status: "planned" + description: "Fired when user initiates {output type} generation" + category: "generations" + properties: + - model_gen_type + - process_name + - duration + - resolution +``` + +### Setting Toggle Events +```yaml +- name: "toggle_{setting_name}" + type: "usage" + status: "planned" + description: "Fired when user toggles {setting name} on/off" + category: "actions" + properties: + - setting_value +``` + +## Anti-Patterns + +| Anti-Pattern | Why | Do this 
instead | +|--------------|-----|-----------------| +| Not checking existing registry | May duplicate events or miss patterns | Always `grep` event-registry.yaml first | +| Using camelCase or PascalCase | Inconsistent with existing events | Use lowercase_with_underscores | +| Generic event names like "click_button" | Not specific enough for analysis | Use descriptive names: click_camera_angle_button | +| Including every UI element | Too granular, creates noise | Focus on meaningful user actions | +| Skipping error states | Missing important failure tracking | Map error events: generation_failed, validation_error | +| No event properties listed | Analysts won't know what dimensions are available | List key properties for segmentation | diff --git a/agents/events-agent/map-events/map-events-skill/IMPLEMENTATION_RULES.md b/agents/events-agent/map-events/map-events-skill/IMPLEMENTATION_RULES.md new file mode 100644 index 0000000..95d84a6 --- /dev/null +++ b/agents/events-agent/map-events/map-events-skill/IMPLEMENTATION_RULES.md @@ -0,0 +1,276 @@ +# Analytics Events Implementation Rules + +This document defines the requirements and conventions for specifying and implementing frontend analytics events in LTX Studio using V3 schemas. For the complete event model, see the [B2B Events Glossary](resources/docs/analytics-global-web-v3/B2B_EVENTS_GLOSSARY.md). 
+ +## Event Types + +| Event | Schema | When to Report | +|-------|--------|----------------| +| `ui_interaction` | `web_ltx_frontend_user_interaction.avsc` | Every click, type, toggle, drag/drop | +| `view_presented` | `web_ltx_frontend_view.avsc` | Modal/menu/toast opens | +| `view_dismissed` | `web_ltx_frontend_view.avsc` | Modal/menu closes | +| `flow_started` | `web_ltx_frontend_flow.avsc` | Multi-step journey begins | +| `flow_ended` | `web_ltx_frontend_flow.avsc` | Journey completes or cancels | +| `process_started` | `web_ltx_frontend_process.avsc` | Async operation begins | +| `process_ended` | `web_ltx_frontend_process.avsc` | Async operation completes | +| `be_flow_started` | `web_ltx_backend_flow.avsc` | Backend workflow begins | +| `be_flow_ended` | `web_ltx_backend_flow.avsc` | Backend workflow completes | +| `be_process_started` | `web_ltx_backend_process.avsc` | Backend step begins | +| `be_process_ended` | `web_ltx_backend_process.avsc` | Backend step completes | +| `be_token_charge_started` | `web_ltx_backend_token_charge.avsc` | Token charge initiated | +| `be_token_charge_ended` | `web_ltx_backend_token_charge.avsc` | Token charge completed | +| `project_state` | `web_ltx_frontend_project_state.avsc` | Project snapshot at trigger points | + +## Event Trigger Rules + +1. **Every user interaction** (click, typing, drag/drop) --> report `ui_interaction` +2. **View appears or disappears** --> report `view_presented` / `view_dismissed` with `source` record +3. **Multi-step journey begins or ends** --> report `flow_started` / `flow_ended` with `source` record +4. **Async operation triggered** --> report `process_started` / `process_ended` with `source_interaction_id` +5. **All events within a flow** must include the `flow_id` +6. **All interactions within a view** must include the `presentation_id` +7. **AI generation operations** trigger backend events linked via `fe_process_id` (see [Backend Events](#backend-events)) +8. 
**Export/share operations** should trigger `project_state` events + +## Naming Conventions + +All analytics string values use **snake_case**. See [ENUM_REGISTRY.md](.cursor/skills/analytics-enum-values/ENUM_REGISTRY.md) for the full list of existing values. + +| Field | Convention | Examples | +|-------|------------|----------| +| `flow_name` | feature_action | `style_element_creation`, `storyboard_creation` | +| `view_name` | component_name | `create_style_element_modal`, `asset_picker` | +| `ui_item_name` | action_or_element | `save_element`, `cancel_button`, `upload_file_link` | +| `ui_item_location` | container_name | `elements_panel`, `genspace_bar`, `element_card_menu` | +| `process_type` | operation | `upload`, `generate`, `delete`, `download` | +| `process_asset_type` | asset_category | `image`, `element`, `video`, `prompt` | +| `view_type` | view_category | `modal`, `popup`, `dialog`, `toast`, `page` | +| `interaction_method` | action_type | `click`, `type`, `drag_and_drop` | + +## ID Management + +For the full scoping model, see the "State IDs: Scoping Model" section in the [B2B Events Glossary](resources/docs/analytics-global-web-v3/B2B_EVENTS_GLOSSARY.md#state-ids-scoping-model). + +### ID Types and Lifecycle + +| ID | Created By | Passed To | Lifecycle | +|----|------------|-----------|-----------| +| `flow_id` | `flow_started` | All events until `flow_ended` | Entire flow duration | +| `interaction_id` | Each `ui_interaction` | `source_interaction_id` in triggered processes | Single interaction | +| `presentation_id` | `view_presented` | Matching `view_dismissed` + `ui_interaction.presentation_id` within the view | View lifetime | +| `process_id` | `process_started` | Matching `process_ended` | Process lifetime | + +### ID Generation + +- Generate UUIDs using `crypto.randomUUID()` +- `presentation_id`: Store in component state for the view lifetime. Pass to all interactions within the view. 
+- `flow_id`: Store in feature-level state/context for duration of flow. Pass to all events within the flow. +- Clear all flow-related state on `flow_ended` + +## Source Record + +The `source` field describes what triggered a flow or view event. + +### User-Triggered Events + +```json +{ + "source": { + "type": "user_interaction", + "name": "", + "location": "" + } +} +``` + +### Auto-Triggered Events + +For events triggered automatically (e.g., toast after save success): + +```json +{ + "source": { + "type": "auto", + "name": "", + "location": null + } +} +``` + +## Result Values + +### Flow Results + +| Value | When to Use | +|-------|-------------| +| `success` | Flow completed successfully | +| `cancelled` | User cancelled the flow | +| `failure` | Flow failed due to error | + +### Process Results + +| Value | When to Use | +|-------|-------------| +| `success` | Process completed successfully | +| `failure` | Process failed | +| `cancelled` | Process cancelled by user | +| `timeout` | Process timed out | + +## process_params / interaction_params + +Use these for feature-specific metadata not covered by dedicated schema fields. See the [B2B Events Glossary](resources/docs/analytics-global-web-v3/B2B_EVENTS_GLOSSARY.md#process-params--interaction-params) for full guidance. + +- **Use for**: `generation_type`, `shot_number`, `prompt`, `resolution`, `element_type` +- **Do NOT use if**: a dedicated field exists (`model_name`, `aspect_ratio`, `asset_id`) +- **Rules**: No PII. Values always strings. Keys use `snake_case`. Document keys in the spec. 
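The `process_params` / `interaction_params` rules above (snake_case keys, string-only values, no duplication of dedicated fields) could be enforced with a small helper along these lines. This is a minimal sketch under stated assumptions, not part of the V3 schemas: `toEventParams`, `EventParam`, and the sample value `image_to_video` are hypothetical names chosen for illustration, and the PII rule is not checked here because it cannot be verified mechanically.

```typescript
// Hypothetical helper: `toEventParams` and `EventParam` are illustrative names,
// not identifiers defined by the V3 schemas.
type EventParam = { key: string; value: string };

// snake_case: lowercase alphanumeric words joined by single underscores.
const SNAKE_CASE = /^[a-z][a-z0-9]*(_[a-z0-9]+)*$/;

// Dedicated schema fields named in the rules above; these must not be sent as params.
const DEDICATED_FIELDS = new Set(["model_name", "aspect_ratio", "asset_id"]);

function toEventParams(
  params: Record<string, string | number | boolean>
): EventParam[] {
  return Object.entries(params).map(([key, value]) => {
    if (!SNAKE_CASE.test(key)) {
      throw new Error(`Param key "${key}" is not snake_case`);
    }
    if (DEDICATED_FIELDS.has(key)) {
      throw new Error(`"${key}" has a dedicated schema field; do not send it as a param`);
    }
    // Values are always serialized as strings in the payload.
    return { key, value: String(value) };
  });
}

// Example: builds the Array<{key, value}> shape used by process_params.
const processParams = toEventParams({
  generation_type: "image_to_video", // illustrative value, not from ENUM_REGISTRY.md
  shot_number: 3,
});
// processParams:
// [{ key: "generation_type", value: "image_to_video" }, { key: "shot_number", value: "3" }]
```

Rejecting bad keys at construction time, rather than at reporting time, keeps malformed params out of the payload entirely. Whether to throw or to log and drop the offending key is a design choice this sketch leaves open.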
+ +## Event Timing + +| Event | When to Fire | +|-------|--------------| +| `ui_interaction` | Immediately on user action | +| `flow_started` | After the triggering `ui_interaction` | +| `view_presented` | After view animation completes / is visible | +| `view_dismissed` | When view starts closing | +| `process_started` | Before async operation begins | +| `process_ended` | After async operation completes | +| `flow_ended` | Last event, after all cleanup | + +## Required Fields by Event Type + +### ui_interaction + +```typescript +{ + event_name: "ui_interaction", + interaction_id: string, // Required - UUID + ui_item_name: string, // Required + ui_item_type: string, // Required - button, menu, text_box, etc. + interaction_method: string, // Required - click, type, drag_and_drop + ui_item_location?: string, // Recommended + presentation_id?: string, // If within a view + flow_id?: string, // If within a flow + interaction_params?: Array<{key, value}>, // Optional metadata +} +``` + +### view_presented / view_dismissed + +```typescript +{ + event_name: "view_presented" | "view_dismissed", + presentation_id: string, // Required - UUID, same for presented/dismissed pair + view_name: string, // Required + view_type: string, // Required - modal, popup, dialog, toast, page + source: { type, name, location }, // Required for view_presented + source_interaction_id?: string, // Recommended - links to triggering interaction + flow_id?: string, // If within a flow +} +``` + +### flow_started / flow_ended + +```typescript +{ + event_name: "flow_started" | "flow_ended", + flow_id: string, // Required - UUID + flow_name: string, // Required + source: { type, name, location }, // Required + source_interaction_id?: string, // Recommended + result?: string, // Required for flow_ended - success, cancelled, failure +} +``` + +### process_started / process_ended + +```typescript +{ + event_name: "process_started" | "process_ended", + process_id: string, // Required - UUID, same for 
started/ended pair + process_type: string, // Required - upload, generate, delete, download + process_asset_type: string, // Required - image, element, video, etc. + source_interaction_id?: string, // Recommended - links to triggering interaction + flow_id?: string, // If within a flow + result?: string, // Required for process_ended - success, failure, cancelled, timeout + asset_id?: string, // On success, the created/affected asset ID + process_params?: Array<{key, value}>, // Optional metadata +} +``` + +## Backend Events + +When a feature involves AI generation, document the backend event chain. For the complete FE-BE linking model, see [BACKEND_EVENTS_WITH_FE_LINKING.md](resources/docs/analytics-global-web-v3/BACKEND_EVENTS_WITH_FE_LINKING.md). + +### FE-BE Linking Chain + +``` +ui_interaction (interaction_id) + | source_interaction_id + v +fe_process_started (process_id) + | fe_process_id = process_id + v +be_flow_started (flow_id, fe_process_id) --> be_process xN --> be_flow_ended + | | + +-- be_token_charge_started/ended | + v +fe_process_started (fetch) --> fe_process_ended (generation_id, asset_id) +``` + +### When to Include Backend Events in Specs + +- **Always**: When the feature triggers AI generation (video, image, audio, text) +- **Token charging**: When the operation consumes tokens +- **Not needed**: For pure frontend operations (upload, UI state changes) + +## Cancel/Error Handling + +### User Cancellation + +When user cancels a flow (clicks Cancel, closes modal, navigates away): + +1. Report `ui_interaction` for cancel action (if applicable) +2. Report `view_dismissed` for any open modals +3. Report `flow_ended` with `result: "cancelled"` +4. Clear all flow state + +### Process Errors + +When an async operation fails: + +1. Report `process_ended` with `result: "failure"` +2. Optionally include error details in `extra_details` +3. 
Do NOT end the flow - user may retry + +## Implementation Checklist + +Before finalizing the spec: + +### Event Coverage +- [ ] All entry points trigger `flow_started` +- [ ] All modals/popups have `view_presented`/`view_dismissed` pairs +- [ ] All clickable elements trigger `ui_interaction` +- [ ] All async operations have `process_started`/`process_ended` pairs +- [ ] Cancel paths trigger `flow_ended` with `result: cancelled` +- [ ] Error paths trigger `process_ended` with `result: failure` +- [ ] Backend events documented for AI generation flows +- [ ] `project_state` documented for export/share triggers + +### ID Linking +- [ ] `flow_id` generated at `flow_started` and passed to all subsequent events +- [ ] `interaction_id` generated for each `ui_interaction` +- [ ] `presentation_id` links `view_presented` to `view_dismissed` and to interactions within the view +- [ ] `process_id` links `process_started` to `process_ended` +- [ ] `source_interaction_id` links processes to triggering interactions +- [ ] `fe_process_id` links frontend processes to backend flows (for generation) + +### Field Values +- [ ] All names use snake_case +- [ ] `source` records have correct type, name, location +- [ ] Results are correct (success/cancelled/failure for flows, success/failure/cancelled/timeout for processes) +- [ ] New values checked against [ENUM_REGISTRY.md](.cursor/skills/analytics-enum-values/ENUM_REGISTRY.md) + +### State Management +- [ ] Flow state stored appropriately (module variables, context, or store) +- [ ] All state cleared on `flow_ended` +- [ ] No stale IDs from previous flows diff --git a/agents/events-agent/map-events/map-events-skill/SKILL.md b/agents/events-agent/map-events/map-events-skill/SKILL.md new file mode 100644 index 0000000..5943fa4 --- /dev/null +++ b/agents/events-agent/map-events/map-events-skill/SKILL.md @@ -0,0 +1,491 @@ +--- +name: map-figma-analytics-events +description: Map analytics events from Figma designs for new features. 
Creates comprehensive event specifications with flows, payloads, and implementation guides. Can annotate Figma files and generate spec documents following LTX Studio V3 schema patterns. +tags: [figma, events, mapping, annotation, mcp, v3-schema] +--- + +# Map Figma to Analytics Events + +Create comprehensive analytics event specifications from Figma designs, with support for: +- **V3 Schema Specs** - Full event specification documents following LTX Studio V3 patterns +- **Figma Annotations** - Write event documentation directly to Figma as comments +- **Event Registry** - Generate YAML entries for event-registry.yaml +- **FE↔BE Linking** - Document frontend-to-backend event correlation + +## When to use + +- "Map events from Figma design" +- "Create analytics spec for {feature}" +- "Annotate Figma with event tracking" +- "Document frontend and backend events" +- "Generate event specification document" + +## Workflow + +### Mode 1: V3 Schema Specification (Full Documentation) + +Use this for comprehensive event specs with frontend + backend patterns: + +1. **Gather context** - Read [IMPLEMENTATION_RULES.md](IMPLEMENTATION_RULES.md) for V3 event model. For deeper reference, read B2B_EVENTS_GLOSSARY.md. +2. **Analyze Figma** - Identify entry points, user flows, and UI components +3. **Map Events** - Determine which events each interaction triggers (frontend + backend) +4. **Check vocabulary** - Verify field values against ENUM_REGISTRY.md or flag new ones needed +5. **Define Fields** - Specify field values for each event +6. **Create Spec** - Generate documentation following [TEMPLATE.md](TEMPLATE.md) +7. **Save** - Save to `resources/docs/analytics-global-web-v3/_FRONTEND_FLOWS.md` + +### Mode 2: Figma Annotation + Event Registry (Quick Documentation) + +Use this for quick event mapping with Figma comments: + +1. **Gather requirements** - Figma URL, scope, annotation mode +2. **Read Figma file** - Parse frames, components, interactions via Figma MCP +3. 
**Identify trackable elements** - Buttons, modals, inputs, navigation +4. **Map events** - Create event specs with properties +5. **Write annotations** - Add comments to Figma linked to specific nodes +6. **Generate registry** - Create YAML entry for event-registry.yaml +7. **Present results** - Summary table + annotated Figma + YAML + +## Prerequisites + +**Figma MCP must be configured:** +```bash +claude mcp add --transport http figma https://mcp.figma.com/mcp +``` + +## Accessing Figma via MCP + +### Step 1: Extract File Key from URL + +From Figma URL like `https://www.figma.com/design/ABC123XYZ/Feature-Name`: +- File key is: `ABC123XYZ` + +### Step 2: Use MCP Tools to Read Design + +**Check if Figma MCP is available:** +``` +Use ListMcpResourcesTool with server="figma" to verify connection +``` + +**Read file structure:** +Use the Figma MCP to get file contents, which returns: +- Pages and frames +- Components and instances +- Text layers and properties +- Prototype connections +- Comments + +**Add comments to Figma:** +Use Figma MCP to post comments linked to specific nodes using node IDs from the file structure. + +### Common MCP Interaction Pattern + +1. **Get file structure** - Parse all frames, components, layers +2. **Identify interactive elements** - Look for buttons, modals, inputs +3. **Map to events** - Create event specifications +4. 
**Post comments** - Annotate Figma with event documentation linked to node IDs + +### Workflow with Figma MCP + +``` +Step 1: User provides Figma URL + Input: "https://www.figma.com/design/ABC123/Camera-Angle-Feature" + Extract: file_key = "ABC123" + +Step 2: Check MCP connection + ListMcpResourcesTool(server="figma") + → If not available, prompt user to configure Figma MCP + +Step 3: Read Figma file + Use Figma MCP to get file structure + → Returns JSON with pages, frames, components, layers + +Step 4: Parse interactive elements + From file structure, identify: + - Buttons (type: "INSTANCE", name contains "Button") + - Modals (type: "FRAME", name contains "Modal") + - Inputs (type: "INSTANCE", name contains "Input") + - Extract node_id for each element + +Step 5: Map events + For each interactive element: + - Create event specification + - Determine event type (ui_interacted, view_presented, etc.) + - Define properties + +Step 6: Annotate Figma (if requested) + Use Figma MCP to post comments: + - Link comment to specific node_id + - Include event documentation in comment text + - Format: "📊 EVENT: {name}\nType: {type}\nProperties: {props}" + +Step 7: Generate output + - Event registry YAML + - V3 specification document (if full mode) + - Summary report +``` + +## Identifying Events from Figma + +### Frontend Events (V3 Schema) + +| Figma Element | Event Type | What to Look For | +|---------------|------------|------------------| +| Entry buttons | `flow_started` | Buttons that start a journey, menu items opening wizards | +| Modals/Popups | `view_presented`/`view_dismissed` | Overlay screens, dialogs, dropdown menus, toasts | +| Clickable items | `ui_interacted` | Buttons, links, tabs, inputs, toggles, drag targets | +| Backend ops | `process_started`/`process_ended` | Save, upload, generate, delete buttons | + +### Backend Events (V3 Schema) + +| UI Pattern | Backend Event | What to Look For | +|------------|---------------|------------------| +| Generate button 
| `be_flow_started`/`be_flow_ended` | Any AI generation (video, image, text) triggers a backend flow | +| Loading/progress | `be_process_started`/`be_process_ended` | Individual backend steps within a flow | +| Token cost | `be_token_charge_started`/`be_token_charge_ended` | Operations that consume tokens | + +**FE→BE Linking Chain:** +When a feature involves AI generation, document the full correlation: +`ui_interacted` → `process_started (generate)` → `be_flow_started` (via `fe_process_id`) → `be_process_*` → `be_flow_ended` → `process_ended (fetch with generation_id)` + +See [BACKEND_EVENTS_WITH_FE_LINKING.md](resources/docs/analytics-global-web-v3/BACKEND_EVENTS_WITH_FE_LINKING.md) for the complete pattern. + +### LTX Studio Events (Legacy/Production Schema) + +For production LTX Studio events (non-V3): + +| UI Element | Event Type | Naming Pattern | Example | +|-----------|------------|----------------|---------| +| Button | click | `click_{feature}_{action}` | `click_camera_angle_apply` | +| Toggle/Switch | toggle | `toggle_{setting}` | `toggle_audio_enabled` | +| Dropdown | select | `select_{option_type}` | `select_camera_angle` | +| Modal open | open_modal | `open_{modal_name}_modal` | `open_style_reference_modal` | +| Screen/Page view | view | `view_{screen_name}` | `view_storyboard_builder` | +| Generation | generate | `generate_{asset_type}` | `generate_video` | + +### Project State Triggers + +Include `project_state` events for: +- **Export** - User exports a project (video, image, PDF) +- **Share** - User shares a project + +## Figma Element Parsing + +### Component Name Patterns + +| Pattern | Likely Event | Example | +|---------|-------------|---------| +| `*Button` | click/interaction | `GenerateButton` → `ui_interacted` | +| `*Modal` | view presented/dismissed | `StyleReferenceModal` → `view_presented` | +| `*Toggle` | interaction | `AudioToggle` → `ui_interacted` | +| `*Dropdown` | interaction | `CameraAngleDropdown` → `ui_interacted` | +| 
`*Input` | interaction | `PromptInput` → `ui_interacted` | + +### Frame Name Patterns + +| Pattern | Event Type | Example | +|---------|-----------|---------| +| `*Screen` | view event | `StoryboardBuilderScreen` → `view_presented` | +| `*Page` | view event | `HomePage` → `view_presented` | +| `*Modal` | view event | `CameraAngleModal` → `view_presented` | +| `*Panel` | view event | `SettingsPanel` → `view_presented` | + +### Prototype Connections + +- **Click trigger** → `ui_interacted` or flow/process start +- **Frame transition** → `view_presented` at destination +- **Overlay** → `view_presented` for modal + +## Decision Rules + +**DO track:** +- ✅ Primary actions (Generate, Apply, Save, Download) +- ✅ Navigation changes (page/screen transitions) +- ✅ Modal/dialog open/close +- ✅ Setting changes (toggles, dropdowns) +- ✅ Feature-specific interactions +- ✅ Multi-step flow entry points (`flow_started`) +- ✅ Async operations (`process_started`/`process_ended`) + +**DON'T track:** +- ❌ Hover states (too noisy) +- ❌ Every keystroke in text input (track on blur or submit) +- ❌ Tooltip opens +- ❌ Purely visual animations +- ❌ Internal state changes not visible to user + +## Output Options + +### Option A: V3 Specification Document + +Create a comprehensive spec following [TEMPLATE.md](TEMPLATE.md): + +```markdown +# Frontend Example: [Feature Name] + +## Scenarios +## Event Summary by Type +## Flow [N]: [Flow Name] + - ASCII flow chart + - Event sequence table + - Payload examples (with all required fields) +## Standalone Actions +## Cancel/Close Scenarios +## ID Map +## Observed Run Log Template +## LLM Implementation Guide +``` + +**Save to:** `resources/docs/analytics-global-web-v3/_FRONTEND_FLOWS.md` + +### Option B: Event Registry YAML + Figma Annotations + +Create YAML entry for `shared/event-registry.yaml`: + +```yaml +{feature_name}: + description: "{Feature description}" + figma_url: "{figma_file_url}" + events: + - name: "{event_name}" + type: 
"{usage|generation|navigation}" + status: "planned" + description: "Fired when {user action}" + category: "{category}" + figma_node_id: "{node_id}" + figma_frame: "{frame_name}" + properties: + - "{property_1}" + - "{property_2}" +``` + +**Figma Annotation Format:** +``` +📊 EVENT TRACKING + +Event: {event_name} +Type: {type} +Category: {category} + +Description: {When this fires} + +Properties: +- {property_1} +- {property_2} + +Schema: V3 / Legacy +Status: Planned +``` + +Add via Figma MCP: +``` +figma.post_comment( + file_key="{file_key}", + message="📊 Event: {event_name}\n...", + client_meta={"node_id": "{node_id}"} +) +``` + +## Event Types Reference + +### V3 Schema Events + +| Event | When to Use | Table | +|-------|-------------|-------| +| `ui_interacted` | Every click, type, toggle, drag/drop | user_interaction | +| `view_presented` / `view_dismissed` | Modal/menu/toast opens/closes | view | +| `flow_started` / `flow_ended` | Multi-step journey begins/ends | flow | +| `process_started` / `process_ended` | Async operation begins/completes | process | +| `be_flow_started` / `be_flow_ended` | Backend workflow begins/completes | backend_flow | +| `be_process_started` / `be_process_ended` | Backend step begins/completes | backend_process | +| `be_token_charge_started` / `be_token_charge_ended` | Token consumption | backend_token_charge | +| `project_state` | Project snapshot at export/share | (special) | + +### ID Linking (V3 Schema) + +| ID | Created By | Passed To | Purpose | +|----|------------|-----------|---------| +| `flow_id` | `flow_started` | All events until `flow_ended` | Groups multi-step journey | +| `interaction_id` | `ui_interacted` | `source_interaction_id` in process | Links process to trigger | +| `presentation_id` | `view_presented` | Matching `view_dismissed` + interactions | Links view lifecycle | +| `process_id` | `process_started` | Matching `process_ended` | Pairs async operation | +| `fe_process_id` | FE `process_started` | BE 
`be_flow_started` | FE→BE correlation | +| `generation_id` | BE flow | FE `process_ended` (fetch) | Links generation request to result | +| `task_id` | BE `be_process_started` | Retry processes | Groups retry attempts | + +For complete ID linking rules, see [IMPLEMENTATION_RULES.md](IMPLEMENTATION_RULES.md). + +## Resources + +| File | Purpose | +|------|---------| +| [IMPLEMENTATION_RULES.md](IMPLEMENTATION_RULES.md) | V3 event types, naming conventions, ID linking, required fields | +| [TEMPLATE.md](TEMPLATE.md) | Full copyable template for V3 specification documents | +| `shared/event-registry.yaml` | LTX Studio production event registry | +| `shared/bq-schema.md` | Production BigQuery table schema | +| B2B_EVENTS_GLOSSARY.md | Complete V3 event reference | +| ENUM_REGISTRY.md | Controlled vocabulary for V3 field values | +| BACKEND_EVENTS_WITH_FE_LINKING.md | Backend event model and FE-BE linking patterns | + +## Workflow Example: Camera Angle Feature + +### Input +- Figma URL: `https://figma.com/file/ABC123/Camera-Angle-Feature` +- Mode: Figma Annotation + Registry + +### Process + +1. **Read Figma** + ``` + figma.get_file(file_key="ABC123") + ``` + - Frames: Toolbar, CameraAngleModal, PreviewScreen + - Components: CameraAngleButton, AngleDropdown, ApplyButton + +2. **Map Events** + - Toolbar → `click_camera_angle` (opens modal) + - Modal appears → `open_camera_angle_modal` (view event) + - Dropdown → `select_camera_angle` (selection) + - Apply → `apply_camera_angle` (applies setting) + - Close → `close_camera_angle_modal` + +3. **Write Annotations** + ``` + figma.post_comment( + file_key="ABC123", + message="📊 EVENT: click_camera_angle\nFired when user clicks Camera Angle button", + client_meta={"node_id": "123:456"} + ) + ``` + +4. 
**Generate YAML** + ```yaml + camera_angle: + description: "Camera angle transformation feature" + figma_url: "https://figma.com/file/ABC123" + events: + - name: "click_camera_angle" + type: "usage" + status: "planned" + figma_node_id: "123:456" + ``` + +### Output +- ✅ 5 events mapped +- ✅ 5 comments added to Figma +- ✅ Event registry YAML generated + +## Checklist + +Before finalizing (V3 Spec Mode): + +- [ ] All entry points have `flow_started` events +- [ ] All modals/popups have `view_presented`/`view_dismissed` pairs +- [ ] All user actions have `ui_interacted` events +- [ ] All async operations have `process_started`/`process_ended` pairs +- [ ] Cancel paths include `flow_ended` with `result: cancelled` +- [ ] ID linking is documented (flow_id, interaction_id, presentation_id, process_id) +- [ ] FE→BE linking documented for generation flows (fe_process_id, generation_id) +- [ ] Payload examples include all required fields +- [ ] Naming follows snake_case conventions +- [ ] Backend events documented for any AI generation flows +- [ ] `project_state` included for export/share operations +- [ ] New field values checked against ENUM_REGISTRY.md +- [ ] Spec saved to `resources/docs/analytics-global-web-v3/` + +Before finalizing (Figma Annotation Mode): + +- [ ] All interactive elements identified +- [ ] Event names follow naming conventions +- [ ] Comments linked to correct node_id +- [ ] Properties list includes key dimensions for analysis +- [ ] YAML entry follows event-registry.yaml structure +- [ ] No duplicate events in registry + +## Rules + +### Figma Interaction + +- **DO** use Figma MCP tools (never scrape or use unofficial APIs) +- **DO** parse component and frame names for context +- **DO** follow prototype connections to understand flows +- **DO** check existing comments before adding new ones +- **DO** link comments to specific nodes using `node_id` in `client_meta` +- **DO NOT** modify the design itself +- **DO NOT** add annotations to 
non-interactive elements + +### Event Mapping (V3 Schema) + +- **DO** follow V3 event types (ui_interacted, view_presented, flow_started, process_started) +- **DO** document FE→BE linking for generation flows +- **DO** include backend events for AI operations +- **DO** specify all required fields per event type (see IMPLEMENTATION_RULES.md) +- **DO** use ENUM_REGISTRY.md for field values +- **DO** pair all start/end events correctly +- **DO NOT** skip ID linking documentation +- **DO NOT** invent new event types (use standard V3 events) + +### Event Mapping (LTX Production Schema) + +- **DO** follow naming conventions from event-registry.yaml +- **DO** use lowercase_with_underscores +- **DO** start event names with verbs (click_, generate_, view_) +- **DO** include modal open AND close events +- **DO NOT** track hover or focus changes +- **DO NOT** create overly verbose names + +## Anti-Patterns + +| Anti-Pattern | Why | Do this instead | +|--------------|-----|-----------------| +| Mapping every UI component | Too granular, creates noise | Only track meaningful user actions | +| Generic comment on full frame | Doesn't link to specific element | Link to specific node_id | +| Not checking existing comments | Creates duplicate annotations | Use `get_comments()` first | +| Vague descriptions | Engineers won't know what to track | Be specific about trigger and context | +| Missing required fields in V3 spec | Fails schema validation | Check IMPLEMENTATION_RULES.md | +| Wrong event type for action | Breaks V3 patterns | Use ui_interacted for clicks, not flow_started | +| Not documenting FE→BE link | Cannot trace generation flows | Always document fe_process_id and generation_id | +| Skipping backend events | Incomplete analytics for AI features | Include be_flow and be_process for generations | +| Not pairing start/end events | Breaks lifecycle tracking | Every _started needs matching _ended | + +## Common Figma UI Patterns → V3 Events + +### Button in Toolbar → Flow 
Entry +``` +Component: GenerateButton +Frame: MainToolbar +Events: + - ui_interacted (button click) + - flow_started (if multi-step journey) + - process_started (if async operation) +``` + +### Modal Dialog Flow +``` +Frame: StyleReferenceModal +Events: + - view_presented (modal opens) + - ui_interacted (upload button) + - process_started (upload process) + - process_ended (upload complete) + - view_dismissed (modal closes) +``` + +### Generation Flow (FE + BE) +``` +Frontend: + - ui_interacted (Generate button click) + - process_started (generate, with process_id) + - process_ended (returns immediately with request accepted) + +Backend (triggered by FE process): + - be_flow_started (fe_process_id links to FE) + - be_process_started (actual generation) + - be_token_charge_started/ended + - be_process_ended (generation complete) + - be_flow_ended (with generation_id) + +Frontend (fetch result): + - process_started (fetch, with generation_id) + - process_ended (fetch complete) +``` diff --git a/agents/events-agent/map-events/map-events-skill/TEMPLATE.md b/agents/events-agent/map-events/map-events-skill/TEMPLATE.md new file mode 100644 index 0000000..15655f0 --- /dev/null +++ b/agents/events-agent/map-events/map-events-skill/TEMPLATE.md @@ -0,0 +1,425 @@ +# Events Specification Template + +Copy this template when creating a new feature events specification. + +--- + +```markdown +# Frontend Example: [Feature Name] [Action] + +This document is a frontend-only example for [Feature Name] flows in LTX Studio. It maps user actions to the expected frontend events and shows minimal payloads with linking IDs. 
+ +Scope: +- Frontend events only (`ui_interaction`, `view_*`, `flow_*`, `process_*`) +- Based on the V3 schemas: + - `web_ltx_frontend_user_interaction.avsc` for UI interactions + - `web_ltx_frontend_view.avsc` for view presented/dismissed + - `web_ltx_frontend_flow.avsc` for flow started/ended + - `web_ltx_frontend_process.avsc` for process started/ended +- Backend events documented where AI generation is involved (see Backend Events section) + +## Scenarios + +This document covers the following user flows: + +1. **[Flow 1 Name]** - [Brief description] +2. **[Flow 2 Name]** - [Brief description] +3. **[Standalone Actions]** - [Brief description] + +--- + +## Event Summary by Type + +| Event Type | Schema | When to Report | +| --- | --- | --- | +| `ui_interaction` | `web_ltx_frontend_user_interaction.avsc` | Every click, type, toggle, drag/drop | +| `view_presented` | `web_ltx_frontend_view.avsc` | Modals/menus/toasts open | +| `view_dismissed` | `web_ltx_frontend_view.avsc` | Modals/menus close | +| `flow_started` | `web_ltx_frontend_flow.avsc` | Begin multi-step user journey | +| `flow_ended` | `web_ltx_frontend_flow.avsc` | Complete or cancel multi-step journey | +| `process_started` | `web_ltx_frontend_process.avsc` | Async operation begins (upload, save, delete) | +| `process_ended` | `web_ltx_frontend_process.avsc` | Async operation completes (with result) | + +### Event Triggers (General Rules) + +1. **For every user interaction** (click, typing, drag/drop), report a `ui_interaction` event +2. **When a new view is presented or dismissed**, report `view_presented` / `view_dismissed` with a `source` record +3. **When a flow starts or ends**, report `flow_started` / `flow_ended` with a `source` record +4. **If an async process is triggered**, report `process_started` / `process_ended` with `source_interaction_id` linkage +5. **All events within a flow should include the `flow_id`** of the currently active flow +6. 
**All interactions within a view should include the `presentation_id`** of the containing view + +### Source Field + +Flow and view events use a nested `source` record to describe what triggered them: + +` ``json +{ + "source": { + "type": "user_interaction", + "name": "[ui_item_name]", + "location": "[ui_item_location]" + } +} +` `` + +| Field | Description | Examples | +| --- | --- | --- | +| `source.type` | Category of the trigger | `user_interaction` or `auto` | +| `source.name` | Specific identifier of the trigger | `[example_name]` | +| `source.location` | UI section/area where trigger occurred | `[example_location]` | + +--- + +## Flow 1: [Flow Name] + +[Brief description of what the user does in this flow.] + +### Flow Chart: [Flow Name] + +` `` +┌─────────────────────────────────────────────────────────────────────────────────────────┐ +│ [FLOW NAME] FLOW │ +│ flow_id: │ +│ flow_name: [flow_name] │ +│ │ +│ Entry Point: [Entry Point Description] │ +└─────────────────────────────────────────────────────────────────────────────────────────┘ + +╔═══════════════════════════════════════════════════════════════════════════════════════════╗ +║ STEP 1: [Step Name] ║ +╠═══════════════════════════════════════════════════════════════════════════════════════════╣ +║ ║ +║ [User Action] [Events Reported] ║ +║ ───────────── ───────────────── ║ +║ ║ +║ [Action description] ──► ui_interaction (item: [item_name]) ║ +║ │ │ ║ +║ │ ▼ ║ +║ │ flow_started ([flow_name]) ║ +║ │ │ ║ +║ ▼ ▼ ║ +║ [Next state] ──► view_presented ([view_name]) ║ +║ ║ +╚═══════════════════════════════════════════════════════════════════════════════════════════╝ + │ + ▼ +╔═══════════════════════════════════════════════════════════════════════════════════════════╗ +║ STEP 2: [Step Name] ║ +╠═══════════════════════════════════════════════════════════════════════════════════════════╣ +║ ║ +║ [Continue pattern...] 
║ +║ ║ +╚═══════════════════════════════════════════════════════════════════════════════════════════╝ +` `` + +### Expected Event Sequence (Flow 1) + +| Step | User action | Event(s) | Key fields | +| --- | --- | --- | --- | +| 1 | [Action] | `ui_interaction` | `interaction_id`, `ui_item_name: [name]`, `ui_item_location: [location]` | +| 2 | Flow starts | `flow_started` | `flow_id`, `flow_name: [name]`, `source.type: user_interaction` | +| 3 | [View] opens | `view_presented` | `presentation_id`, `flow_id`, `view_name: [name]`, `view_type: modal` | +| ... | ... | ... | ... | + +### Payload Examples (Flow 1) + +#### 1) `ui_interaction` ([action description]) +` ``json +{ + "app_name": "LTX-Studio", + "event_name": "ui_interaction", + "working_session_id": "", + "interaction_id": "", + "interaction_method": "click", + "ui_item_name": "[item_name]", + "ui_item_type": "button", + "ui_item_location": "[location]", + "presentation_id": "", + "project_id": "", + "url": "https://app.ltx.studio/projects//[path]", + "device_timestamp": 1702300400000, + "event_timestamp": 1702300400000 +} +` `` + +#### 2) `flow_started` +` ``json +{ + "app_name": "LTX-Studio", + "event_name": "flow_started", + "working_session_id": "", + "flow_id": "", + "flow_name": "[flow_name]", + "source": { + "type": "user_interaction", + "name": "[ui_item_name]", + "location": "[ui_item_location]" + }, + "project_id": "", + "device_timestamp": 1702300401000, + "event_timestamp": 1702300401000 +} +` `` + +#### 3) `view_presented` +` ``json +{ + "app_name": "LTX-Studio", + "event_name": "view_presented", + "working_session_id": "", + "presentation_id": "", + "source": { + "type": "user_interaction", + "name": "[ui_item_name]", + "location": "[ui_item_location]" + }, + "flow_id": "", + "view_name": "[view_name]", + "view_type": "modal", + "project_id": "", + "device_timestamp": 1702300402000, + "event_timestamp": 1702300402000 +} +` `` + +#### 4) `process_started` +` ``json +{ + "app_name": "LTX-Studio", + 
"event_name": "process_started", + "working_session_id": "", + "process_id": "", + "process_type": "[type]", + "process_asset_type": "[asset_type]", + "source_interaction_id": "", + "flow_id": "", + "project_id": "", + "device_timestamp": 1702300500000, + "event_timestamp": 1702300500000 +} +` `` + +#### 5) `process_ended` +` ``json +{ + "app_name": "LTX-Studio", + "event_name": "process_ended", + "working_session_id": "", + "process_id": "", + "process_type": "[type]", + "process_asset_type": "[asset_type]", + "source_interaction_id": "", + "flow_id": "", + "result": "success", + "asset_id": "[asset_type]/[id]", + "project_id": "", + "device_timestamp": 1702300510000, + "event_timestamp": 1702300510000 +} +` `` + +#### 6) `view_dismissed` +` ``json +{ + "app_name": "LTX-Studio", + "event_name": "view_dismissed", + "working_session_id": "", + "presentation_id": "", + "source": { + "type": "user_interaction", + "name": "[trigger_name]", + "location": "[view_name]" + }, + "flow_id": "", + "view_name": "[view_name]", + "view_type": "modal", + "project_id": "", + "device_timestamp": 1702300511000, + "event_timestamp": 1702300511000 +} +` `` + +#### 7) `flow_ended` +` ``json +{ + "app_name": "LTX-Studio", + "event_name": "flow_ended", + "working_session_id": "", + "flow_id": "", + "flow_name": "[flow_name]", + "source": { + "type": "auto", + "name": "[completion_trigger]", + "location": null + }, + "result": "success", + "project_id": "", + "device_timestamp": 1702300513000, + "event_timestamp": 1702300513000 +} +` `` + +--- + +## Standalone Actions (No Flow) + +These actions are simpler operations that do not require a multi-step flow wrapper. 
+ +### [Action Name] + +[Brief description] + +#### Event Sequence + +| Step | User action | Event(s) | Key fields | +| --- | --- | --- | --- | +| 1 | [Action] | `ui_interaction` | `ui_item_name: [name]` | +| 2 | [Result] | `process_started` | `process_type: [type]` | +| 3 | [Complete] | `process_ended` | `result: success` | + +--- + +## Cancel/Close Scenarios + +### Cancel [Flow] via Cancel Button + +#### Event Sequence + +| Step | User action | Event(s) | Key fields | +| --- | --- | --- | --- | +| 1 | Click "Cancel" button | `ui_interaction` | `ui_item_name: cancel_button` | +| 2 | Modal closes | `view_dismissed` | `source.name: cancel_button` | +| 3 | Flow ends | `flow_ended` | `result: cancelled` | + +#### Payload Example (`flow_ended` cancelled) +` ``json +{ + "app_name": "LTX-Studio", + "event_name": "flow_ended", + "working_session_id": "", + "flow_id": "", + "flow_name": "[flow_name]", + "source": { + "type": "user_interaction", + "name": "cancel_button", + "location": "[modal_name]" + }, + "result": "cancelled", + "project_id": "", + "device_timestamp": 1702301102000, + "event_timestamp": 1702301102000 +} +` `` + +--- + +## ID Map + +### ID Relationships + +` `` +┌─────────────────────────────────────────────────────────────────────────────────────────┐ +│ ID LINKING DIAGRAM │ +├─────────────────────────────────────────────────────────────────────────────────────────┤ +│ │ +│ working_session_id ─────────────────────────────────────────────────────────────────► │ +│ (constant across entire user session) │ +│ │ +│ flow_id ────────────────────────────────────────────────────────────────────────────► │ +│ (links flow_started → all events within flow → flow_ended) │ +│ │ +│ presentation_id ──► links view_presented → interactions within view → view_dismissed │ +│ │ +│ interaction_id ──┐ │ +│ ├──► source_interaction_id (in process_started/ended) │ +│ │ +│ process_id ─────────► links process_started → process_ended │ +│ │ 
+└─────────────────────────────────────────────────────────────────────────────────────────┘ +` `` + +### Sample ID Map + +Fill in with actual values during testing: + +` `` +working_session_id: +project_id: + +# Flow IDs +flow_id_[flow1]: + +# Presentation IDs +presentation_id_[view1]: + +# Interaction IDs +interaction_id_[action1]: + +# Process IDs +process_id_[process1]: +` `` + +--- + +## Observed Run Log Template + +Use this template to capture actual events during testing. + +### [Flow Name] + +| Step | What I did in the UI | Event(s) observed | Notes | +| --- | --- | --- | --- | +| 1 | [Action] | `[events]` | [Notes] | +| 2 | [Action] | `[events]` | [Notes] | +| ... | ... | ... | ... | + +--- + +## LLM Implementation Guide + +### General Prompt Template + +` `` +I need to implement analytics events for the [Feature Name] feature. + +Context: +- We use the V3 analytics schema (web_ltx_frontend_*.avsc schemas) +- Events are sent via our analytics service at [path] + +Please implement the following events for [SPECIFIC FLOW]: +1. Review the event specification in [SPEC_FILE].md +2. Find existing analytics patterns in the codebase +3. Add the required events following established patterns +4. Ensure all IDs are properly linked + +Key requirements: +- Generate UUIDs for flow_id, interaction_id, presentation_id, process_id +- Pass flow_id to all events within a flow +- Pass presentation_id to all interactions within a view +- Link process events via source_interaction_id +- Use exact field names and values from the spec +` `` + +### Code Review Checklist + +` `` +Review the analytics implementation against this checklist: + +1. ID Generation & Linking: + - [ ] flow_id generated at flow_started and passed to all events + - [ ] interaction_id generated for each ui_interaction + - [ ] presentation_id links view_presented to view_dismissed and to interactions within the view + - [ ] process_id links process_started to process_ended + +2. 
Event Completeness: + - [ ] All user actions trigger ui_interaction events + - [ ] All modals/popups have view_presented/view_dismissed pairs + - [ ] All async operations have process_started/process_ended pairs + - [ ] Flows have flow_started/flow_ended pairs + +3. Cancel/Error Handling: + - [ ] Cancel actions trigger flow_ended with result: "cancelled" + - [ ] Failed processes trigger process_ended with result: "failure" +` `` +``` diff --git a/agents/events-agent/validate-events/SKILL.md b/agents/events-agent/validate-events/SKILL.md new file mode 100644 index 0000000..f43a393 --- /dev/null +++ b/agents/events-agent/validate-events/SKILL.md @@ -0,0 +1,463 @@ +--- +name: validate-events +description: Validates event tracking by checking if events are firing in production, comparing registry vs actual data, and detecting tracking issues. +tags: [events, validation, qa, production] +--- + +# Validate Events Agent + +## When to use + +- "Validate event tracking for {feature}" +- "Check if {event_name} is firing" +- "Are timeline events being tracked?" +- "Compare registry events vs production" +- "Find events not in registry" +- "Detect tracking issues" +- "QA event implementation" + +## What it does + +Validates event tracking by: +1. **Registry vs Production** - Compares documented events with actual firing events +2. **Data Quality** - Checks for NULL values, invalid data, schema issues +3. **Volume Sanity** - Validates event counts are reasonable +4. **Recency** - Checks when events last fired +5. **Coverage** - Identifies missing or undocumented events + +Outputs: Validation report with status for each event (✓ OK, ⚠️ Warning, ❌ Issue). + +## Steps + +### 1. Gather Requirements + +Ask the user: +- **Scope**: Specific feature, list of events, or full registry validation? +- **Time window**: Last 7 days, 30 days, or custom range? +- **Validation depth**: Quick check (firing yes/no) or deep validation (data quality, properties)? 
+- **Expected behavior**: For new features, when was deployment? What's expected volume? + +### 2. Read Shared Knowledge + +Before validating: +- **`shared/event-registry.yaml`** - Get list of events to validate +- **`shared/bq-schema.md`** - Understand table schema and columns +- **`shared/product-context.md`** - Understand expected user volumes + +Key learnings: +- Table: `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` +- Event columns: `action_name` (primary), `action_category`, `event_name_detailed` +- Partition: `action_ts` (TIMESTAMP) - always filter with `date(action_ts)` +- LT team already excluded (`is_lt_team IS FALSE` applied at table level) + +### 3. Query Production Data + +#### A. Check if Events are Firing + +For specific feature: +```sql +SELECT + action_name, + action_category, + COUNT(DISTINCT lt_id) AS unique_users, + COUNT(*) AS event_count, + MIN(date(action_ts)) AS first_seen, + MAX(date(action_ts)) AS last_seen, + -- Check for NULL values in key columns + COUNTIF(action_name IS NULL) AS null_action_name, + COUNTIF(action_category IS NULL) AS null_category, + COUNTIF(lt_id IS NULL) AS null_lt_id +FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` +WHERE date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) + AND ( + action_name LIKE '%{feature}%' + OR action_category LIKE '%{feature}%' + ) +GROUP BY action_name, action_category +ORDER BY event_count DESC +``` + +For specific event list: +```sql +SELECT + action_name, + COUNT(DISTINCT lt_id) AS unique_users, + COUNT(*) AS event_count, + MAX(date(action_ts)) AS last_seen, + -- Recent activity (last 7 days) + COUNTIF(date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)) AS events_last_7d +FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` +WHERE action_name IN ( + 'event_name_1', + 'event_name_2', + 'event_name_3' + ) + AND date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) +GROUP BY action_name +ORDER BY action_name +``` + +#### B. 
Registry vs Production Comparison + +```sql +WITH registry_events AS ( + -- List events from registry (manual input) + SELECT event_name FROM UNNEST([ + 'click_camera_angle', + 'open_camera_angle_modal', + 'apply_camera_angle', + 'generate_video_with_angle' + ]) AS event_name +), +production_events AS ( + SELECT + action_name AS event_name, + COUNT(DISTINCT lt_id) AS users, + COUNT(*) AS events, + MAX(date(action_ts)) AS last_seen + FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` + WHERE date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) + AND action_name LIKE '%camera_angle%' + GROUP BY action_name +) +SELECT + COALESCE(r.event_name, p.event_name) AS event_name, + CASE + WHEN r.event_name IS NULL THEN '⚠️ NOT IN REGISTRY' + WHEN p.event_name IS NULL THEN '❌ NOT FIRING' + WHEN p.last_seen < DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) THEN '⚠️ NO RECENT ACTIVITY' + ELSE '✓ OK' + END AS status, + p.users, + p.events, + p.last_seen +FROM registry_events r +FULL OUTER JOIN production_events p ON r.event_name = p.event_name +ORDER BY + CASE + WHEN r.event_name IS NULL THEN 1 -- Not in registry first + WHEN p.event_name IS NULL THEN 2 -- Not firing second + ELSE 3 -- OK events last + END, + event_name +``` + +### 4. Validate Data Quality + +For each event, check: + +#### A. 
NULL Values Check +```sql +SELECT + action_name, + -- Key columns NULL rates + SAFE_DIVIDE(COUNTIF(action_category IS NULL), COUNT(*)) * 100 AS null_category_pct, + SAFE_DIVIDE(COUNTIF(lt_id IS NULL), COUNT(*)) * 100 AS null_lt_id_pct, + SAFE_DIVIDE(COUNTIF(platform IS NULL), COUNT(*)) * 100 AS null_platform_pct, + -- For generation events, check process/model fields + SAFE_DIVIDE(COUNTIF(process_name IS NULL), COUNT(*)) * 100 AS null_process_pct, + SAFE_DIVIDE(COUNTIF(model_gen_type IS NULL), COUNT(*)) * 100 AS null_model_type_pct +FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` +WHERE action_name = '{event_name}' + AND date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) +GROUP BY action_name +``` + +**Red flags:** +- NULL `action_category` > 5% +- NULL `lt_id` > 0% (critical - breaks user analysis) +- NULL feature-specific properties > 20% + +#### B. Event Volume Sanity +```sql +SELECT + date(action_ts) AS dt, + COUNT(DISTINCT lt_id) AS unique_users, + COUNT(*) AS event_count, + SAFE_DIVIDE(COUNT(*), COUNT(DISTINCT lt_id)) AS events_per_user +FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` +WHERE action_name = '{event_name}' + AND date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 14 DAY) +GROUP BY dt +ORDER BY dt DESC +``` + +**Red flags:** +- Sudden drop to 0 events (tracking broke) +- Spike > 10x average (duplicate tracking) +- events_per_user > 100 (possible infinite loop) +- Consistent 0 users but events firing (lt_id not captured) + +#### C. 
Date Gaps Detection +```sql +WITH date_spine AS ( + SELECT dt + FROM UNNEST(GENERATE_DATE_ARRAY( + DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY), + DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY) + )) AS dt +), +event_dates AS ( + SELECT + date(action_ts) AS dt, + COUNT(*) AS event_count + FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` + WHERE action_name = '{event_name}' + AND date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) + GROUP BY dt +) +SELECT + d.dt, + COALESCE(e.event_count, 0) AS event_count, + CASE WHEN e.event_count IS NULL THEN '❌ MISSING' ELSE '✓ OK' END AS status +FROM date_spine d +LEFT JOIN event_dates e ON d.dt = e.dt +ORDER BY d.dt DESC +``` + +**Red flags:** +- Multiple consecutive days with 0 events +- Events stopped firing after specific date (deployment issue) + +#### D. Property Value Validation + +For generation events: +```sql +SELECT + action_name, + model_gen_type, + COUNT(*) AS event_count, + COUNT(DISTINCT lt_id) AS unique_users +FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` +WHERE action_name = 'generate_video' + AND action_category = 'generations' + AND date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) +GROUP BY action_name, model_gen_type +ORDER BY event_count DESC +``` + +**Check:** +- Are expected property values present? (t2v, i2v, v2v for video generation) +- Are there unexpected/invalid values? (empty strings, "null", "undefined") +- Distribution makes sense? (not all events have the same value) + +### 5. 
Detect New or Undocumented Events + +Find events firing in production but not in registry: + +```sql +WITH recent AS ( + SELECT DISTINCT action_name + FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` + WHERE date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) +), +registry AS ( + -- Manually list events from registry + SELECT event_name FROM UNNEST([ + 'documented_event_1', + 'documented_event_2' + ]) AS event_name +) +SELECT + r.action_name AS undocumented_event, + COUNT(*) AS event_count, + COUNT(DISTINCT a.lt_id) AS unique_users, + MAX(date(a.action_ts)) AS last_seen +FROM recent r +LEFT JOIN registry reg ON r.action_name = reg.event_name +INNER JOIN `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` a + ON r.action_name = a.action_name + AND date(a.action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) +WHERE reg.event_name IS NULL + AND r.action_name NOT LIKE '%test%' -- Exclude obvious test events +GROUP BY r.action_name +ORDER BY event_count DESC +LIMIT 50 +``` + +### 6. Generate Validation Report + +Format findings as: + +``` +## Event Validation Report: {feature_name} + +**Validation Date:** {date} +**Time Window:** Last {X} days +**Events Checked:** {count} + +### Summary + +- ✓ **{X} events OK** - Firing correctly with good data quality +- ⚠️ **{Y} events with warnings** - Firing but with data quality issues +- ❌ **{Z} events failing** - Not firing or critical issues + +### Detailed Results + +| Event Name | Status | Users (30d) | Events (30d) | Last Seen | Issues | +|------------|--------|-------------|--------------|-----------|--------| +| click_camera_angle | ✓ OK | 1,234 | 5,678 | 2026-03-03 | None | +| open_camera_modal | ⚠️ Warning | 456 | 890 | 2026-03-03 | 15% NULL category | +| apply_camera_angle | ✓ OK | 789 | 1,234 | 2026-03-03 | None | +| camera_angle_generation | ❌ Not Firing | 0 | 0 | Never | Event not found in production | + +### Issues Found + +#### ❌ Critical Issues (Block Release) +1. 
**Event: camera_angle_generation** + - **Problem:** Not firing in production + - **Impact:** Cannot track generation usage + - **Action:** Check tracking implementation + +2. **Event: click_camera_angle** + - **Problem:** 100% NULL lt_id values + - **Impact:** Cannot attribute events to users + - **Action:** Fix lt_id capture in tracking code + +#### ⚠️ Warnings (Should Fix) +1. **Event: open_camera_modal** + - **Problem:** 15% NULL action_category + - **Impact:** Some events won't appear in category-based queries + - **Action:** Ensure category is always set + +2. **Event: update_camera_settings** + - **Problem:** Low volume (3 events in 30 days) + - **Impact:** May be broken or rarely used feature + - **Action:** Validate with product team if expected + +#### ℹ️ Info (FYI) +1. **Undocumented Events:** Found 3 events firing but not in registry + - `camera_angle_preview_generated` (234 events) + - `camera_angle_modal_dismissed` (123 events) + - `camera_angle_reset_to_default` (45 events) + - **Action:** Add to event registry + +### Recommendations + +1. **Fix critical issues before release:** + - Implement tracking for `camera_angle_generation` + - Fix lt_id capture for `click_camera_angle` + +2. **Improve data quality:** + - Ensure action_category is set for all events + - Validate low-volume events with product team + +3. **Update documentation:** + - Add 3 undocumented events to registry + - Mark deprecated events in registry +``` + +### 7. 
Present to User + +Show: +- **Summary stats**: OK / Warnings / Failures +- **Detailed table**: All events with validation status +- **Issue breakdown**: Critical / Warnings / Info +- **Recommended actions**: Prioritized by severity + +## Quick Reference Queries + +### Event Firing Status (Quick Check) +```sql +SELECT + action_name, + COUNT(*) AS events_last_7d, + COUNT(DISTINCT lt_id) AS unique_users +FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` +WHERE action_name IN ('event1', 'event2', 'event3') + AND date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) +GROUP BY action_name +``` + +### Data Quality Summary +```sql +SELECT + action_name, + COUNT(*) AS total_events, + COUNTIF(lt_id IS NULL) AS null_lt_id, + COUNTIF(action_category IS NULL) AS null_category, + SAFE_DIVIDE(COUNTIF(lt_id IS NULL), COUNT(*)) * 100 AS null_lt_id_pct, + -- Needed by the HAVING clause below; mirrors the 5% category threshold + SAFE_DIVIDE(COUNTIF(action_category IS NULL), COUNT(*)) * 100 AS null_category_pct +FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` +WHERE action_name LIKE '%{feature}%' + AND date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) +GROUP BY action_name +HAVING null_lt_id_pct > 0 OR null_category_pct > 5 +ORDER BY null_lt_id_pct DESC +``` + +### Event Recency Check +```sql +SELECT + action_name, + MAX(date(action_ts)) AS last_seen, + DATE_DIFF(CURRENT_DATE(), MAX(date(action_ts)), DAY) AS days_since_last_seen, + CASE + WHEN MAX(date(action_ts)) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) THEN '✓ RECENT' + WHEN MAX(date(action_ts)) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) THEN '⚠️ OLD' + ELSE '❌ VERY OLD' + END AS status +FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` +WHERE action_name IN ('event1', 'event2', 'event3') + AND date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY) +GROUP BY action_name +ORDER BY days_since_last_seen +``` + +## Reference Files + +| File | Read when | +|------|-----------| +| `shared/event-registry.yaml` | Always - get list of events to validate | +| `shared/bq-schema.md` | Before writing SQL - understand table schema | +| 
`shared/metric-standards.md` | Before validating volumes - understand expected metrics | + +## Rules + +### Query Best Practices + +- **DO** always filter on `date(action_ts)` partition column +- **DO** use `action_name` as primary event identifier +- **DO** check `event_name_detailed` for generation event variants +- **DO** exclude today's incomplete data: `date(action_ts) < CURRENT_DATE()` +- **DO** use `SAFE_DIVIDE` for percentage calculations +- **DO NOT** filter `is_lt_team` (already excluded at table level) + +### Validation Standards + +- **DO** check both "event exists" AND "data quality is good" +- **DO** validate events fired in last 7 days for recency +- **DO** flag NULL lt_id as critical (breaks all user analysis) +- **DO** flag > 5% NULL in action_category as warning +- **DO** compare expected vs actual event volumes +- **DO NOT** assume low volume = broken (some features are rarely used) +- **DO NOT** validate on partial day data (exclude today) + +### Issue Severity Classification + +**❌ Critical (Block Release):** +- Event not firing at all (0 events in time window) +- 100% NULL lt_id (cannot attribute to users) +- Event stopped firing suddenly (gap > 3 days) +- Volume dropped to 0 after deployment + +**⚠️ Warning (Should Fix):** +- > 5% NULL in action_category or key properties +- Low volume but not zero (< 10 events in 7 days for feature in production) +- Events firing but with unexpected property values +- Undocumented events found in production + +**ℹ️ Info (FYI):** +- Deprecated events still firing (low volume) +- Events with expected low usage +- Minor data quality issues (< 5% NULL in non-critical fields) + +## Anti-Patterns + +| Anti-Pattern | Why | Do this instead | +|--------------|-----|-----------------| +| Only checking if event exists | Misses data quality issues | Check NULL rates, volume sanity, property values | +| Not checking date gaps | Miss tracking outages | Always check for consecutive days with 0 events | +| Validating on 
today's data | Incomplete data gives false negatives | Exclude today: `date(action_ts) < CURRENT_DATE()` | +| Assuming low volume = broken | Some features are rarely used | Validate with product team before flagging | +| Not comparing registry vs production | Miss undocumented events | Always FULL OUTER JOIN to find both sides | +| Flagging every NULL as critical | Not all properties are required | Prioritize: lt_id critical, category warning, others info | diff --git a/agents/events-agent/validate-events/validate-analytics-schemas/SKILL.md b/agents/events-agent/validate-events/validate-analytics-schemas/SKILL.md new file mode 100644 index 0000000..6ebef42 --- /dev/null +++ b/agents/events-agent/validate-events/validate-analytics-schemas/SKILL.md @@ -0,0 +1,901 @@ +--- +name: validate-analytics-schemas +description: Validate LTX Studio V3 analytics event implementations (frontend AND backend) against schemas using BigQuery. Run SQL tests to check required fields, event names, recommendedValues compliance, ID pairing, cross-table linking, and FE-BE correlation. Supports spec-driven validation against Event Spec Documents. Use when testing analytics implementation, validating a DPL, running EDA on event data, or validating schema compliance in integration environment. +--- + +# Analytics Schema Validation + +This skill validates analytics event implementations in BigQuery against the V3 schema definitions — both **frontend** and **backend** events. It supports two modes: + +1. **Generic validation** - Schema compliance checks across all data (required fields, event names, recommended values, ID pairing, cross-table linking, FE-BE correlation) +2. 
**Spec-driven validation** - Targeted checks against a specific Event Spec Document to verify a DPL implementation + +## When to Use + +- Testing new analytics implementations in integration environment (frontend or backend) +- Validating a DPL against an Event Spec Document +- Validating field values match schema `recommendedValues` +- Checking ID pairing (flow_started↔flow_ended, process_started↔process_ended, be_flow_started↔be_flow_ended, be_process_started↔be_process_ended, be_token_charge_started↔be_token_charge_ended) +- Verifying cross-table linking (source_interaction_id, flow_id, fe_process_id, generation_id) +- Validating FE→BE correlation (fe_process_id linking, generation_id matching) +- Checking generation attributes (model_name, process_intent, aspect_ratio) are populated on generation processes +- Validating process_params consistency (expected key-value sets per process_type) +- Running EDA on event data for quality assessment + +## Spec-Driven Validation Mode + +When validating a specific feature's DPL, use this workflow: + +1. **Read the Event Spec Document** from `resources/docs/analytics-global-web-v3/_FRONTEND_FLOWS.md` +2. **Extract expected events** from the spec's "Event Summary" and "Expected Event Sequence" tables (including any backend events) +3. **Generate targeted SQL** that checks for the specific events described in the spec (frontend AND backend tables) +4. **Run queries** against the DPL's integration dataset +5. **If spec includes BE events**, also validate FE→BE correlation (fe_process_id linkage, generation_id matching) +6. **Produce a validation report** using the [VALIDATION_REPORT_TEMPLATE.md](references/VALIDATION_REPORT_TEMPLATE.md) +7. 
**Save report** to `resources/docs/analytics-global-web-v3/_VALIDATION_REPORT.md` + +### Spec-to-Query Generation + +Given a spec's event summary, generate SQL that checks: + +**a) Expected event names appear:** +```sql +-- Check that all expected event_name values from the spec exist +SELECT event_name, COUNT(*) as cnt +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_` +WHERE meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + AND app_version_code = '' +GROUP BY event_name +``` + +**b) Expected field values appear:** +```sql +-- Check specific field values from the spec (e.g., flow_name, ui_item_name) +SELECT flow_name, COUNT(*) as cnt +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_flow` +WHERE meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + AND app_version_code = '' + AND flow_name IN ('', '') +GROUP BY flow_name +``` + +**c) ID pairing for spec'd flows:** +```sql +-- Verify flow_started/flow_ended pairs for the specific flow_name +SELECT + COUNTIF(event_name = 'flow_started') as starts, + COUNTIF(event_name = 'flow_ended') as ends, + COUNTIF(event_name = 'flow_ended' AND result = 'success') as successes, + COUNTIF(event_name = 'flow_ended' AND result = 'cancelled') as cancellations +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_flow` +WHERE meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + AND app_version_code = '' + AND flow_name = '' +``` + +**d) End-to-end source_interaction_id linkage:** +```sql +-- Trace interaction → process chain for the spec'd feature +SELECT + i.ui_item_name, + p.process_type, + p.process_asset_type, + p.event_name as process_event, + p.result +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_user_interaction` i +JOIN `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process` p + ON i.interaction_id = p.source_interaction_id +WHERE i.meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + 
AND i.app_version_code = '' + AND i.ui_item_name IN ('', '') +``` + +**e) Generation attributes populated on generate processes:** +```sql +-- Check that generation processes have model_name, process_intent, aspect_ratio +SELECT + process_type, + process_asset_type, + event_name, + COUNT(*) as total, + COUNTIF(model_name IS NOT NULL) as has_model_name, + COUNTIF(process_intent IS NOT NULL) as has_process_intent, + COUNTIF(aspect_ratio IS NOT NULL) as has_aspect_ratio, + ROUND(COUNTIF(model_name IS NOT NULL) * 100.0 / COUNT(*), 1) as model_name_pct, + ROUND(COUNTIF(process_intent IS NOT NULL) * 100.0 / COUNT(*), 1) as process_intent_pct, + ROUND(COUNTIF(aspect_ratio IS NOT NULL) * 100.0 / COUNT(*), 1) as aspect_ratio_pct +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process` +WHERE meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + AND app_version_code = '' + AND process_type = 'generate' +GROUP BY process_type, process_asset_type, event_name +``` + +**f) process_params consistency for spec'd processes:** +```sql +-- Check that expected process_params keys are present for generation processes +SELECT + p.process_type, + p.process_asset_type, + pp.key, + COUNT(*) as occurrences, + COUNT(DISTINCT p.process_id) as distinct_processes +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process` p, + UNNEST(process_params) AS pp +WHERE p.meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + AND p.app_version_code = '' + AND p.process_type = 'generate' +GROUP BY p.process_type, p.process_asset_type, pp.key +ORDER BY p.process_type, p.process_asset_type, occurrences DESC +``` + +**g) Backend event names appear (if spec includes BE events):** +```sql +-- Check that all expected BE event_name values from the spec exist +SELECT event_name, COUNT(*) as cnt +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_backend_flow` +WHERE meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) +GROUP BY 
event_name +``` + +**h) FE→BE correlation via fe_process_id:** +```sql +-- Verify BE flows are linked to FE processes via fe_process_id +SELECT + bf.flow_id, + bf.fe_process_id, + fp.process_id as matched_fe_process, + fp.process_type +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_backend_flow` bf +LEFT JOIN `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process` fp + ON bf.fe_process_id = fp.process_id +WHERE bf.meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + AND bf.event_name = 'be_flow_started' +``` + +**i) BE flow↔process pairing via flow_id:** +```sql +-- Verify BE processes are linked to their parent BE flow +SELECT + bf.flow_id, + bf.flow_name, + COUNTIF(bp.event_name = 'be_process_started') as process_starts, + COUNTIF(bp.event_name = 'be_process_ended') as process_ends +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_backend_flow` bf +LEFT JOIN `ltx-dwh-stg-raw.analytics_integration.web_ltx_backend_process` bp + ON bf.flow_id = bp.flow_id +WHERE bf.meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + AND bf.event_name = 'be_flow_started' +GROUP BY bf.flow_id, bf.flow_name +``` + +### DPL-Specific Filtering + +When validating a DPL, always filter by: +- **Time range**: `meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL N HOUR)` (adjust based on when DPL was deployed) +- **App version**: `app_version_code = ''` (filters to the specific DPL build) + +## Environment + +``` +Project: ltx-dwh-stg-raw +Dataset: analytics_integration + +Frontend tables: + - web_ltx_frontend_flow + - web_ltx_frontend_process + - web_ltx_frontend_user_interaction + - web_ltx_frontend_view + +Backend tables: + - web_ltx_backend_flow + - web_ltx_backend_process + - web_ltx_backend_token_charge +``` + +## Quick Start + +Run validation queries using bq CLI: + +```bash +bq query --use_legacy_sql=false "YOUR_QUERY" +``` + +## Validation Categories + +### 1. 
Required Fields + +Check that non-nullable schema fields are never null: + +```sql +SELECT + 'table_name' as table_name, + COUNT(*) as total_rows, + COUNTIF(required_field IS NULL) as null_count +FROM `ltx-dwh-stg-raw.analytics_integration.table_name` +WHERE meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) +``` + +**Required fields by table (frontend):** +- flow: `flow_id`, `working_session_id`, `app_name`, `event_name`, `event_timestamp`, `device_timestamp` +- process: `process_id`, `process_type`, `process_asset_type`, `working_session_id`, `app_name`, `event_name` +- user_interaction: `interaction_id`, `working_session_id`, `app_name`, `event_name` +- view: `presentation_id`, `working_session_id`, `app_name`, `event_name` + +**Required fields by table (backend):** +- backend_flow: `flow_id`, `flow_name`, `app_name`, `event_name`, `event_timestamp` +- backend_process: `process_id`, `process_type`, `endpoint`, `endpoint_type`, `app_name`, `event_name`, `event_timestamp` +- backend_token_charge: `payment_id`, `idempotency_key`, `consumable_type`, `product_id`, `tokens_charged`, `app_name`, `event_name`, `event_timestamp` + +### 2. Event Names + +Validate event_name matches schema recommendedValues: + +**Frontend tables:** + +| Table | Valid event_names | +|-------|-------------------| +| flow | `flow_started`, `flow_ended` | +| process | `process_started`, `process_ended` | +| user_interaction | `ui_interacted`, `page_view` | +| view | `view_presented`, `view_dismissed` | + +**Backend tables:** + +| Table | Valid event_names | +|-------|-------------------| +| backend_flow | `be_flow_started`, `be_flow_ended` | +| backend_process | `be_process_started`, `be_process_ended` | +| backend_token_charge | `be_token_charge_started`, `be_token_charge_ended` | + +### 3. Recommended Values + +Cross-reference with [analytics-enum-values skill](../analytics-enum-values/SKILL.md) for current enum values. 
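One way to apply the value lists in this section is to count rows whose value falls outside the list. The query below is a minimal sketch for `process_type` only, assuming the frontend process table and the 24-hour window used elsewhere in this skill. The `IN` list is copied from this section and is an assumption, not the authoritative enum; refresh it from the analytics-enum-values skill before relying on the result.

```sql
-- Sketch: surface process_type values outside the recommended list.
-- The value list is an assumption copied from this section; replace it
-- with the current enum values before drawing conclusions.
SELECT
  process_type,
  COUNT(*) AS cnt
FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process`
WHERE meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR)
  AND (
    process_type IS NULL
    OR process_type NOT IN ('upload', 'generate', 'download', 'export', 'fetch')
  )
GROUP BY process_type
ORDER BY cnt DESC
```

An empty result suggests every row in the window uses a listed value. Because these are recommended values, a non-empty result is a prompt to review the value with the schema owners rather than an automatic failure.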
+ +**Frontend fields to validate:** +- `flow_name`: storyboard_creation, element_creation, project_export, onboarding +- `process_type`: upload, generate, download, export, fetch +- `process_asset_type`: image, video, audio, project, prompt +- `ui_item_type`: button, input, card, tab, menu, toggle, slider, link, icon +- `ui_item_location`: sidebar, topbar, toolbar, canvas, timeline, prompt_bar, etc. +- `view_type`: modal, popup, page, panel, toast +- `interaction_method`: click, type, drag_and_drop +- `result`: success, failure, cancelled, timeout + +**Backend fields to validate:** +- backend_flow `flow_name`: video-generation, image-generation, audio-generation +- backend_flow `asset_type`: image, video, audio, text +- backend_flow `result`: success, failure, cancelled, timeout +- backend_process `process_type`: generate, project +- backend_process `endpoint_type`: ltxv-api, internal-worker, external_vendor +- backend_process `asset_type`: image, video, audio, project, text +- backend_process `result`: success, failure, cancelled, timeout +- backend_token_charge `consumable_type`: LTXStudio_token, LTXStudio_enterprise_token, LTXStudio_veo2, LTXStudio_veo3 +- backend_token_charge `result`: success, failure, cancelled + +### 4. ID Pairing + +Verify start/end events pair correctly. 
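The pairing queries in this section share one shape: collect start events per ID, collect end events per ID, then left-join the two. As a rough sketch of that shape, the hypothetical helper `build_pairing_query` (not part of the repo) prints the basic matched/unmatched query for a given table, ID column, and event-name pair. It leaves out the per-table extras such as `result` or `tokens_charged`.

```bash
# Sketch: print a start/end pairing query for one table.
# build_pairing_query is a hypothetical helper; it only prints SQL and never calls bq.
# Arguments: table, id column, start event name, end event name, [window in hours, default 24]
build_pairing_query() (
  table="$1"; id_col="$2"; start_event="$3"; end_event="$4"; hours="${5:-24}"
  fq="\`ltx-dwh-stg-raw.analytics_integration.${table}\`"
  window="meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL ${hours} HOUR)"

  cat <<SQL
WITH starts AS (
  SELECT ${id_col}, MIN(event_timestamp) AS start_time
  FROM ${fq}
  WHERE event_name = '${start_event}'
    AND ${window}
  GROUP BY ${id_col}
),
ends AS (
  SELECT ${id_col}, MAX(event_timestamp) AS end_time
  FROM ${fq}
  WHERE event_name = '${end_event}'
    AND ${window}
  GROUP BY ${id_col}
)
SELECT
  COUNT(*) AS total_started,
  COUNTIF(e.end_time IS NOT NULL) AS matched,
  COUNTIF(e.end_time IS NULL) AS unmatched
FROM starts s
LEFT JOIN ends e USING (${id_col})
SQL
)
```

For example, `build_pairing_query web_ltx_backend_flow flow_id be_flow_started be_flow_ended` prints the basic backend flow pairing query. Processes that started near the end of the window may show as unmatched simply because their end event has not arrived yet, so a nonzero `unmatched` count is a prompt to investigate rather than proof of a tracking bug.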
+ +**Frontend process pairing:** + +```sql +WITH starts AS ( + SELECT process_id, MIN(event_timestamp) as start_time + FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process` + WHERE event_name = 'process_started' + AND meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + GROUP BY process_id +), +ends AS ( + SELECT process_id, MAX(event_timestamp) as end_time + FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process` + WHERE event_name = 'process_ended' + AND meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + GROUP BY process_id +) +SELECT + COUNT(*) as total_started, + COUNTIF(e.end_time IS NOT NULL) as matched, + COUNTIF(e.end_time IS NULL) as unmatched +FROM starts s +LEFT JOIN ends e USING (process_id) +``` + +**Backend flow pairing (be_flow_started ↔ be_flow_ended):** + +```sql +WITH starts AS ( + SELECT flow_id, flow_name, MIN(event_timestamp) as start_time + FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_backend_flow` + WHERE event_name = 'be_flow_started' + AND meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + GROUP BY flow_id, flow_name +), +ends AS ( + SELECT flow_id, result, MAX(event_timestamp) as end_time + FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_backend_flow` + WHERE event_name = 'be_flow_ended' + AND meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + GROUP BY flow_id, result +) +SELECT + s.flow_name, + COUNT(*) as total_started, + COUNTIF(e.end_time IS NOT NULL) as matched, + COUNTIF(e.end_time IS NULL) as unmatched, + COUNTIF(e.result = 'success') as successes, + COUNTIF(e.result = 'failure') as failures +FROM starts s +LEFT JOIN ends e USING (flow_id) +GROUP BY s.flow_name +``` + +**Backend process pairing (be_process_started ↔ be_process_ended):** + +```sql +WITH starts AS ( + SELECT process_id, process_type, MIN(event_timestamp) as start_time + FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_backend_process` + 
WHERE event_name = 'be_process_started' + AND meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + GROUP BY process_id, process_type +), +ends AS ( + SELECT process_id, result, MAX(event_timestamp) as end_time + FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_backend_process` + WHERE event_name = 'be_process_ended' + AND meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + GROUP BY process_id, result +) +SELECT + s.process_type, + COUNT(*) as total_started, + COUNTIF(e.end_time IS NOT NULL) as matched, + COUNTIF(e.end_time IS NULL) as unmatched, + COUNTIF(e.result = 'success') as successes, + COUNTIF(e.result = 'failure') as failures +FROM starts s +LEFT JOIN ends e USING (process_id) +GROUP BY s.process_type +``` + +**Backend token charge pairing (be_token_charge_started ↔ be_token_charge_ended):** + +```sql +WITH starts AS ( + SELECT payment_id, consumable_type, MIN(event_timestamp) as start_time + FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_backend_token_charge` + WHERE event_name = 'be_token_charge_started' + AND meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + GROUP BY payment_id, consumable_type +), +ends AS ( + SELECT payment_id, result, tokens_charged, MAX(event_timestamp) as end_time + FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_backend_token_charge` + WHERE event_name = 'be_token_charge_ended' + AND meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + GROUP BY payment_id, result, tokens_charged +) +SELECT + s.consumable_type, + COUNT(*) as total_started, + COUNTIF(e.end_time IS NOT NULL) as matched, + COUNTIF(e.end_time IS NULL) as unmatched, + COUNTIF(e.result = 'success') as successes, + COUNTIF(e.result = 'failure') as failures, + SUM(e.tokens_charged) as total_tokens_charged +FROM starts s +LEFT JOIN ends e USING (payment_id) +GROUP BY s.consumable_type +``` + +### 5. 
Cross-Table Linking (Frontend) + +Validate `source_interaction_id` links processes to interactions: + +```sql +SELECT + COUNT(*) as total_with_source_id, + COUNTIF(i.interaction_id IS NOT NULL) as linked, + COUNTIF(i.interaction_id IS NULL) as orphaned +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process` p +LEFT JOIN `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_user_interaction` i + ON p.source_interaction_id = i.interaction_id +WHERE p.source_interaction_id IS NOT NULL + AND p.meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) +``` + +### 5b. Cross-Table Linking (FE ↔ BE) + +Backend events link to frontend via specific correlation fields. See [BACKEND_EVENTS_WITH_FE_LINKING.md](resources/docs/analytics-global-web-v3/BACKEND_EVENTS_WITH_FE_LINKING.md) for the full linking model. + +**fe_process_id: FE process → BE flow** + +The `fe_process_id` on `backend_flow` references the `process_id` from `frontend_process` (the generate process that triggered the BE flow): + +```sql +SELECT + COUNT(*) as total_be_flows, + COUNTIF(bf.fe_process_id IS NOT NULL) as has_fe_link, + COUNTIF(bf.fe_process_id IS NOT NULL AND fp.process_id IS NOT NULL) as matched_to_fe, + COUNTIF(bf.fe_process_id IS NOT NULL AND fp.process_id IS NULL) as orphaned_fe_link +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_backend_flow` bf +LEFT JOIN `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process` fp + ON bf.fe_process_id = fp.process_id AND fp.event_name = 'process_started' +WHERE bf.event_name = 'be_flow_started' + AND bf.meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) +``` + +**generation_id: Correlates FE generate + FE fetch + BE flow** + +`generation_id` appears in `frontend_process` (both generate ended and fetch ended) and `backend_flow`. 
This links the full async chain: + +```sql +-- Check generation_id appears consistently across FE and BE +SELECT + 'fe_process (generate ended)' as source, + COUNT(DISTINCT generation_id) as distinct_generation_ids +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process` +WHERE event_name = 'process_ended' AND generation_id IS NOT NULL + AND meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) +UNION ALL +SELECT + 'be_flow' as source, + COUNT(DISTINCT generation_id) +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_backend_flow` +WHERE generation_id IS NOT NULL + AND meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) +``` + +**flow_id: BE flow → BE process + BE token_charge** + +Validate that backend processes and token charges reference valid parent flows: + +```sql +-- BE processes referencing valid flows +SELECT + COUNT(*) as total_be_processes, + COUNTIF(bp.flow_id IS NOT NULL) as has_flow_id, + COUNTIF(bp.flow_id IS NOT NULL AND bf.flow_id IS NOT NULL) as matched_to_flow, + COUNTIF(bp.flow_id IS NOT NULL AND bf.flow_id IS NULL) as orphaned_flow_id +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_backend_process` bp +LEFT JOIN `ltx-dwh-stg-raw.analytics_integration.web_ltx_backend_flow` bf + ON bp.flow_id = bf.flow_id AND bf.event_name = 'be_flow_started' +WHERE bp.event_name = 'be_process_started' + AND bp.meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) +``` + +```sql +-- Token charges referencing valid flows +SELECT + COUNT(*) as total_charges, + COUNTIF(tc.flow_id IS NOT NULL) as has_flow_id, + COUNTIF(tc.flow_id IS NOT NULL AND bf.flow_id IS NOT NULL) as matched_to_flow, + COUNTIF(tc.flow_id IS NOT NULL AND bf.flow_id IS NULL) as orphaned_flow_id +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_backend_token_charge` tc +LEFT JOIN `ltx-dwh-stg-raw.analytics_integration.web_ltx_backend_flow` bf + ON tc.flow_id = bf.flow_id AND bf.event_name = 'be_flow_started' +WHERE 
tc.event_name = 'be_token_charge_started' + AND tc.meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) +``` + +**task_id: BE process retry linking** + +Validate that retries share the same `task_id` (multiple `process_id`s with the same `task_id`): + +```sql +SELECT + task_id, + COUNT(DISTINCT process_id) as attempts, + COUNTIF(event_name = 'be_process_ended' AND result = 'success') as successes, + COUNTIF(event_name = 'be_process_ended' AND result = 'failure') as failures +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_backend_process` +WHERE task_id IS NOT NULL + AND meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) +GROUP BY task_id +HAVING COUNT(DISTINCT process_id) > 1 +ORDER BY attempts DESC +``` + +### 6. Presentation ID Linking + +Validate `presentation_id` links interactions to views. Each session should have multiple distinct presentation IDs, and interactions should reference their parent view's presentation_id. + +**Check presentation_id population by event type:** + +```sql +SELECT + source, + event_name, + COUNT(*) as total, + COUNTIF(presentation_id IS NOT NULL) as has_pres_id, + ROUND(COUNTIF(presentation_id IS NOT NULL) * 100.0 / COUNT(*), 1) as pct +FROM ( + SELECT 'view' as source, event_name, presentation_id + FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_view` + WHERE meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + UNION ALL + SELECT 'interaction', event_name, presentation_id + FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_user_interaction` + WHERE meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) +) +GROUP BY source, event_name +``` + +**Check distinct presentations per session (expect 3+ for full storyboard flow):** + +```sql +SELECT + working_session_id, + COUNT(*) as total_events, + COUNT(DISTINCT presentation_id) as distinct_presentations, + COUNTIF(presentation_id IS NULL) as missing_presentation_id +FROM ( + SELECT 
working_session_id, presentation_id + FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_view` + WHERE meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + UNION ALL + SELECT working_session_id, presentation_id + FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_user_interaction` + WHERE meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) +) +GROUP BY working_session_id +``` + +**Expected presentation flow for storyboard creation:** +1. Home page (entry point) +2. Script input view +3. Storyboard builder view +4. (Optional) Model selection menu +5. Storyboard editor (final result) + +### 7. Generation Attributes + +Generation processes (`process_type = 'generate'`) should carry key attributes that describe the generation request. These are critical for analytics on model usage, generation patterns, and cost attribution. + +**Required generation attributes by asset type:** + +| Asset Type | model_name | aspect_ratio | process_intent | video_duration_seconds | is_audio_enabled | +|-----------|-----------|-------------|---------------|----------------------|-----------------| +| video | REQUIRED | REQUIRED | RECOMMENDED | REQUIRED | REQUIRED | +| image | REQUIRED | REQUIRED | RECOMMENDED | n/a | n/a | +| audio | REQUIRED | n/a | RECOMMENDED | n/a | n/a | + +**Check generation attribute population rates:** + +```sql +SELECT + process_asset_type, + event_name, + COUNT(*) as total, + COUNTIF(model_name IS NOT NULL) as has_model_name, + COUNTIF(aspect_ratio IS NOT NULL) as has_aspect_ratio, + COUNTIF(process_intent IS NOT NULL) as has_process_intent, + COUNTIF(video_duration_seconds IS NOT NULL) as has_video_duration, + COUNTIF(is_audio_enabled IS NOT NULL) as has_is_audio_enabled, + ROUND(COUNTIF(model_name IS NOT NULL) * 100.0 / COUNT(*), 1) as model_name_pct, + ROUND(COUNTIF(aspect_ratio IS NOT NULL) * 100.0 / COUNT(*), 1) as aspect_ratio_pct, + ROUND(COUNTIF(process_intent IS NOT NULL) * 100.0 / COUNT(*), 
1) as process_intent_pct +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process` +WHERE meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + AND process_type = 'generate' +GROUP BY process_asset_type, event_name +ORDER BY process_asset_type, event_name +``` + +**Validate model_name values are consistent (not garbled or unexpected):** + +```sql +SELECT + model_name, + process_asset_type, + COUNT(*) as cnt +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process` +WHERE meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + AND process_type = 'generate' + AND model_name IS NOT NULL +GROUP BY model_name, process_asset_type +ORDER BY cnt DESC +``` + +**Validate aspect_ratio values are well-formed (e.g., "16:9", "1:1", "9:16"):** + +```sql +SELECT + aspect_ratio, + COUNT(*) as cnt, + CASE + WHEN REGEXP_CONTAINS(aspect_ratio, r'^\d+:\d+$') THEN 'VALID' + ELSE 'INVALID_FORMAT' + END as format_check +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process` +WHERE meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + AND process_type = 'generate' + AND aspect_ratio IS NOT NULL +GROUP BY aspect_ratio +ORDER BY cnt DESC +``` + +**Check process_intent distribution for generate processes:** + +```sql +SELECT + process_intent, + process_asset_type, + COUNT(*) as cnt +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process` +WHERE meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + AND process_type = 'generate' + AND process_intent IS NOT NULL +GROUP BY process_intent, process_asset_type +ORDER BY cnt DESC +``` + +**Cross-check FE model_name against BE model_name via generation_id:** + +```sql +-- Verify model_name consistency between FE and BE for the same generation +WITH fe_gen AS ( + SELECT + gen_id, + model_name as fe_model_name, + process_asset_type, + aspect_ratio as fe_aspect_ratio + FROM 
`ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process`, + UNNEST(generation_id) AS gen_id + WHERE event_name = 'process_started' + AND process_type = 'generate' + AND meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) +), +be_gen AS ( + SELECT + generation_id, + model_name as be_model_name, + aspect_ratio as be_aspect_ratio + FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_backend_flow` + WHERE event_name = 'be_flow_started' + AND generation_id IS NOT NULL + AND meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) +) +SELECT + fe.fe_model_name, + be.be_model_name, + CASE WHEN fe.fe_model_name = be.be_model_name THEN 'MATCH' ELSE 'MISMATCH' END as model_match, + fe.fe_aspect_ratio, + be.be_aspect_ratio, + CASE WHEN fe.fe_aspect_ratio = be.be_aspect_ratio THEN 'MATCH' ELSE 'MISMATCH' END as ar_match, + COUNT(*) as cnt +FROM fe_gen fe +JOIN be_gen be ON fe.gen_id = be.generation_id +GROUP BY fe_model_name, be_model_name, fe_aspect_ratio, be_aspect_ratio +ORDER BY cnt DESC +``` + +### 8. process_params Consistency + +`process_params` is an extensible `Array<{key, value}>` for event-specific metadata. Validate that generation processes include consistent, expected key-value sets and that keys follow conventions. + +**Expected process_params keys by generation type:** + +| process_type | process_asset_type | Expected Keys | Notes | +|-------------|-------------------|---------------|-------| +| generate | video | `generation_type`, `prompt`, `shot_type` | `generation_type` should be `t2v`, `i2v`, etc. 
| +| generate | image | `generation_type`, `element_type`, `element_name` | For element creation flows | +| generate | dubbed_video | `video_count`, `language_count`, `source_language`, `target_language` | For dubbing flows | +| export | project | `file_name` | Export filename | +| upload | video | (varies) | | +| download | * | `count` | Number of items downloaded | + +**NOTE:** `process_params` is for metadata that does NOT have a dedicated schema field. Do not duplicate values that belong in `model_name`, `aspect_ratio`, `asset_id`, or `process_intent`. + +**Audit all process_params keys in use:** + +```sql +SELECT + p.process_type, + p.process_asset_type, + pp.key, + COUNT(*) as occurrences, + COUNT(DISTINCT p.process_id) as distinct_processes, + APPROX_COUNT_DISTINCT(pp.value) as distinct_values, + ARRAY_AGG(DISTINCT pp.value LIMIT 5) as sample_values +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process` p, + UNNEST(process_params) AS pp +WHERE p.meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) +GROUP BY p.process_type, p.process_asset_type, pp.key +ORDER BY p.process_type, p.process_asset_type, occurrences DESC +``` + +**Check process_params population rate for generation processes:** + +```sql +SELECT + process_type, + process_asset_type, + event_name, + COUNT(*) as total, + COUNTIF(process_params IS NOT NULL AND ARRAY_LENGTH(process_params) > 0) as has_params, + ROUND(COUNTIF(process_params IS NOT NULL AND ARRAY_LENGTH(process_params) > 0) * 100.0 / COUNT(*), 1) as params_pct, + AVG(IF(process_params IS NOT NULL, ARRAY_LENGTH(process_params), 0)) as avg_param_count +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process` +WHERE meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + AND process_type = 'generate' +GROUP BY process_type, process_asset_type, event_name +ORDER BY process_type, process_asset_type +``` + +**Check that specific expected keys appear in generation processes:** + 
+```sql +-- Validate generation_type key exists in video generation process_params +SELECT + process_asset_type, + COUNT(*) as total_generate_processes, + COUNTIF( + EXISTS(SELECT 1 FROM UNNEST(process_params) pp WHERE pp.key = 'generation_type') + ) as has_generation_type, + COUNTIF( + EXISTS(SELECT 1 FROM UNNEST(process_params) pp WHERE pp.key = 'prompt') + ) as has_prompt, + COUNTIF( + EXISTS(SELECT 1 FROM UNNEST(process_params) pp WHERE pp.key = 'shot_type') + ) as has_shot_type +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process` +WHERE meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + AND process_type = 'generate' + AND event_name = 'process_started' +GROUP BY process_asset_type +``` + +**Detect duplicate keys within a single process_params array (data quality issue):** + +```sql +SELECT + p.process_id, + p.process_type, + pp.key, + COUNT(*) as key_count +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process` p, + UNNEST(process_params) AS pp +WHERE p.meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) +GROUP BY p.process_id, p.process_type, pp.key +HAVING COUNT(*) > 1 +``` + +**Detect process_params keys that duplicate dedicated schema fields (anti-pattern):** + +```sql +-- Keys like 'model_name', 'aspect_ratio', 'asset_id' should NOT appear in process_params +-- because these have dedicated schema fields +SELECT + pp.key as duplicated_key, + p.process_type, + COUNT(*) as occurrences +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process` p, + UNNEST(process_params) AS pp +WHERE p.meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + AND pp.key IN ('model_name', 'aspect_ratio', 'asset_id', 'process_intent', + 'video_duration_seconds', 'is_audio_enabled', 'generation_id', + 'process_type', 'process_asset_type', 'result') +GROUP BY pp.key, p.process_type +ORDER BY occurrences DESC +``` + +**Validate process_params key naming convention (must be 
snake_case):** + +```sql +SELECT + pp.key, + COUNT(*) as occurrences, + CASE + WHEN REGEXP_CONTAINS(pp.key, r'^[a-z][a-z0-9]*(_[a-z0-9]+)*$') THEN 'VALID_SNAKE_CASE' + ELSE 'INVALID_FORMAT' + END as format_check +FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process` p, + UNNEST(process_params) AS pp +WHERE p.meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) +GROUP BY pp.key +HAVING CASE + WHEN REGEXP_CONTAINS(pp.key, r'^[a-z][a-z0-9]*(_[a-z0-9]+)*$') THEN 'VALID_SNAKE_CASE' + ELSE 'INVALID_FORMAT' +END = 'INVALID_FORMAT' +``` + +**Consistency check: same process_type+asset_type should have the same set of keys:** + +```sql +-- Detect key set drift: are different instances of the same process_type sending different keys? +WITH process_keys AS ( + SELECT + p.process_id, + p.process_type, + p.process_asset_type, + ARRAY_AGG(pp.key ORDER BY pp.key) as key_set + FROM `ltx-dwh-stg-raw.analytics_integration.web_ltx_frontend_process` p, + UNNEST(process_params) AS pp + WHERE p.meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) + AND p.process_type = 'generate' + AND p.event_name = 'process_started' + GROUP BY p.process_id, p.process_type, p.process_asset_type +) +SELECT + process_type, + process_asset_type, + TO_JSON_STRING(key_set) as key_set_json, + COUNT(*) as occurrences +FROM process_keys +GROUP BY process_type, process_asset_type, key_set_json +ORDER BY process_type, process_asset_type, occurrences DESC +``` + +## Field Population EDA + +Check which optional fields are populated: + +```sql +SELECT + COUNT(*) as total, + COUNTIF(field_name IS NOT NULL) as populated, + ROUND(COUNTIF(field_name IS NOT NULL) * 100.0 / COUNT(*), 1) as pct +FROM table +WHERE meta_received_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR) +``` + +**Key optional fields to check (frontend):** +- process: `model_name`, `aspect_ratio`, `generation_id`, `asset_id`, `source_interaction_id` +- interaction: `ui_item_location`, 
`interaction_params`, `presentation_id` +- view: `source`, `source_interaction_id` +- flow: `source`, `parent_flow_id` + +**Key optional fields to check (backend):** +- backend_flow: `fe_process_id`, `generation_id`, `model_name`, `aspect_ratio`, `asset_ids`, `result`, `error_code`, `total_tasks`, `lt_id` +- backend_process: `flow_id`, `generation_id`, `model_name`, `task_id`, `task_name`, `step_index`, `asset_id`, `result`, `error_code`, `lt_id` +- backend_token_charge: `flow_id`, `tokens_full_quantity`, `batch_token_balance`, `result`, `error`, `lt_id` + +## Common Issues + +| Issue | Impact | How to Detect | +|-------|--------|---------------| +| Wrong event_name | Schema violation | Check against recommendedValues | +| Custom enum value | Analytics inconsistency | Value not in recommendedValues list | +| Missing source_interaction_id | Cannot trace process to trigger | Check population rate | +| Missing presentation_id on interactions | Cannot link interactions to views | Check population rate by event type | +| Unpaired events (FE or BE) | Incomplete lifecycle tracking | Compare start vs end counts | +| Missing fe_process_id on BE flow | Cannot link BE flow to FE trigger | Check be_flow fe_process_id population | +| Missing generation_id | Cannot correlate FE generate ↔ fetch ↔ BE | Check generation_id across tables | +| Orphaned BE process (no parent flow) | Cannot group process within a flow | Check flow_id on be_process links to valid be_flow | +| Orphaned token charge (no parent flow) | Cannot attribute charge to a flow | Check flow_id on token_charge links to valid be_flow | +| Unpaired token charges | Cannot track charge lifecycle | Compare be_token_charge_started vs ended by payment_id | +| Missing lt_id on BE events | Cannot attribute backend activity to user | Check lt_id population rate | +| Missing model_name on generate process | Cannot analyze model usage | Check model_name population rate for process_type=generate | +| Missing aspect_ratio on 
generate process | Cannot segment by aspect ratio | Check aspect_ratio population for video/image generates | +| Missing process_intent on generate process | Cannot classify generation intent | Check process_intent population rate | +| FE/BE model_name mismatch | Inconsistent model attribution | Cross-check via generation_id linkage | +| Missing process_params on generate | Cannot track generation context | Check process_params population rate for generates | +| Duplicate keys in process_params | Data quality issue, ambiguous values | Check for key count > 1 per process | +| process_params duplicates schema fields | Anti-pattern, data duplication | Check for keys like model_name, aspect_ratio in params | +| Inconsistent process_params keys | Key set drift across same process_type | Compare key sets per process_type+asset_type | +| Non-snake_case process_params keys | Naming convention violation | Regex check on param keys | + +## Test Files + +Pre-built SQL queries are available in `tests/sql/`: + +| File | Purpose | +|------|---------| +| 01_required_fields.sql | Non-null required fields | +| 02_event_names.sql | Event name validation | +| 03_recommended_values.sql | Enum compliance | +| 04-06_*_pairing.sql | ID pairing (flow, process, view) | +| 07-09_*_linking.sql | Cross-table linking | +| 12_process_fields_detailed.sql | Process field analysis | +| 13_comprehensive_eda.sql | Full EDA for all schemas | + +## Resources + +- Frontend schema files: `resources/schemas/analytics-global-web-v3/web_ltx_frontend_*.avsc` +- Backend schema files: `resources/schemas/analytics-global-web-v3/web_ltx_backend_*.avsc` +- Backend events & FE linking doc: `resources/docs/analytics-global-web-v3/BACKEND_EVENTS_WITH_FE_LINKING.md` +- Test results: `tests/TEST_RESULTS.md` +- Enum registry: `.cursor/skills/analytics-enum-values/ENUM_REGISTRY.md`
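
The pre-built queries listed under Test Files can be run in one pass with the same `bq query` form shown in Quick Start. The following is a minimal sketch that assumes each `.sql` file holds a single standalone query. `run_sql_tests` and the `BQ_CMD` override are hypothetical names introduced here for illustration.

```bash
# Sketch: run every .sql file in a directory through the bq CLI.
# run_sql_tests and BQ_CMD are hypothetical; BQ_CMD lets you substitute another
# command (for example "echo") to preview what would be executed.
run_sql_tests() (
  dir="${1:-tests/sql}"
  bq_cmd="${BQ_CMD:-bq}"
  status=0

  for f in "$dir"/*.sql; do
    [ -e "$f" ] || continue          # directory has no .sql files
    echo "== $f"
    # Same invocation form as Quick Start, with the file content as the query
    "$bq_cmd" query --use_legacy_sql=false "$(cat "$f")" || status=1
  done

  exit "$status"                     # nonzero if any query failed
)
```

Running `BQ_CMD=echo run_sql_tests tests/sql` prints each command instead of executing it, which is a cheap way to confirm the file list before spending query quota.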