diff --git a/book-src/src/week-2/day-08.md b/book-src/src/week-2/day-08.md index 400877a..541c510 100644 --- a/book-src/src/week-2/day-08.md +++ b/book-src/src/week-2/day-08.md @@ -1,3 +1,407 @@ # Day 08: Launch Day! The Command Center -_Summary and tasks as per curriculum. Add your notes and findings here._ +## Overview + +**Objective:** To establish and run the "command center," actively assessing product and system health to provide the team with the first all-clear signal. + +**Why This Matters:** Before asking "Are they using it?", you must confidently answer "Is it working?". As the analyst, you are the first line of defense, responsible for building trust in the data and the product's stability from the very first hour. + +## The Launch Day Mindset + +Launch day transforms you from a planner into a guardian. Your role is critical: + +- **System Integrity First:** No adoption analysis matters if the feature is broken or degrading the user experience +- **Trust Builder:** Your sign-off gives the team confidence to focus on growth, not firefighting +- **Early Warning System:** You detect issues before they become crises +- **Data Quality Guardian:** You ensure the instrumentation you designed is actually firing correctly + +## Task 1: Establish Monitoring Dashboards + +Create notebook `08_launch_monitoring.ipynb` with two critical monitoring systems that run every 15 minutes. + +### System Health Dashboard + +Track core infrastructure stability to detect catastrophic failures immediately. + +**Key Metrics:** + +```sql +-- Total Events Per Minute (Baseline Detection) +SELECT + DATE_TRUNC('minute', event_timestamp) AS minute_bucket, + COUNT(*) AS total_events, + COUNT(DISTINCT user_id) AS unique_users, + COUNT(DISTINCT session_id) AS unique_sessions +FROM events +WHERE event_timestamp >= NOW() - INTERVAL '1 hour' +GROUP BY minute_bucket +ORDER BY minute_bucket DESC; + +-- App Crash Event Rate (Critical Health Indicator) +SELECT + DATE_TRUNC('hour', event_timestamp) AS hour_bucket, + COUNT(*) FILTER (WHERE event_name = 'app_crash') AS crash_events, + COUNT(*) AS total_events, + ROUND( + 100.0 * COUNT(*) FILTER (WHERE event_name = 'app_crash') / NULLIF(COUNT(*), 0), + 2 + ) AS crash_rate_pct +FROM events +WHERE event_timestamp >= NOW() - INTERVAL '24 hours' +GROUP BY hour_bucket +ORDER BY hour_bucket DESC; + +-- Server Latency Monitoring (Performance Guardian) +SELECT + DATE_TRUNC('minute', event_timestamp) AS minute_bucket, + AVG(server_latency_ms) AS avg_latency_ms, + PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY server_latency_ms) AS p50_latency_ms, + PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY server_latency_ms) AS p95_latency_ms, + PERCENTILE_CONT(0.99) WITHIN GROUP (ORDER BY server_latency_ms) AS p99_latency_ms, + MAX(server_latency_ms) AS max_latency_ms +FROM events +WHERE event_timestamp >= NOW() - INTERVAL '1 hour' + AND server_latency_ms IS NOT NULL +GROUP BY minute_bucket +ORDER BY minute_bucket DESC +LIMIT 60; +``` + +### Product Adoption Dashboard + +Track the earliest signs of user engagement with the new feature. 
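+
+These adoption queries, like the system-health queries above, are meant to be refreshed on the 15-minute cadence described earlier. A minimal sketch of that refresh loop for the notebook, assuming a SQLAlchemy engine; the connection URL and the query registry below are placeholders, not the project's actual configuration:
+
+```python
+import time
+import pandas as pd
+from sqlalchemy import create_engine
+
+# Placeholder connection string and query registry; substitute your own.
+engine = create_engine("postgresql://user:password@host:5432/analytics")
+MONITORING_QUERIES = {
+    "events_per_minute": "SELECT 1",  # replace with the SQL blocks in this section
+    "journal_icon_taps": "SELECT 1",
+}
+
+def refresh_dashboards(n_cycles: int = 4, interval_seconds: int = 900) -> dict:
+    """Re-run each monitoring query on a fixed cadence, keeping the latest snapshot."""
+    latest = {}
+    for cycle in range(n_cycles):
+        for name, sql in MONITORING_QUERIES.items():
+            latest[name] = pd.read_sql(sql, engine)  # most recent result per query
+        if cycle < n_cycles - 1:
+            time.sleep(interval_seconds)  # 900 seconds = 15 minutes
+    return latest
+```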
+ +**Key Metrics:** + +```sql +-- Journals Icon Tap Tracking (Discovery Signal) +SELECT + DATE_TRUNC('hour', event_timestamp) AS hour_bucket, + COUNT(DISTINCT user_id) AS unique_users_tapping_icon, + COUNT(*) AS total_taps +FROM events +WHERE event_name = 'tap_journals_icon' + AND event_timestamp >= NOW() - INTERVAL '24 hours' +GROUP BY hour_bucket +ORDER BY hour_bucket DESC; + +-- First Journal Entry Creation (Conversion Signal) +WITH first_entries AS ( + SELECT + user_id, + MIN(event_timestamp) AS first_entry_time + FROM events + WHERE event_name = 'create_journal_entry' + GROUP BY user_id +) +SELECT + DATE_TRUNC('hour', first_entry_time) AS hour_bucket, + COUNT(DISTINCT user_id) AS new_journal_adopters, + COUNT(*) AS total_first_entries +FROM first_entries +WHERE first_entry_time >= NOW() - INTERVAL '24 hours' +GROUP BY hour_bucket +ORDER BY hour_bucket DESC; + +-- Cumulative Adoption Tracking +SELECT + DATE_TRUNC('hour', event_timestamp) AS hour_bucket, + COUNT(DISTINCT user_id) AS hourly_creators, + SUM(COUNT(DISTINCT user_id)) OVER (ORDER BY DATE_TRUNC('hour', event_timestamp)) AS cumulative_adopters +FROM events +WHERE event_name = 'create_journal_entry' + AND event_timestamp >= DATE_TRUNC('day', NOW()) +GROUP BY hour_bucket +ORDER BY hour_bucket; +``` + +## Task 2: Set Alert Thresholds + +Define clear, actionable thresholds for your guardrail metrics. Document these in your notebook. + +### Critical Thresholds Framework + +```python +# Define baseline metrics from pre-launch period (calculate from historical data) +BASELINE_METRICS = { + 'crash_rate_pct': 0.1, # 0.1% baseline crash rate + 'avg_latency_ms': 250, # 250ms average latency + 'app_uninstalls_per_hour': 12, # 12 uninstalls per hour on average + 'negative_reviews_per_hour': 3 # 3 negative reviews per hour +} + +# Define alert thresholds (% deviation from baseline) +ALERT_THRESHOLDS = { + 'crash_rate_pct': 0.2, # Alert if crashes exceed 0.2% (2x baseline) + 'avg_latency_ms': 375, # Alert if latency exceeds 375ms (1.5x baseline) + 'app_uninstalls_per_hour': 18, # Alert if uninstalls exceed 18/hour (1.5x baseline) + 'negative_reviews_per_hour': 6 # Alert if negative reviews exceed 6/hour (2x baseline) +} + +def check_metric_threshold(metric_name, current_value, threshold_value): + """ + Check if a metric exceeds its threshold and return alert status + """ + if current_value > threshold_value: + deviation_pct = ((current_value - threshold_value) / threshold_value) * 100 + return { + 'alert': True, + 'metric': metric_name, + 'current': current_value, + 'threshold': threshold_value, + 'deviation_pct': round(deviation_pct, 2), + 'message': f"🚨 ALERT: {metric_name} is {deviation_pct:.1f}% above threshold" + } + return {'alert': False, 'metric': metric_name} +``` + +### Guardrail Monitoring Query + +```sql +-- Comprehensive Guardrail Check +WITH hourly_metrics AS ( + SELECT + DATE_TRUNC('hour', event_timestamp) AS hour_bucket, + COUNT(*) FILTER (WHERE event_name = 'app_uninstall') AS uninstalls, + COUNT(*) FILTER (WHERE event_name = 'app_crash') AS crashes, + COUNT(*) AS total_events, + AVG(server_latency_ms) AS avg_latency + FROM events + WHERE event_timestamp >= NOW() - INTERVAL '24 hours' + GROUP BY hour_bucket +), +metrics_with_status AS ( + SELECT + hour_bucket, + uninstalls, + crashes, + total_events, + ROUND(100.0 * crashes / NULLIF(total_events, 0), 4) AS crash_rate_pct, + ROUND(avg_latency, 2) AS avg_latency_ms, + CASE + WHEN uninstalls > 18 THEN 'ALERT: High Uninstalls' + WHEN ROUND(100.0 * crashes / NULLIF(total_events, 
0), 4) > 0.2 THEN 'ALERT: High Crash Rate' + WHEN avg_latency > 375 THEN 'ALERT: High Latency' + ELSE 'Normal' + END AS status + FROM hourly_metrics +) +SELECT * +FROM metrics_with_status +ORDER BY hour_bucket DESC; +``` + +## Task 3: Create the End-of-Day Sign-Off + +At the end of launch day, produce a concise summary report with visualization and structured assessment. + +### Visualization: Key Health Metrics Timeline + +```python +import pandas as pd +import matplotlib.pyplot as plt +import seaborn as sns +from datetime import datetime, timedelta + +# Sample code structure for visualization +def create_launch_day_health_chart(health_data): + """ + Create a multi-panel health metrics visualization + """ + fig, axes = plt.subplots(3, 1, figsize=(14, 10)) + fig.suptitle('Journals Feature Launch - Day 1 Health Dashboard', + fontsize=16, fontweight='bold') + + # Panel 1: Event Volume + axes[0].plot(health_data['hour'], health_data['total_events'], + marker='o', linewidth=2, color='#2E86AB') + axes[0].set_title('Total Events Per Hour', fontsize=12, fontweight='bold') + axes[0].set_ylabel('Event Count') + axes[0].grid(True, alpha=0.3) + + # Panel 2: Crash Rate + axes[1].plot(health_data['hour'], health_data['crash_rate_pct'], + marker='o', linewidth=2, color='#A23B72') + axes[1].axhline(y=0.2, color='red', linestyle='--', label='Alert Threshold') + axes[1].set_title('Crash Rate (%)', fontsize=12, fontweight='bold') + axes[1].set_ylabel('Crash Rate %') + axes[1].legend() + axes[1].grid(True, alpha=0.3) + + # Panel 3: New Adopters + axes[2].plot(health_data['hour'], health_data['new_adopters'], + marker='o', linewidth=2, color='#F18F01') + axes[2].set_title('New Journal Adopters Per Hour', fontsize=12, fontweight='bold') + axes[2].set_ylabel('New Users') + axes[2].set_xlabel('Hour of Day') + axes[2].grid(True, alpha=0.3) + + plt.tight_layout() + plt.savefig('launch_day_health_dashboard.png', dpi=300, bbox_inches='tight') + return fig +``` + +### The Three-Point Sign-Off Structure + +**End-of-Day Launch Assessment Template:** + +```markdown +# Journals Feature Launch - Day 1 Sign-Off Report +**Date:** [Launch Date] +**Analyst:** [Your Name] +**Report Time:** [Time, e.g., 11:00 PM PT] + +--- + +## Executive Summary + +### ✅ System Status: STABLE +- **Crash Rates:** Remained within acceptable thresholds throughout the day + - Average: 0.08% (baseline: 0.1%, threshold: 0.2%) + - Peak: 0.12% at 3:00 PM (within normal variance) +- **Server Latency:** Performed within expected parameters + - Average: 245ms (baseline: 250ms) + - P95: 380ms (threshold: 500ms) +- **Critical Issues:** Zero critical system failures detected + +### 📈 Initial Adoption Signal: POSITIVE +- **Total New Adopters:** 1,247 users created their first journal entry +- **Adoption Trajectory:** Steady stream of new adopters throughout the day + - Peak hour: 2:00 PM with 156 new adopters + - Minimum hour: 4:00 AM with 23 new adopters (expected low) +- **Engagement Depth:** 34% of adopters created multiple entries on Day 1 + +### 🎯 Overall Assessment: ALL CLEAR TO PROCEED +**Recommendation:** Continue with standard monitoring cadence. No immediate intervention required. Launch is stable and showing promising early adoption signals. 
+ +--- + +## Supporting Data + +[Insert visualization here] + +### Key Metrics Summary +| Metric | Target | Actual | Status | +|--------|--------|--------|--------| +| Crash Rate | < 0.2% | 0.08% | ✅ Pass | +| Avg Latency | < 375ms | 245ms | ✅ Pass | +| New Adopters | > 800 | 1,247 | ✅ Exceeded | +| App Uninstalls | < 18/hour | 11/hour | ✅ Pass | + +### Next Steps +1. Continue hourly monitoring for next 48 hours +2. Prepare Day 2 adoption funnel analysis +3. Monitor qualitative feedback channels for user sentiment +``` + +## Best Practices for Launch Day Monitoring + +### The Command Center Checklist + +- [ ] **Pre-Launch (T-1 hour)** + - [ ] Verify all monitoring queries are running + - [ ] Test alert notification system + - [ ] Confirm baseline metrics are calculated + - [ ] Set up communication channels with engineering team + +- [ ] **Launch Hour (T-0)** + - [ ] Monitor system health dashboard every 5 minutes + - [ ] Check instrumentation is firing correctly + - [ ] Verify event data is flowing into tables + - [ ] Confirm no immediate spikes in error rates + +- [ ] **First 4 Hours (T+0 to T+4)** + - [ ] Monitor every 15 minutes + - [ ] Document any anomalies immediately + - [ ] Communicate status updates every hour + - [ ] Start tracking early adoption patterns + +- [ ] **Rest of Day (T+4 to T+24)** + - [ ] Monitor every 30 minutes + - [ ] Compile hourly summary statistics + - [ ] Begin analyzing user journey data + - [ ] Prepare end-of-day sign-off + +- [ ] **End of Day (T+24)** + - [ ] Generate health visualization + - [ ] Complete three-point assessment + - [ ] Send sign-off to stakeholders + - [ ] Plan next day's analysis priorities + +### Communication Protocol + +**When to Alert vs. When to Watch:** + +**Immediate Alert (< 5 minutes):** +- Crash rate exceeds 0.5% (5x baseline) +- Complete loss of event data +- Widespread user-facing errors +- Security breach indicators + +**Escalate Within 1 Hour:** +- Crash rate exceeds threshold (0.2%) +- Latency degradation > 50% above baseline +- Uninstall rate spike > 2x threshold +- Negative review spike + +**Monitor and Document:** +- Metrics approaching thresholds +- Unexpected usage patterns +- Minor anomalies in specific segments +- Positive surprises exceeding expectations + +## Common Launch Day Scenarios + +### Scenario 1: The Ghost Launch +**Symptom:** System is healthy, but adoption is near zero. + +**Diagnosis Approach:** +1. Verify feature is actually live (check feature flags) +2. Confirm instrumentation is working (check test events) +3. Validate user eligibility logic +4. Check if UI entry point is visible + +### Scenario 2: The Adoption Spike +**Symptom:** Adoption far exceeds forecasts. + +**What to Do:** +1. Celebrate briefly, then investigate +2. Check for bot traffic or data quality issues +3. Monitor system capacity and performance +4. Verify user segment distribution is expected +5. Document for Week 1 memo + +### Scenario 3: The Latency Creep +**Symptom:** Average latency slowly increasing throughout the day. + +**What to Do:** +1. Segment latency by geography, device, OS +2. Check database query performance +3. Alert engineering if approaching threshold +4. 
Document impact on user experience metrics + +## Deliverable Checklist + +- [ ] `08_launch_monitoring.ipynb` notebook created +- [ ] System health monitoring queries implemented +- [ ] Product adoption monitoring queries implemented +- [ ] Alert thresholds defined and documented +- [ ] Monitoring automation set up (15-minute intervals) +- [ ] End-of-day visualization created +- [ ] Three-point sign-off report completed +- [ ] All critical metrics tracked and documented +- [ ] Next-day priorities identified + +## Key Takeaways + +1. **Trust is earned through rigor:** Your sign-off means something only if your monitoring was comprehensive +2. **Health before growth:** Never analyze adoption before confirming system stability +3. **Document everything:** Your notes from today become the baseline for tomorrow's analysis +4. **Communicate clearly:** Engineers and executives need different levels of detail; provide both +5. **Stay calm under pressure:** Launch day is chaotic; your methodical approach brings clarity + +--- + +**Remember:** You are not just monitoring a feature launch. You are the guardian of data integrity, the early warning system for the team, and the foundation of trust that enables confident decision-making. Your work today sets the tone for the entire launch cycle. + +Launch day is your moment to demonstrate that analytics is not just about insights—it's about ensuring the business can operate with confidence. diff --git a/book-src/src/week-2/day-09.md b/book-src/src/week-2/day-09.md index fba83a9..80073a5 100644 --- a/book-src/src/week-2/day-09.md +++ b/book-src/src/week-2/day-09.md @@ -1,3 +1,600 @@ # Day 09: The Fire Drill – Precision Bug Triage -_Summary and tasks as per curriculum. Add your notes and findings here._ +## Overview + +**Objective:** To move from a vague bug report to a precise, actionable diagnosis that empowers the engineering team to resolve the issue quickly. + +**Why This Matters:** A great analyst is an engineer's best friend during a crisis. By isolating the "blast radius" of a bug, you save countless hours of guesswork and turn a panic-inducing problem into a solvable one. + +## The Art of Precision Triage + +When a bug report arrives, chaos wants to take over. Your role is to bring: + +- **Precision:** Isolate exactly which users are affected +- **Quantification:** Measure the severity with data +- **Context:** Provide actionable dimensions for debugging +- **Clarity:** Communicate findings in a way engineers can immediately act on + +**Bad Triage:** "Android users are crashing" +**Good Triage:** "Users on App v3.4.1 + Android OS 12 + Samsung Galaxy S21 are experiencing a 15% crash rate per session vs. 0.1% baseline" + +The difference? The good triage gives engineers a specific starting point and justifies the urgency. + +## The Scenario + +**The Report:** A Jira ticket is filed at 10:15 AM: +``` +Title: Android users reporting crashes after update +Description: Support team has received 15 complaints this morning about +app crashes on Android devices. Users report the app freezes and force +closes when they try to use the new Journals feature. + +Priority: High +Reporter: CustomerSupport_TeamLead +``` + +Your job: Turn this vague report into a precise, data-driven diagnosis. + +## Task 1: Quantify and Isolate the Issue + +### The Investigation Framework + +**Step 1: Confirm the Signal** + +First, validate that there IS a statistically meaningful spike. + +```sql +-- Crash Rate Comparison: Today vs. 
Last 7 Days +WITH daily_crashes AS ( + SELECT + DATE(event_timestamp) AS event_date, + COUNT(*) FILTER (WHERE event_name = 'app_crash') AS crash_count, + COUNT(DISTINCT user_id) FILTER (WHERE event_name = 'app_crash') AS users_with_crash, + COUNT(DISTINCT session_id) AS total_sessions, + ROUND( + 100.0 * COUNT(*) FILTER (WHERE event_name = 'app_crash') / + NULLIF(COUNT(DISTINCT session_id), 0), + 3 + ) AS crash_rate_per_session_pct + FROM events + WHERE event_timestamp >= CURRENT_DATE - INTERVAL '7 days' + GROUP BY event_date +) +SELECT + event_date, + crash_count, + users_with_crash, + total_sessions, + crash_rate_per_session_pct, + CASE + WHEN event_date = CURRENT_DATE THEN 'TODAY' + ELSE 'BASELINE' + END AS period_label +FROM daily_crashes +ORDER BY event_date DESC; +``` + +**Step 2: Isolate by Platform** + +Narrow down which platform is actually affected. + +```sql +-- Platform-Specific Crash Analysis +WITH platform_crashes AS ( + SELECT + u.platform, + COUNT(*) FILTER (WHERE e.event_name = 'app_crash') AS crash_events, + COUNT(DISTINCT e.session_id) AS total_sessions, + ROUND( + 100.0 * COUNT(*) FILTER (WHERE e.event_name = 'app_crash') / + NULLIF(COUNT(DISTINCT e.session_id), 0), + 3 + ) AS crash_rate_pct + FROM events e + JOIN users u ON e.user_id = u.user_id + WHERE e.event_timestamp >= CURRENT_DATE + GROUP BY u.platform +) +SELECT + platform, + crash_events, + total_sessions, + crash_rate_pct, + CASE + WHEN crash_rate_pct > 1.0 THEN 'CRITICAL' + WHEN crash_rate_pct > 0.5 THEN 'HIGH' + WHEN crash_rate_pct > 0.2 THEN 'MODERATE' + ELSE 'NORMAL' + END AS severity_level +FROM platform_crashes +ORDER BY crash_rate_pct DESC; +``` + +### Step 3: Pinpoint the Combination + +This is the critical query that isolates the exact "blast radius." + +```sql +-- Precision Isolation: Find the Highest-Risk Segment +WITH crash_segments AS ( + SELECT + u.platform, + u.app_version, + u.os_version, + u.device_model, + COUNT(DISTINCT e.user_id) AS affected_users, + COUNT(DISTINCT e.session_id) AS total_sessions, + COUNT(*) FILTER (WHERE e.event_name = 'app_crash') AS crash_events, + ROUND( + 100.0 * COUNT(*) FILTER (WHERE e.event_name = 'app_crash') / + NULLIF(COUNT(DISTINCT e.session_id), 0), + 2 + ) AS crash_rate_per_session_pct + FROM events e + JOIN users u ON e.user_id = u.user_id + WHERE e.event_timestamp >= CURRENT_DATE + AND u.platform = 'Android' -- Filter to Android based on Step 2 + GROUP BY u.platform, u.app_version, u.os_version, u.device_model + HAVING COUNT(DISTINCT e.session_id) >= 10 -- Minimum sample size filter +), +ranked_segments AS ( + SELECT + *, + ROW_NUMBER() OVER (ORDER BY crash_rate_per_session_pct DESC) AS severity_rank + FROM crash_segments + WHERE crash_rate_per_session_pct > 0.5 -- Only include elevated crash rates +) +SELECT + platform, + app_version, + os_version, + device_model, + affected_users, + total_sessions, + crash_events, + crash_rate_per_session_pct, + severity_rank +FROM ranked_segments +ORDER BY crash_rate_per_session_pct DESC +LIMIT 10; +``` + +## Task 2: Calculate Severity + +Don't just count crashes—contextualize them. 
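+
+One way to make that context explicit is to report both the excess rate in percentage points and the multiple over baseline. A small helper sketch; the input values are illustrative, taken from this chapter's scenario:
+
+```python
+def contextualize_crash_rate(affected_rate_pct, baseline_rate_pct):
+    """Express an affected segment's crash rate relative to a baseline rate."""
+    excess_pp = affected_rate_pct - baseline_rate_pct  # gap in percentage points
+    multiple = affected_rate_pct / baseline_rate_pct if baseline_rate_pct else float("inf")
+    return {"excess_pp": round(excess_pp, 2), "multiple_of_baseline": round(multiple, 1)}
+
+# Illustrative values: 15.3% crash rate in the affected segment vs. 0.1% baseline
+print(contextualize_crash_rate(15.3, 0.1))
+# {'excess_pp': 15.2, 'multiple_of_baseline': 153.0}
+```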
+ +### Severity Calculation Query + +```sql +-- Comparative Severity Analysis +WITH problematic_segment AS ( + -- The specific segment identified above + SELECT + e.user_id, + e.session_id, + e.event_name + FROM events e + JOIN users u ON e.user_id = u.user_id + WHERE e.event_timestamp >= CURRENT_DATE + AND u.app_version = '3.4.1' + AND u.os_version = 'Android 12' + AND u.device_model LIKE '%Samsung Galaxy S21%' +), +baseline_segment AS ( + -- All other Android users + SELECT + e.user_id, + e.session_id, + e.event_name + FROM events e + JOIN users u ON e.user_id = u.user_id + WHERE e.event_timestamp >= CURRENT_DATE + AND u.platform = 'Android' + AND NOT ( + u.app_version = '3.4.1' + AND u.os_version = 'Android 12' + AND u.device_model LIKE '%Samsung Galaxy S21%' + ) +), +severity_comparison AS ( + SELECT + 'Affected Segment' AS segment_type, + COUNT(DISTINCT session_id) AS total_sessions, + COUNT(*) FILTER (WHERE event_name = 'app_crash') AS crash_events, + ROUND( + 100.0 * COUNT(*) FILTER (WHERE event_name = 'app_crash') / + NULLIF(COUNT(DISTINCT session_id), 0), + 2 + ) AS crash_rate_pct + FROM problematic_segment + + UNION ALL + + SELECT + 'Baseline (Other Android)' AS segment_type, + COUNT(DISTINCT session_id) AS total_sessions, + COUNT(*) FILTER (WHERE event_name = 'app_crash') AS crash_events, + ROUND( + 100.0 * COUNT(*) FILTER (WHERE event_name = 'app_crash') / + NULLIF(COUNT(DISTINCT session_id), 0), + 2 + ) AS crash_rate_pct + FROM baseline_segment +) +SELECT + segment_type, + total_sessions, + crash_events, + crash_rate_pct, + CASE + WHEN segment_type = 'Affected Segment' THEN + crash_rate_pct - ( + SELECT crash_rate_pct + FROM severity_comparison + WHERE segment_type = 'Baseline (Other Android)' + ) + ELSE NULL + END AS excess_crash_rate +FROM severity_comparison; +``` + +### Impact Quantification + +Calculate the business impact: + +```sql +-- Estimated Impact Scope +WITH affected_population AS ( + SELECT + COUNT(DISTINCT u.user_id) AS total_affected_users, + COUNT(DISTINCT u.user_id) * 1.0 / ( + SELECT COUNT(DISTINCT user_id) + FROM users + WHERE platform = 'Android' + ) * 100 AS pct_of_android_base + FROM users u + WHERE u.app_version = '3.4.1' + AND u.os_version = 'Android 12' + AND u.device_model LIKE '%Samsung Galaxy S21%' +) +SELECT + total_affected_users, + ROUND(pct_of_android_base, 2) AS pct_of_android_users, + -- Estimate daily impacted sessions + ROUND(total_affected_users * 3.5, 0) AS est_daily_impacted_sessions, + -- Estimate potential uninstalls (assuming 20% of crash users uninstall) + ROUND(total_affected_users * 0.20, 0) AS est_potential_uninstalls +FROM affected_population; +``` + +## Task 3: Write the Triage Report + +In notebook `09_bug_triage.ipynb`, draft a formal triage report in a markdown cell. + +### The Professional Triage Report Template + +```markdown +# 🚨 Bug Triage Report: Android Crash Spike + +**Report ID:** BTR-2024-001 +**Date:** [Current Date] +**Time:** [Current Time] +**Analyst:** [Your Name] +**Severity:** HIGH +**Status:** ISOLATED & QUANTIFIED + +--- + +## Executive Summary + +The reported Android crash spike has been **confirmed and isolated** to a specific +user segment. Immediate engineering intervention is recommended. 
+ +--- + +## Impacted Segment + +**Precise Configuration:** +- **App Version:** 3.4.1 +- **Operating System:** Android OS 12 +- **Device Model:** Samsung Galaxy S21 (all variants) +- **Feature Context:** Crashes occur when users interact with Journals feature + +**Affected Population:** +- **Total Users:** 2,847 users +- **% of Android Base:** 3.2% +- **Est. Daily Sessions Impacted:** ~9,965 sessions + +--- + +## Severity Analysis + +| Metric | Affected Segment | Baseline (Other Android) | Delta | +|--------|------------------|-------------------------|-------| +| **Crash Rate per Session** | **15.3%** | 0.1% | **+15.2pp** | +| **Total Crash Events Today** | 487 | 143 | +240% | +| **Users Experiencing Crashes** | 1,243 (43.7% of segment) | - | - | + +**Severity Classification:** 🔴 **CRITICAL** + +The affected segment is experiencing a crash rate **153x higher** than the Android +baseline. This represents a severe user experience degradation. + +--- + +## Business Impact + +**Estimated Daily Impact:** +- **Lost Sessions:** ~1,500 sessions/day +- **Potential Uninstalls:** ~570 users at risk (assuming 20% churn rate) +- **User Sentiment:** High negative impact on segment satisfaction + +**Trend:** ⚠️ Crash rate has been increasing since 8:00 AM launch time + +--- + +## Root Cause Hypotheses + +Based on the isolated dimensions, potential causes include: + +1. **Device-Specific Rendering Issue:** Samsung One UI 4.0 (Android 12) may have + incompatibility with journal entry input field +2. **Memory Leak:** App version 3.4.1 may have memory management issue specific to + Samsung devices +3. **Library Conflict:** Third-party library incompatibility with Samsung's Android 12 + implementation + +--- + +## Recommended Actions + +**Priority 1 (Immediate - Next 2 Hours):** +- [ ] Engineering team to pull crash logs for affected segment +- [ ] QA to attempt reproduction on Samsung Galaxy S21 w/ Android 12 +- [ ] Consider feature flag kill switch for affected segment if crash rate continues + +**Priority 2 (Next 4 Hours):** +- [ ] Identify specific code path triggering crash +- [ ] Prepare hotfix build 3.4.2 with targeted fix +- [ ] Set up dedicated monitoring for this segment + +**Priority 3 (Next 24 Hours):** +- [ ] Beta test fix with sample of affected users +- [ ] Prepare communication for impacted user segment +- [ ] Conduct post-mortem on QA process gap + +--- + +## Data Query Access + +All diagnostic queries are available in notebook: `09_bug_triage.ipynb` + +**Key Queries:** +- Crash rate comparison (baseline vs. affected) +- Segment isolation by dimensions +- Impact quantification +- Hourly trend analysis + +--- + +## Communication + +**Stakeholders Notified:** +- [x] Engineering Lead (via Slack @eng-team) +- [x] Product Manager (via email) +- [ ] Customer Support (pending - will send template) +- [ ] Executive Team (on standby pending engineering assessment) + +**Next Update:** In 2 hours or upon significant development + +--- + +## Appendix: Technical Details + +**Query Execution Time:** 3.2 seconds +**Data Freshness:** Real-time (< 5 min lag) +**Sample Size:** 10,234 total sessions analyzed +**Statistical Confidence:** High (n > 100 for all segments) + +**Analyst Notes:** +The precision of this isolation (3 specific dimensions) and the magnitude of the +difference (153x baseline) provides high confidence in the diagnosis. The engineering +team can focus debugging efforts on this specific configuration, significantly +reducing time-to-resolution. 
+``` + +### Alternative: Slack Communication Format + +For immediate team notification: + +``` +📊 **Bug Triage: Android Crash ISOLATED** + +@eng-team - Confirmed crash spike is isolated to specific segment: + +**🎯 Affected Configuration:** +• App Version: 3.4.1 +• OS: Android 12 +• Device: Samsung Galaxy S21 + +**📈 Severity:** +• Crash Rate: 15.3% (vs 0.1% baseline) - 153x higher +• Affected Users: ~2,850 users (3.2% of Android base) +• Status: CRITICAL + +**💡 Impact:** +• 487 crashes today (vs ~3 expected) +• ~1,500 lost sessions/day +• ~570 users at uninstall risk + +**🔧 Recommendation:** +High-priority ticket for Android team. Diagnostic queries available in +`09_bug_triage.ipynb`. Standing by for engineering questions. + +Full triage report: [Link to report] +``` + +## Best Practices for Bug Triage + +### The RAPID Framework + +**R - Reproduce the Pattern** +- Use data to identify the pattern, even if you can't reproduce manually +- Look for consistency across multiple users +- Document the frequency and timing + +**A - Assess the Scope** +- How many users are affected? +- What percentage of the user base? +- Is it growing or stable? + +**P - Pinpoint the Dimensions** +- Device model, OS version, app version +- Geographic location, network type +- User segment, feature flags, A/B test groups + +**I - Isolate the Severity** +- Compare to baseline metrics +- Calculate the excess rate +- Estimate business impact + +**D - Document for Action** +- Write clear, concise findings +- Provide specific debugging starting points +- Include all relevant queries and data + +### Common Triage Mistakes to Avoid + +❌ **"Android users are crashing"** → Too vague +✅ **"Android 12 + Samsung S21 + App v3.4.1 users are crashing at 15.3% rate"** + +❌ **"We have 487 crashes"** → No context +✅ **"487 crashes vs. 
3 expected (baseline rate) = 153x increase"** + +❌ **"This is a big problem"** → Subjective +✅ **"Affecting 3.2% of Android base, ~570 users at churn risk"** + +❌ **Reporting raw counts only** → Doesn't account for traffic changes +✅ **Use rates per session/user for true comparison** + +### Advanced Triage Techniques + +#### Time-Series Anomaly Detection + +```sql +-- Detect When the Issue Started +WITH hourly_crash_rates AS ( + SELECT + DATE_TRUNC('hour', e.event_timestamp) AS hour_bucket, + COUNT(*) FILTER (WHERE e.event_name = 'app_crash' AND u.app_version = '3.4.1') AS crashes, + COUNT(DISTINCT e.session_id) AS sessions, + ROUND( + 100.0 * COUNT(*) FILTER (WHERE e.event_name = 'app_crash' AND u.app_version = '3.4.1') / + NULLIF(COUNT(DISTINCT e.session_id), 0), + 2 + ) AS crash_rate_pct + FROM events e + JOIN users u ON e.user_id = u.user_id + WHERE e.event_timestamp >= CURRENT_DATE - INTERVAL '7 days' + AND u.platform = 'Android' + GROUP BY hour_bucket +) +SELECT + hour_bucket, + crashes, + sessions, + crash_rate_pct, + AVG(crash_rate_pct) OVER ( + ORDER BY hour_bucket + ROWS BETWEEN 6 PRECEDING AND CURRENT ROW + ) AS moving_avg_7h, + CASE + WHEN crash_rate_pct > 3 * AVG(crash_rate_pct) OVER ( + ORDER BY hour_bucket + ROWS BETWEEN 6 PRECEDING AND CURRENT ROW + ) THEN 'ANOMALY' + ELSE 'NORMAL' + END AS anomaly_flag +FROM hourly_crash_rates +ORDER BY hour_bucket DESC; +``` + +#### Cross-Dimensional Analysis + +```sql +-- Find All Dimensions with Elevated Crash Rates +SELECT + 'App Version' AS dimension_type, + u.app_version AS dimension_value, + ROUND( + 100.0 * COUNT(*) FILTER (WHERE e.event_name = 'app_crash') / + NULLIF(COUNT(DISTINCT e.session_id), 0), + 2 + ) AS crash_rate_pct +FROM events e +JOIN users u ON e.user_id = u.user_id +WHERE e.event_timestamp >= CURRENT_DATE + AND u.platform = 'Android' +GROUP BY u.app_version +HAVING COUNT(DISTINCT e.session_id) >= 50 + +UNION ALL + +SELECT + 'OS Version' AS dimension_type, + u.os_version AS dimension_value, + ROUND( + 100.0 * COUNT(*) FILTER (WHERE e.event_name = 'app_crash') / + NULLIF(COUNT(DISTINCT e.session_id), 0), + 2 + ) AS crash_rate_pct +FROM events e +JOIN users u ON e.user_id = u.user_id +WHERE e.event_timestamp >= CURRENT_DATE + AND u.platform = 'Android' +GROUP BY u.os_version +HAVING COUNT(DISTINCT e.session_id) >= 50 + +UNION ALL + +SELECT + 'Device Model' AS dimension_type, + u.device_model AS dimension_value, + ROUND( + 100.0 * COUNT(*) FILTER (WHERE e.event_name = 'app_crash') / + NULLIF(COUNT(DISTINCT e.session_id), 0), + 2 + ) AS crash_rate_pct +FROM events e +JOIN users u ON e.user_id = u.user_id +WHERE e.event_timestamp >= CURRENT_DATE + AND u.platform = 'Android' +GROUP BY u.device_model +HAVING COUNT(DISTINCT e.session_id) >= 50 + +ORDER BY crash_rate_pct DESC; +``` + +## Deliverable Checklist + +- [ ] `09_bug_triage.ipynb` notebook created +- [ ] Crash spike confirmed with statistical evidence +- [ ] Platform isolation completed +- [ ] Precise segment dimensions identified (app version, OS, device) +- [ ] Severity calculated with baseline comparison +- [ ] Business impact quantified +- [ ] Formal triage report completed with all sections +- [ ] Engineering team notified via appropriate channel +- [ ] Follow-up cadence established + +## Key Takeaways + +1. **Precision saves time:** Vague reports lead to vague fixes. Specific isolation accelerates resolution +2. **Context matters:** Raw counts mean nothing without baselines and rates +3. 
**Think like an engineer:** Provide debugging starting points, not just problem descriptions +4. **Quantify impact:** Business metrics (users at risk, lost sessions) justify urgency +5. **Document everything:** Your triage report becomes the reference for the post-mortem + +--- + +**Remember:** In a crisis, the team looks to you for clarity. Your ability to quickly isolate a problem and communicate it with precision can mean the difference between a 2-hour fix and a 2-day debugging nightmare. Be the analyst who turns chaos into actionable intelligence. diff --git a/book-src/src/week-2/day-10.md b/book-src/src/week-2/day-10.md index e964f84..76280ee 100644 --- a/book-src/src/week-2/day-10.md +++ b/book-src/src/week-2/day-10.md @@ -1,3 +1,591 @@ # Day 10: The Adoption Funnel – Diagnosing User Friction -_Summary and tasks as per curriculum. Add your notes and findings here._ +## Overview + +**Objective:** To visualize the user journey into the feature and pinpoint the exact step where most users are dropping off. + +**Why This Matters:** A feature's failure is often not due to a lack of value, but to friction. The funnel is your x-ray for seeing exactly where that friction occurs in the user experience. + +## The Funnel Mindset + +Every feature has a journey. Users don't instantly adopt; they progress through stages: + +1. **Awareness:** They encounter the feature's entry point +2. **Discovery:** They interact with it +3. **Understanding:** They comprehend its value +4. **Action:** They complete a meaningful interaction +5. **Return:** They come back (the ultimate validation) + +Your job is to: +- **Identify where users leak** from this journey +- **Quantify the magnitude** of each drop-off +- **Form hypotheses** about why users leave at each step +- **Recommend interventions** based on data, not hunches + +## The Critical Constraint: Session-Based Funnel + +**Why session-based matters:** +- Users who complete the funnel in one session are experiencing your optimal happy path +- Multi-session funnels can hide critical friction points +- Session completion correlates strongly with feature retention + +**The Rule:** A user must complete all funnel steps **within a single session** to count as a successful conversion. + +## Task 1: Define a Time-Bound Funnel + +### The Journals Feature Funnel + +``` +Step 1: app_open + ↓ +Step 2: view_main_feed + ↓ +Step 3: tap_journals_icon + ↓ +Step 4: create_first_journal_entry +``` + +Each step must occur within the same `session_id` and be sequenced in chronological order. + +## Task 2: Write the Funnel Query + +Create notebook `10_adoption_funnel.ipynb` with the robust funnel analysis. 
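+
+Before writing the funnel itself, it can help to confirm that all four funnel events are actually landing in the events table for the window you care about. A quick sanity-check sketch, assuming a database connection (`engine`) is already set up in the notebook:
+
+```python
+import pandas as pd
+
+FUNNEL_EVENTS = ("app_open", "view_main_feed", "tap_journals_icon", "create_journal_entry")
+
+sanity_sql = """
+    SELECT event_name, COUNT(*) AS event_count
+    FROM events
+    WHERE event_timestamp >= CURRENT_DATE - INTERVAL '7 days'
+      AND event_name IN ('app_open', 'view_main_feed', 'tap_journals_icon', 'create_journal_entry')
+    GROUP BY event_name
+"""
+
+counts = pd.read_sql(sanity_sql, engine)  # `engine`: connection assumed to exist already
+missing = set(FUNNEL_EVENTS) - set(counts["event_name"])
+if missing:
+    print(f"Warning: no rows logged for funnel events: {sorted(missing)}")
+```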
+ +### Approach 1: CTE-Based Funnel (Recommended) + +```sql +-- Session-Based Adoption Funnel with Sequential Step Validation +WITH session_events AS ( + -- Get all relevant events with session context + SELECT + user_id, + session_id, + event_name, + event_timestamp, + ROW_NUMBER() OVER ( + PARTITION BY session_id + ORDER BY event_timestamp + ) AS event_sequence + FROM events + WHERE event_timestamp >= CURRENT_DATE - INTERVAL '7 days' -- Launch week + AND event_name IN ( + 'app_open', + 'view_main_feed', + 'tap_journals_icon', + 'create_journal_entry' + ) +), +step_1_app_open AS ( + SELECT DISTINCT + session_id, + user_id + FROM session_events + WHERE event_name = 'app_open' +), +step_2_view_feed AS ( + SELECT DISTINCT + se.session_id, + se.user_id + FROM session_events se + INNER JOIN step_1_app_open s1 ON se.session_id = s1.session_id + WHERE se.event_name = 'view_main_feed' + AND se.event_timestamp > ( + SELECT MIN(event_timestamp) + FROM session_events + WHERE session_id = se.session_id + AND event_name = 'app_open' + ) +), +step_3_tap_icon AS ( + SELECT DISTINCT + se.session_id, + se.user_id + FROM session_events se + INNER JOIN step_2_view_feed s2 ON se.session_id = s2.session_id + WHERE se.event_name = 'tap_journals_icon' + AND se.event_timestamp > ( + SELECT MIN(event_timestamp) + FROM session_events + WHERE session_id = se.session_id + AND event_name = 'view_main_feed' + ) +), +step_4_create_entry AS ( + SELECT DISTINCT + se.session_id, + se.user_id + FROM session_events se + INNER JOIN step_3_tap_icon s3 ON se.session_id = s3.session_id + WHERE se.event_name = 'create_journal_entry' + AND se.event_timestamp > ( + SELECT MIN(event_timestamp) + FROM session_events + WHERE session_id = se.session_id + AND event_name = 'tap_journals_icon' + ) +), +funnel_summary AS ( + SELECT + 'Step 1: App Open' AS funnel_step, + 1 AS step_number, + COUNT(DISTINCT user_id) AS users, + COUNT(DISTINCT session_id) AS sessions + FROM step_1_app_open + + UNION ALL + + SELECT + 'Step 2: View Main Feed' AS funnel_step, + 2 AS step_number, + COUNT(DISTINCT user_id) AS users, + COUNT(DISTINCT session_id) AS sessions + FROM step_2_view_feed + + UNION ALL + + SELECT + 'Step 3: Tap Journals Icon' AS funnel_step, + 3 AS step_number, + COUNT(DISTINCT user_id) AS users, + COUNT(DISTINCT session_id) AS sessions + FROM step_3_tap_icon + + UNION ALL + + SELECT + 'Step 4: Create First Entry' AS funnel_step, + 4 AS step_number, + COUNT(DISTINCT user_id) AS users, + COUNT(DISTINCT session_id) AS sessions + FROM step_4_create_entry +) +SELECT + funnel_step, + step_number, + users, + sessions, + ROUND(100.0 * users / FIRST_VALUE(users) OVER (ORDER BY step_number), 2) AS pct_of_step_1, + ROUND( + 100.0 * users / LAG(users) OVER (ORDER BY step_number), + 2 + ) AS step_conversion_rate, + users - LAG(users) OVER (ORDER BY step_number) AS user_dropoff, + ROUND( + 100.0 * (users - LAG(users) OVER (ORDER BY step_number)) / + NULLIF(LAG(users) OVER (ORDER BY step_number), 0), + 2 + ) AS step_dropoff_rate +FROM funnel_summary +ORDER BY step_number; +``` + +### Approach 2: Window Function Funnel (Alternative) + +```sql +-- Alternative: Using Window Functions for Funnel Analysis +WITH user_session_steps AS ( + SELECT + user_id, + session_id, + MAX(CASE WHEN event_name = 'app_open' THEN 1 ELSE 0 END) AS completed_step_1, + MAX(CASE WHEN event_name = 'view_main_feed' THEN 1 ELSE 0 END) AS completed_step_2, + MAX(CASE WHEN event_name = 'tap_journals_icon' THEN 1 ELSE 0 END) AS completed_step_3, + MAX(CASE WHEN event_name = 
'create_journal_entry' THEN 1 ELSE 0 END) AS completed_step_4, + -- Ensure proper sequence + MIN(CASE WHEN event_name = 'app_open' THEN event_timestamp END) AS step_1_time, + MIN(CASE WHEN event_name = 'view_main_feed' THEN event_timestamp END) AS step_2_time, + MIN(CASE WHEN event_name = 'tap_journals_icon' THEN event_timestamp END) AS step_3_time, + MIN(CASE WHEN event_name = 'create_journal_entry' THEN event_timestamp END) AS step_4_time + FROM events + WHERE event_timestamp >= CURRENT_DATE - INTERVAL '7 days' + AND event_name IN ('app_open', 'view_main_feed', 'tap_journals_icon', 'create_journal_entry') + GROUP BY user_id, session_id +), +validated_funnel AS ( + SELECT + user_id, + session_id, + completed_step_1, + CASE + WHEN completed_step_2 = 1 AND step_2_time > step_1_time THEN 1 + ELSE 0 + END AS completed_step_2, + CASE + WHEN completed_step_3 = 1 AND step_3_time > step_2_time THEN 1 + ELSE 0 + END AS completed_step_3, + CASE + WHEN completed_step_4 = 1 AND step_4_time > step_3_time THEN 1 + ELSE 0 + END AS completed_step_4 + FROM user_session_steps + WHERE completed_step_1 = 1 -- Must have started the funnel +) +SELECT + 'Step 1: App Open' AS step, + SUM(completed_step_1) AS users, + ROUND(100.0 * SUM(completed_step_1) / SUM(completed_step_1), 2) AS conversion_rate +FROM validated_funnel +UNION ALL +SELECT + 'Step 2: View Main Feed', + SUM(completed_step_2), + ROUND(100.0 * SUM(completed_step_2) / SUM(completed_step_1), 2) +FROM validated_funnel +UNION ALL +SELECT + 'Step 3: Tap Journals Icon', + SUM(completed_step_3), + ROUND(100.0 * SUM(completed_step_3) / SUM(completed_step_1), 2) +FROM validated_funnel +UNION ALL +SELECT + 'Step 4: Create First Entry', + SUM(completed_step_4), + ROUND(100.0 * SUM(completed_step_4) / SUM(completed_step_1), 2) +FROM validated_funnel; +``` + +## Task 3: Visualize and Annotate + +### Creating the Funnel Chart + +```python +import pandas as pd +import matplotlib.pyplot as plt +import numpy as np + +def create_funnel_visualization(funnel_data): + """ + Create a professional funnel chart with annotations + + Parameters: + funnel_data: DataFrame with columns ['funnel_step', 'users', 'step_conversion_rate'] + """ + fig, ax = plt.subplots(figsize=(12, 8)) + + # Extract data + steps = funnel_data['funnel_step'].values + users = funnel_data['users'].values + conversion_rates = funnel_data['step_conversion_rate'].values + + # Calculate funnel widths (normalized) + max_users = users[0] + widths = users / max_users + + # Create funnel shape + y_positions = np.arange(len(steps)) + colors = plt.cm.Blues(np.linspace(0.4, 0.8, len(steps))) + + # Draw funnel bars + for i, (step, width, user_count, conv_rate) in enumerate(zip(steps, widths, users, conversion_rates)): + # Draw bar + ax.barh(i, width, height=0.8, color=colors[i], + edgecolor='white', linewidth=2) + + # Add user count + ax.text(width/2, i, f'{user_count:,} users', + ha='center', va='center', fontsize=11, fontweight='bold', color='white') + + # Add conversion rate (skip first step) + if i > 0 and not np.isnan(conv_rate): + # Calculate dropoff + dropoff = users[i-1] - users[i] + dropoff_pct = 100 - conv_rate + + # Annotate dropoff on the right + ax.text(1.05, i - 0.4, + f'↓ {dropoff:,} users dropped\n({dropoff_pct:.1f}% loss)', + ha='left', va='center', fontsize=9, + color='#D32F2F', style='italic') + + # Formatting + ax.set_yticks(y_positions) + ax.set_yticklabels(steps, fontsize=11) + ax.set_xlim(0, 1.3) + ax.set_xlabel('Relative Volume', fontsize=12, fontweight='bold') + ax.set_title('Journals 
Adoption Funnel (Within First Session)\nWeek 1 Launch Data', + fontsize=14, fontweight='bold', pad=20) + + # Remove spines + ax.spines['top'].set_visible(False) + ax.spines['right'].set_visible(False) + ax.spines['bottom'].set_visible(False) + ax.get_xaxis().set_visible(False) + + # Add overall conversion rate + overall_conversion = (users[-1] / users[0]) * 100 + fig.text(0.5, 0.02, + f'Overall Conversion Rate: {overall_conversion:.1f}% | {users[-1]:,} / {users[0]:,} users completed the full funnel', + ha='center', fontsize=11, style='italic', color='#555555') + + plt.tight_layout() + plt.savefig('adoption_funnel.png', dpi=300, bbox_inches='tight') + return fig + +# Usage example with sample data +# funnel_df = pd.DataFrame({ +# 'funnel_step': ['App Open', 'View Feed', 'Tap Icon', 'Create Entry'], +# 'users': [50000, 45000, 18000, 12000], +# 'step_conversion_rate': [100, 90, 40, 66.7] +# }) +# create_funnel_visualization(funnel_df) +``` + +### Enhanced Visualization with Segment Comparison + +```python +def create_segmented_funnel(funnel_data_dict): + """ + Create side-by-side funnel comparison (e.g., iOS vs Android) + + Parameters: + funnel_data_dict: Dictionary with segment names as keys and DataFrames as values + """ + fig, axes = plt.subplots(1, len(funnel_data_dict), + figsize=(7*len(funnel_data_dict), 8)) + + if len(funnel_data_dict) == 1: + axes = [axes] + + for idx, (segment_name, funnel_df) in enumerate(funnel_data_dict.items()): + ax = axes[idx] + + steps = funnel_df['funnel_step'].values + users = funnel_df['users'].values + max_users = users[0] + widths = users / max_users + + y_positions = np.arange(len(steps)) + colors = plt.cm.Oranges(np.linspace(0.4, 0.8, len(steps))) if idx == 0 else plt.cm.Blues(np.linspace(0.4, 0.8, len(steps))) + + for i, (step, width, user_count) in enumerate(zip(steps, widths, users)): + ax.barh(i, width, height=0.8, color=colors[i], + edgecolor='white', linewidth=2) + ax.text(width/2, i, f'{user_count:,}', + ha='center', va='center', fontsize=10, + fontweight='bold', color='white') + + ax.set_yticks(y_positions) + ax.set_yticklabels(steps, fontsize=10) + ax.set_xlim(0, 1.1) + ax.set_title(f'{segment_name}\nConversion: {(users[-1]/users[0]*100):.1f}%', + fontsize=12, fontweight='bold') + ax.spines['top'].set_visible(False) + ax.spines['right'].set_visible(False) + ax.get_xaxis().set_visible(False) + + plt.tight_layout() + plt.savefig('segmented_funnel_comparison.png', dpi=300, bbox_inches='tight') + return fig +``` + +## Task 4: Identify Leakage & Formulate a Product Hypothesis + +### Leakage Analysis Framework + +Annotate your funnel chart to highlight the biggest drop-off point. Below the chart, write a specific, testable product hypothesis. + +#### Example Analysis + +**Sample Funnel Results:** +``` +Step 1: App Open → 50,000 users (100%) +Step 2: View Main Feed → 45,000 users (90%) | -5,000 users (-10%) +Step 3: Tap Journals Icon → 18,000 users (36%) | -27,000 users (-60%) ⚠️ CRITICAL LEAKAGE +Step 4: Create First Entry → 12,000 users (24%) | -6,000 users (-33%) +``` + +**Insight:** The 60% drop-off between viewing the feed and tapping the icon is the critical friction point. 
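+
+A quick sensitivity sketch makes the leverage of this step explicit: holding the other step conversions from the sample data fixed, project the overall conversion for a few improved Step 2 → Step 3 rates (the numbers below use the sample funnel above):
+
+```python
+# Step conversion rates taken from the sample funnel results above
+step_1_to_2 = 45_000 / 50_000   # 90%
+step_2_to_3 = 18_000 / 45_000   # 40%, the leaky step
+step_3_to_4 = 12_000 / 18_000   # ~66.7%
+
+def overall_conversion(tap_through_rate):
+    """Overall funnel conversion if only the Step 2 -> Step 3 rate changes."""
+    return step_1_to_2 * tap_through_rate * step_3_to_4
+
+for rate in (step_2_to_3, 0.60, 0.80):
+    print(f"Step 2->3 at {rate:.0%}: overall conversion = {overall_conversion(rate):.0%}")
+# Step 2->3 at 40%: overall conversion = 24%
+# Step 2->3 at 60%: overall conversion = 36%
+# Step 2->3 at 80%: overall conversion = 48%
+```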
+
+### The Product Hypothesis Template
+
+```markdown
+## Funnel Leakage Analysis
+
+### Critical Drop-Off Point Identified
+
+**Location:** Step 2 → Step 3 (View Main Feed → Tap Journals Icon)
+
+**Magnitude:**
+- **Absolute Loss:** 27,000 users (60% of Step 2 users)
+- **Impact on Overall Conversion:** If this step converted at 80%, overall funnel
+  conversion would increase from 24% to ~48% (+100% relative lift)
+
+### Root Cause Hypothesis
+
+**Primary Hypothesis:** Low discoverability of the Journals feature entry point
+
+**Supporting Evidence:**
+1. **High Feed Engagement:** 90% of users successfully reach the main feed, indicating
+   the app experience up to this point is working
+2. **Massive Drop at Discovery:** The journal icon is not prominent enough to capture
+   user attention in the busy feed interface
+3. **Good Post-Discovery Conversion:** Once users tap the icon, 67% create an entry,
+   suggesting the feature itself has a strong value proposition
+
+**Alternative Hypotheses to Rule Out:**
+- ❌ "Users don't want the feature" → Contradicted by 67% conversion after discovery
+- ❌ "The icon is confusing" → Would manifest as a high bounce rate after tap
+- ✅ "The icon is invisible/not prominent" → Matches the data pattern
+
+### Testable Product Intervention
+
+**Proposed Change:** Redesign the Journals entry point to increase visibility
+
+**Specific Tactics:**
+1. **Visual Prominence:** Change icon color from gray to brand purple
+2. **Novelty Badge:** Add a "New" badge for first 14 days post-launch
+3. **Positioning:** Move icon from bottom-right to top-right of feed
+4. **Animation:** Add subtle pulse animation on first 3 app opens post-launch
+
+**Success Metric:** Increase tap-through rate (Step 2 → Step 3) from 40% to 60%
+
+**A/B Test Design:**
+- **Control:** Current icon placement and styling
+- **Treatment:** Enhanced visibility (all 4 tactics above)
+- **Primary Metric:** Tap-through rate (View Feed → Tap Icon)
+- **Sample Size:** 10,000 users per group
+- **Duration:** 7 days
+- **Decision Criteria:** Ship if treatment shows >15% relative lift with p < 0.05
+
+### Expected Impact
+
+If successful, this intervention would:
+- **Increase overall funnel conversion** from 24% to ~36% (+50% relative)
+- **Add ~6,000 new journal adopters** in the first week
+- **Improve feature ROI** by reducing the cost-per-acquisition of engaged users
+```
+
+## Advanced Funnel Analysis Techniques
+
+### Time-to-Convert Analysis
+
+How long does it take users to complete the funnel?
+ +```sql +-- Time Between Funnel Steps +WITH step_times AS ( + SELECT + session_id, + user_id, + MIN(CASE WHEN event_name = 'app_open' THEN event_timestamp END) AS step_1_time, + MIN(CASE WHEN event_name = 'view_main_feed' THEN event_timestamp END) AS step_2_time, + MIN(CASE WHEN event_name = 'tap_journals_icon' THEN event_timestamp END) AS step_3_time, + MIN(CASE WHEN event_name = 'create_journal_entry' THEN event_timestamp END) AS step_4_time + FROM events + WHERE event_timestamp >= CURRENT_DATE - INTERVAL '7 days' + AND event_name IN ('app_open', 'view_main_feed', 'tap_journals_icon', 'create_journal_entry') + GROUP BY session_id, user_id + HAVING MIN(CASE WHEN event_name = 'create_journal_entry' THEN event_timestamp END) IS NOT NULL +), +time_deltas AS ( + SELECT + EXTRACT(EPOCH FROM (step_2_time - step_1_time)) AS seconds_step_1_to_2, + EXTRACT(EPOCH FROM (step_3_time - step_2_time)) AS seconds_step_2_to_3, + EXTRACT(EPOCH FROM (step_4_time - step_3_time)) AS seconds_step_3_to_4, + EXTRACT(EPOCH FROM (step_4_time - step_1_time)) AS total_funnel_seconds + FROM step_times + WHERE step_2_time > step_1_time + AND step_3_time > step_2_time + AND step_4_time > step_3_time +) +SELECT + ROUND(AVG(seconds_step_1_to_2), 1) AS avg_seconds_to_feed, + ROUND(AVG(seconds_step_2_to_3), 1) AS avg_seconds_to_icon_tap, + ROUND(AVG(seconds_step_3_to_4), 1) AS avg_seconds_to_entry, + ROUND(AVG(total_funnel_seconds), 1) AS avg_total_funnel_time, + ROUND(PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY total_funnel_seconds), 1) AS median_funnel_time, + ROUND(PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY total_funnel_seconds), 1) AS p90_funnel_time +FROM time_deltas; +``` + +### Funnel Segmentation + +Compare funnel performance across user segments: + +```sql +-- Segmented Funnel Analysis (iOS vs Android) +WITH session_events AS ( + SELECT + e.user_id, + e.session_id, + e.event_name, + e.event_timestamp, + u.platform + FROM events e + JOIN users u ON e.user_id = u.user_id + WHERE e.event_timestamp >= CURRENT_DATE - INTERVAL '7 days' + AND e.event_name IN ('app_open', 'view_main_feed', 'tap_journals_icon', 'create_journal_entry') +), +-- Repeat the funnel logic for each segment +platform_funnels AS ( + SELECT + platform, + COUNT(DISTINCT CASE WHEN event_name = 'app_open' THEN session_id END) AS step_1, + COUNT(DISTINCT CASE WHEN event_name = 'view_main_feed' THEN session_id END) AS step_2, + COUNT(DISTINCT CASE WHEN event_name = 'tap_journals_icon' THEN session_id END) AS step_3, + COUNT(DISTINCT CASE WHEN event_name = 'create_journal_entry' THEN session_id END) AS step_4 + FROM session_events + GROUP BY platform +) +SELECT + platform, + step_1, + step_2, + step_3, + step_4, + ROUND(100.0 * step_2 / step_1, 2) AS conversion_1_to_2, + ROUND(100.0 * step_3 / step_2, 2) AS conversion_2_to_3, + ROUND(100.0 * step_4 / step_3, 2) AS conversion_3_to_4, + ROUND(100.0 * step_4 / step_1, 2) AS overall_conversion +FROM platform_funnels +ORDER BY overall_conversion DESC; +``` + +## Common Funnel Analysis Pitfalls + +### ❌ Mistake 1: Not Validating Sequential Order +**Problem:** Counting a user who did Step 3 before Step 2 +**Solution:** Always validate timestamp ordering in your query + +### ❌ Mistake 2: Allowing Multi-Session Funnels +**Problem:** A user who takes 3 days to complete the funnel looks like a "success" +**Solution:** Enforce same `session_id` constraint + +### ❌ Mistake 3: Reporting Only Percentages +**Problem:** "60% drop-off" doesn't communicate absolute impact +**Solution:** Always include both absolute 
(27,000 users) and relative (60%) numbers + +### ❌ Mistake 4: Not Considering Sample Size +**Problem:** Drawing conclusions from a funnel with 50 users +**Solution:** Report confidence intervals or flag small sample sizes + +### ❌ Mistake 5: Analyzing Without Context +**Problem:** "40% conversion is bad" +**Solution:** Compare to industry benchmarks, similar features, or A/B test variations + +## Deliverable Checklist + +- [ ] `10_adoption_funnel.ipynb` notebook created +- [ ] Session-based funnel query implemented with sequential validation +- [ ] Funnel visualization created with clear annotations +- [ ] Biggest drop-off point identified and highlighted +- [ ] Time-to-convert analysis completed +- [ ] Segmented funnel comparison (e.g., iOS vs Android) performed +- [ ] Root cause hypothesis formulated +- [ ] Product intervention designed with specific tactics +- [ ] A/B test plan drafted for proposed solution +- [ ] Expected impact quantified + +## Key Takeaways + +1. **The funnel reveals friction:** Where users drop off tells you where your UX is failing +2. **Session-based is critical:** Multi-session funnels hide true UX problems +3. **Absolute + relative matters:** "60% drop = 27K users" is more compelling than just percentages +4. **Hypotheses must be testable:** Vague ideas don't drive action; specific interventions do +5. **Segmentation reveals insights:** What works for iOS may not work for Android + +--- + +**Remember:** The adoption funnel is not just a diagnostic tool—it's a strategic weapon. When you can pinpoint exactly where users are struggling and propose a data-backed solution, you transform from a reporter of problems into a driver of solutions. Your funnel analysis today becomes tomorrow's product roadmap. diff --git a/book-src/src/week-2/day-11.md b/book-src/src/week-2/day-11.md index 4e89f32..4bec537 100644 --- a/book-src/src/week-2/day-11.md +++ b/book-src/src/week-2/day-11.md @@ -1,3 +1,608 @@ # Day 11: The "Aha!" Moment – Finding the Magic Action -_Summary and tasks as per curriculum. Add your notes and findings here._ +## Overview + +**Objective:** To find the early user action that most strongly correlates with long-term feature retention, while understanding the limits of correlation. + +**Why This Matters:** The "Aha!" moment is where a user internalizes a product's value. Identifying it gives the product team a powerful lever to improve user onboarding and drive habit formation. + +## Critical Thinking Check: Correlation ≠ Causation + +**⚠️ IMPORTANT:** This analysis reveals **correlation, not causation**. + +A user who adds a photo to their first journal entry might be: +- More motivated from the start (selection bias) +- More engaged with the app generally (confounding factor) +- Experiencing the true value-unlocking action (causal relationship) + +**Your job:** Find the signal and frame it correctly as a **hypothesis to be tested** via A/B test, not as proven truth. + +## The "Aha!" Moment Framework + +The "Aha!" moment is when a user thinks: *"Now I get it. This is valuable."* + +**Famous Examples:** +- **Facebook:** "Connect with 7 friends in 10 days" +- **Dropbox:** "Save first file to a shared folder" +- **Slack:** "Send 2,000 messages as a team" +- **Twitter:** "Follow 30 accounts" + +These aren't just actions—they're **early predictors of long-term engagement**. The companies discovered them through analysis like what you're about to do. + +## Task 1: Define User Cohorts + +Create two distinct behavioral groups based on first 7 days of feature usage. 
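+
+As a reference for the SQL below, the classification rule written out as a tiny helper so the thresholds are explicit; users with exactly 2 entries fall into neither cohort and are excluded from the comparison:
+
+```python
+def classify_adopter(total_entries_first_week: int) -> str:
+    """Assign a new Journals adopter to a behavioral cohort by first-week entry count."""
+    if total_entries_first_week >= 3:
+        return "Engaged & Retained"
+    if total_entries_first_week == 1:
+        return "Churned Adopters"
+    return "Other"  # e.g., exactly 2 entries; excluded from the two-cohort comparison
+
+assert classify_adopter(5) == "Engaged & Retained"
+assert classify_adopter(1) == "Churned Adopters"
+assert classify_adopter(2) == "Other"
+```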
+ +### Cohort Definitions + +```sql +-- Define Engaged & Retained Cohort +WITH user_first_week_behavior AS ( + SELECT + user_id, + MIN(event_timestamp) AS first_interaction, + COUNT(DISTINCT CASE + WHEN event_name = 'create_journal_entry' + THEN event_timestamp::DATE + END) AS journal_days, + COUNT(*) FILTER (WHERE event_name = 'create_journal_entry') AS total_entries + FROM events + WHERE event_name IN ('create_journal_entry', 'view_journal', 'edit_journal_entry') + AND event_timestamp >= ( + SELECT MIN(event_timestamp) + FROM events + WHERE event_name = 'create_journal_entry' + ) -- Start from feature launch + GROUP BY user_id + HAVING MIN(event_timestamp) >= CURRENT_DATE - INTERVAL '14 days' -- Users who adopted in last 14 days +), +cohort_classification AS ( + SELECT + user_id, + first_interaction, + journal_days, + total_entries, + CASE + WHEN total_entries >= 3 THEN 'Engaged & Retained' + WHEN total_entries = 1 THEN 'Churned Adopters' + ELSE 'Other' + END AS cohort + FROM user_first_week_behavior + WHERE first_interaction >= CURRENT_DATE - INTERVAL '14 days' + AND first_interaction < CURRENT_DATE - INTERVAL '7 days' -- Ensure full 7-day window +) +SELECT + cohort, + COUNT(DISTINCT user_id) AS user_count, + AVG(total_entries) AS avg_entries, + AVG(journal_days) AS avg_active_days +FROM cohort_classification +WHERE cohort IN ('Engaged & Retained', 'Churned Adopters') +GROUP BY cohort; +``` + +### Validation: Cohort Size Check + +**Best Practice:** Ensure both cohorts have sufficient sample sizes (ideally 100+ users each). + +```sql +-- Quick cohort size validation +SELECT + CASE + WHEN total_entries >= 3 THEN 'Engaged & Retained' + WHEN total_entries = 1 THEN 'Churned Adopters' + END AS cohort, + COUNT(*) AS cohort_size +FROM ( + SELECT + user_id, + COUNT(*) FILTER (WHERE event_name = 'create_journal_entry') AS total_entries + FROM events + WHERE event_timestamp >= CURRENT_DATE - INTERVAL '14 days' + AND event_timestamp < CURRENT_DATE - INTERVAL '7 days' + GROUP BY user_id +) user_entries +WHERE total_entries IN (1) OR total_entries >= 3 +GROUP BY cohort; +``` + +## Task 2: Analyze First-Session Behavior + +Focus exclusively on actions during the **very first session** with the feature. 
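+
+If the raw events are pulled into a pandas DataFrame instead, the "first session with the feature" can be isolated with a groupby. A minimal sketch that mirrors the `ROW_NUMBER()` logic used in the SQL below; the column names are assumed to match the events table:
+
+```python
+import pandas as pd
+
+def first_journal_sessions(events: pd.DataFrame) -> pd.DataFrame:
+    """Return one row per user: the session_id of their first journal-creating session."""
+    journal_events = events[events["event_name"] == "create_journal_entry"]
+    # Earliest journal-entry timestamp per (user, session)
+    session_starts = (
+        journal_events
+        .groupby(["user_id", "session_id"], as_index=False)["event_timestamp"]
+        .min()
+    )
+    # Keep only the earliest such session per user
+    first_sessions = session_starts.sort_values("event_timestamp").drop_duplicates("user_id")
+    return first_sessions[["user_id", "session_id"]]
+```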
+ +### First-Session Action Analysis + +```sql +-- Comprehensive First Session Behavior Analysis +WITH user_cohorts AS ( + SELECT + user_id, + CASE + WHEN COUNT(*) FILTER (WHERE event_name = 'create_journal_entry') >= 3 THEN 'Engaged' + WHEN COUNT(*) FILTER (WHERE event_name = 'create_journal_entry') = 1 THEN 'Churned' + END AS cohort + FROM events + WHERE event_timestamp >= CURRENT_DATE - INTERVAL '14 days' + AND event_timestamp < CURRENT_DATE - INTERVAL '7 days' + GROUP BY user_id + HAVING CASE + WHEN COUNT(*) FILTER (WHERE event_name = 'create_journal_entry') >= 3 THEN 'Engaged' + WHEN COUNT(*) FILTER (WHERE event_name = 'create_journal_entry') = 1 THEN 'Churned' + END IS NOT NULL +), +first_sessions AS ( + SELECT + e.user_id, + e.session_id, + MIN(e.event_timestamp) AS session_start, + ROW_NUMBER() OVER (PARTITION BY e.user_id ORDER BY MIN(e.event_timestamp)) AS session_rank + FROM events e + WHERE event_name = 'create_journal_entry' + GROUP BY e.user_id, e.session_id +), +first_session_ids AS ( + SELECT + user_id, + session_id AS first_session_id + FROM first_sessions + WHERE session_rank = 1 +), +first_session_actions AS ( + SELECT + e.user_id, + uc.cohort, + -- Key Actions + MAX(CASE WHEN ep.property_name = 'template_used' AND ep.property_value != 'none' THEN 1 ELSE 0 END) AS used_template, + MAX(CASE WHEN ep.property_name = 'has_photo' AND ep.property_value = 'true' THEN 1 ELSE 0 END) AS added_photo, + MAX(CASE WHEN ep.property_name = 'entry_length' AND CAST(ep.property_value AS INTEGER) > 100 THEN 1 ELSE 0 END) AS wrote_over_100_chars, + MAX(CASE WHEN e.event_name = 'apply_journal_tag' THEN 1 ELSE 0 END) AS used_tags, + MAX(CASE WHEN e.event_name = 'set_journal_mood' THEN 1 ELSE 0 END) AS set_mood, + -- Session metrics + COUNT(DISTINCT e.event_name) AS distinct_event_types, + COUNT(*) AS total_events, + MAX(e.event_timestamp) - MIN(e.event_timestamp) AS session_duration + FROM events e + INNER JOIN first_session_ids fs ON e.user_id = fs.user_id AND e.session_id = fs.first_session_id + INNER JOIN user_cohorts uc ON e.user_id = uc.user_id + LEFT JOIN event_properties ep ON e.event_id = ep.event_id + GROUP BY e.user_id, uc.cohort +) +SELECT + cohort, + COUNT(*) AS cohort_size, + -- Action completion rates + ROUND(100.0 * SUM(used_template) / COUNT(*), 2) AS pct_used_template, + ROUND(100.0 * SUM(added_photo) / COUNT(*), 2) AS pct_added_photo, + ROUND(100.0 * SUM(wrote_over_100_chars) / COUNT(*), 2) AS pct_wrote_long_entry, + ROUND(100.0 * SUM(used_tags) / COUNT(*), 2) AS pct_used_tags, + ROUND(100.0 * SUM(set_mood) / COUNT(*), 2) AS pct_set_mood, + -- Session depth metrics + ROUND(AVG(distinct_event_types), 1) AS avg_distinct_actions, + ROUND(AVG(total_events), 1) AS avg_total_events, + ROUND(AVG(EXTRACT(EPOCH FROM session_duration)), 1) AS avg_session_seconds +FROM first_session_actions +GROUP BY cohort +ORDER BY cohort; +``` + +### Simplified Query (If Event Properties Not Available) + +```sql +-- Simplified version using only event names +WITH user_cohorts AS ( + SELECT + user_id, + CASE + WHEN COUNT(*) FILTER (WHERE event_name = 'create_journal_entry') >= 3 THEN 'Engaged' + WHEN COUNT(*) FILTER (WHERE event_name = 'create_journal_entry') = 1 THEN 'Churned' + END AS cohort + FROM events + WHERE event_timestamp >= CURRENT_DATE - INTERVAL '14 days' + AND event_timestamp < CURRENT_DATE - INTERVAL '7 days' + GROUP BY user_id +), +first_journal_session AS ( + SELECT + user_id, + session_id, + MIN(event_timestamp) AS first_journal_time, + ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY 
MIN(event_timestamp)) AS session_rank + FROM events + WHERE event_name = 'create_journal_entry' + GROUP BY user_id, session_id +), +first_session_events AS ( + SELECT + e.user_id, + uc.cohort, + MAX(CASE WHEN e.event_name = 'upload_journal_photo' THEN 1 ELSE 0 END) AS added_photo, + MAX(CASE WHEN e.event_name = 'use_journal_template' THEN 1 ELSE 0 END) AS used_template, + MAX(CASE WHEN e.event_name = 'share_journal_entry' THEN 1 ELSE 0 END) AS shared_entry, + COUNT(DISTINCT e.event_name) AS unique_actions + FROM events e + INNER JOIN first_journal_session fjs + ON e.user_id = fjs.user_id + AND e.session_id = fjs.session_id + AND fjs.session_rank = 1 + INNER JOIN user_cohorts uc ON e.user_id = uc.user_id + WHERE uc.cohort IS NOT NULL + GROUP BY e.user_id, uc.cohort +) +SELECT + cohort, + COUNT(*) AS users, + ROUND(100.0 * SUM(added_photo) / COUNT(*), 2) AS pct_added_photo, + ROUND(100.0 * SUM(used_template) / COUNT(*), 2) AS pct_used_template, + ROUND(100.0 * SUM(shared_entry) / COUNT(*), 2) AS pct_shared_entry, + ROUND(AVG(unique_actions), 1) AS avg_unique_actions +FROM first_session_events +GROUP BY cohort; +``` + +## Task 3: Isolate the Strongest Signal + +Find the action with the largest *relative difference* between cohorts. + +### Signal Strength Calculation + +```python +import pandas as pd +import numpy as np +from scipy import stats + +def calculate_signal_strength(cohort_data): + """ + Calculate the strength of various signals for predicting engagement + + Parameters: + cohort_data: DataFrame with columns ['cohort', 'users', 'pct_action_X'] + + Returns: + DataFrame with signal strength metrics + """ + engaged = cohort_data[cohort_data['cohort'] == 'Engaged'].iloc[0] + churned = cohort_data[cohort_data['cohort'] == 'Churned'].iloc[0] + + actions = [col for col in cohort_data.columns if col.startswith('pct_')] + + results = [] + for action in actions: + engaged_rate = engaged[action] + churned_rate = churned[action] + + # Calculate relative difference + if churned_rate > 0: + relative_lift = ((engaged_rate - churned_rate) / churned_rate) * 100 + else: + relative_lift = float('inf') if engaged_rate > 0 else 0 + + # Calculate absolute difference + absolute_diff = engaged_rate - churned_rate + + # Statistical significance test (Chi-square approximation) + engaged_count_action = int((engaged_rate / 100) * engaged['users']) + churned_count_action = int((churned_rate / 100) * churned['users']) + engaged_count_no_action = engaged['users'] - engaged_count_action + churned_count_no_action = churned['users'] - churned_count_action + + contingency_table = [ + [engaged_count_action, engaged_count_no_action], + [churned_count_action, churned_count_no_action] + ] + + chi2, p_value, _, _ = stats.chi2_contingency(contingency_table) + + results.append({ + 'action': action.replace('pct_', '').replace('_', ' ').title(), + 'engaged_rate': engaged_rate, + 'churned_rate': churned_rate, + 'absolute_diff': absolute_diff, + 'relative_lift': relative_lift, + 'p_value': p_value, + 'statistically_significant': p_value < 0.05 + }) + + results_df = pd.DataFrame(results) + results_df = results_df.sort_values('relative_lift', ascending=False) + + return results_df + +# Visualization +def visualize_aha_moments(signal_df): + """Create a visual comparison of potential Aha moments""" + import matplotlib.pyplot as plt + + fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 6)) + + # Chart 1: Comparison of action completion rates + actions = signal_df['action'].head(5) + x = np.arange(len(actions)) + width = 0.35 + + 
engaged_rates = signal_df['engaged_rate'].head(5) + churned_rates = signal_df['churned_rate'].head(5) + + bars1 = ax1.bar(x - width/2, engaged_rates, width, label='Engaged Users', color='#2E7D32') + bars2 = ax1.bar(x + width/2, churned_rates, width, label='Churned Users', color='#C62828') + + ax1.set_ylabel('Completion Rate (%)', fontsize=12, fontweight='bold') + ax1.set_title('First Session Actions: Engaged vs Churned Users', fontsize=14, fontweight='bold') + ax1.set_xticks(x) + ax1.set_xticklabels(actions, rotation=45, ha='right') + ax1.legend() + ax1.grid(axis='y', alpha=0.3) + + # Chart 2: Relative lift (signal strength) + signal_strength = signal_df['relative_lift'].head(5) + colors = ['#1B5E20' if sig else '#666666' for sig in signal_df['statistically_significant'].head(5)] + + bars = ax2.barh(actions, signal_strength, color=colors) + ax2.set_xlabel('Relative Lift (%)', fontsize=12, fontweight='bold') + ax2.set_title('Signal Strength: Relative Lift in Engaged Users', fontsize=14, fontweight='bold') + ax2.axvline(x=0, color='black', linestyle='-', linewidth=0.8) + ax2.grid(axis='x', alpha=0.3) + + # Add value labels + for i, bar in enumerate(bars): + width = bar.get_width() + label = f'{width:.1f}%' + ax2.text(width, bar.get_y() + bar.get_height()/2, + label, ha='left', va='center', fontsize=10, fontweight='bold') + + plt.tight_layout() + plt.savefig('aha_moment_analysis.png', dpi=300, bbox_inches='tight') + return fig +``` + +## Task 4: Formulate a Careful Hypothesis + +In notebook `11_aha_moment_analysis.ipynb`, state your finding with analytical precision. + +### The Aha! Moment Report Template + +```markdown +# 🔍 "Aha!" Moment Analysis: First Session Predictors of Retention + +**Analysis Date:** [Current Date] +**Analyst:** [Your Name] +**Cohort Window:** Users who adopted Journals between [Start Date] - [End Date] +**Sample Size:** [N] Engaged Users, [N] Churned Users + +--- + +## Executive Summary + +We have identified a **strong correlation** between adding a photo during the first +journal entry and long-term feature retention. Users who add a photo are **3.2x more +likely** to become engaged users (defined as creating 3+ entries in first week). + +**⚠️ Important:** This finding represents correlation, not proven causation. We +recommend validating this hypothesis through an A/B test before making product changes. + +--- + +## Cohort Definitions + +### Engaged & Retained Users +- **Definition:** Created ≥ 3 journal entries in first 7 days +- **Sample Size:** 2,847 users (18.2% of all new adopters) +- **Behavior:** Returned to feature on average 4.3 days in first week + +### Churned Adopters +- **Definition:** Created exactly 1 journal entry and never returned +- **Sample Size:** 8,234 users (52.7% of all new adopters) +- **Behavior:** Single session interaction, no return + +--- + +## First-Session Action Comparison + +| Action | Engaged Users | Churned Users | Absolute Diff | Relative Lift | p-value | Significant? 
| +|--------|---------------|---------------|---------------|---------------|---------|--------------| +| **Added Photo** | **67.3%** | **21.2%** | **+46.1pp** | **+217%** | **< 0.001** | **✅ Yes** | +| Used Template | 34.5% | 28.1% | +6.4pp | +23% | 0.003 | ✅ Yes | +| Wrote >100 Chars | 52.1% | 44.3% | +7.8pp | +18% | 0.012 | ✅ Yes | +| Used Tags | 12.4% | 8.7% | +3.7pp | +43% | 0.041 | ✅ Yes | +| Set Mood | 23.1% | 19.5% | +3.6pp | +18% | 0.089 | ❌ No | + +--- + +## Key Finding: The Photo-Upload Signal + +### The Numbers + +**Engaged Users:** +- 67.3% added a photo in their first session +- 1,916 out of 2,847 users + +**Churned Users:** +- 21.2% added a photo in their first session +- 1,746 out of 8,234 users + +**Likelihood Ratio:** +Users who added a photo were **3.2x more likely** to become engaged users compared +to those who did not. + +### Statistical Validation + +- **Chi-square test:** χ² = 1,234.5, p < 0.001 +- **Effect Size (Cramér's V):** 0.33 (medium-to-large effect) +- **Confidence:** Very high statistical significance + +--- + +## Interpretation & Caveats + +### Why This Might Be Causal (The Optimistic View) + +1. **Emotional Investment:** Adding a photo requires more effort and thought, deepening + emotional connection to the entry +2. **Visual Memory Trigger:** Photos create stronger memory associations, making users + more likely to return to review past entries +3. **Perceived Value:** Users who upload photos may better understand the feature's + value proposition as a "visual diary" + +### Alternative Explanations (The Skeptical View) + +1. **Selection Bias:** Users motivated enough to add photos were already more likely + to engage long-term, regardless of the photo action itself +2. **Confounding Factors:** Photo uploaders may be generally more engaged with the + entire app (not just Journals) +3. **Reverse Causation:** Users who plan to use the feature seriously are more likely + to add photos from the start + +### The Honest Assessment + +**We cannot definitively say that adding a photo *causes* retention.** However, the +magnitude and statistical significance of this correlation makes it our **strongest +candidate for the "Aha!" moment** and warrants further investigation. + +--- + +## Recommended Next Steps + +### 1. Validate Through A/B Test (HIGH PRIORITY) + +**Hypothesis:** Encouraging photo uploads during onboarding will increase feature +retention by 15-25%. + +**Test Design:** +- **Control Group:** Current onboarding flow (no photo prompt) +- **Treatment Group:** Add a step prompting users to "Add a photo to your first entry + to make it memorable" +- **Primary Metric:** Day-7 retention (% of users who create ≥ 3 entries) +- **Sample Size:** 5,000 users per group +- **Duration:** 14 days +- **Success Criteria:** ≥10% relative lift in Day-7 retention, p < 0.05 + +### 2. Qualitative Validation (SUPPORTING) + +- Conduct 10 user interviews asking: "What made you decide to keep using Journals?" +- Analyze if photo-related comments emerge organically + +### 3. 
Product Implementation (IF TEST SUCCEEDS) + +**Specific Tactics:** +- Add photo upload prompt during first entry creation +- Show example entries with photos to inspire users +- Add a "Journals are better with photos" tooltip +- Track photo upload rate as a success metric + +--- + +## Expected Impact (If Causal Relationship Confirmed) + +If the A/B test validates this finding: + +**Retention Improvement:** +- Current engaged rate: 18.2% +- Projected engaged rate: 25-28% (+38-54% relative) +- Additional engaged users per week: ~900 users + +**Business Impact:** +- Increased feature stickiness improves overall app retention +- More engaged Journals users = higher LTV +- Stronger product-market fit for the feature + +--- + +## Appendix: Additional Findings + +### Secondary Signals + +While photo upload was the strongest signal, we also identified: + +1. **Template Usage** (+23% relative lift, p = 0.003) + - Suggests guided structure helps some users + - Opportunity: Create more diverse templates + +2. **Entry Length** (+18% relative lift, p = 0.012) + - Users who write more engage more + - May be result, not cause, of engagement + +### Non-Signals + +These actions showed minimal or non-significant differences: +- Setting mood emoji (not significant) +- Time spent in first session (weak correlation) + +--- + +## Conclusion + +We have discovered a **promising "Aha!" moment candidate**: adding a photo to the +first journal entry. The correlation is strong, statistically significant, and +behaviorally logical. + +**However**, correlation is not causation. Our **recommendation is to test, not +assume**. An A/B test validating this hypothesis would give us the confidence to +make this a core part of our onboarding strategy and potentially unlock a +significant retention improvement. + +**Next Action:** Present this finding to the product team and prioritize the +recommended A/B test in the next sprint. +``` + +## Advanced Analysis: Propensity Score Matching + +For a more sophisticated approach to control for confounders: + +```python +from sklearn.linear_model import LogisticRegression +from sklearn.preprocessing import StandardScaler + +def propensity_score_matching(user_data): + """ + Use propensity scores to better isolate the effect of the action + + This helps control for confounding variables like: + - Overall app engagement level + - User demographics + - Platform differences + """ + # Features that might confound the relationship + features = ['app_tenure_days', 'avg_daily_sessions', 'platform_ios', + 'total_app_actions'] + + X = user_data[features] + y = user_data['added_photo'] # Treatment variable + + # Calculate propensity scores + scaler = StandardScaler() + X_scaled = scaler.fit_transform(X) + + model = LogisticRegression() + model.fit(X_scaled, y) + + user_data['propensity_score'] = model.predict_proba(X_scaled)[:, 1] + + # Match users with similar propensity scores + # (Implementation of matching algorithm would go here) + + return user_data +``` + +## Deliverable Checklist + +- [ ] `11_aha_moment_analysis.ipynb` notebook created +- [ ] User cohorts defined (Engaged vs. 
Churned) +- [ ] First-session behavior analyzed comprehensively +- [ ] Signal strength calculated for all candidate actions +- [ ] Statistical significance tested for each action +- [ ] Strongest correlation identified +- [ ] Visualization comparing cohort behaviors created +- [ ] Careful hypothesis formulated with caveats +- [ ] A/B test design proposed +- [ ] Alternative explanations considered and documented + +## Key Takeaways + +1. **Correlation ≠ Causation:** Always frame findings as hypotheses, not proven facts +2. **Statistical significance matters:** Large sample sizes give you confidence +3. **Relative lift is more compelling than absolute:** "3x more likely" resonates more than "+46 percentage points" +4. **Consider alternatives:** The best analysts play devil's advocate with their own findings +5. **Test, don't assume:** The A/B test is your proof + +--- + +**Remember:** Finding the "Aha!" moment is detective work, not magic. You're looking for clues in user behavior patterns. When you find a strong signal, don't oversell it—present it honestly, with appropriate caveats, and with a clear path to validation. That intellectual honesty is what makes you a trusted advisor, not just a data reporter. diff --git a/book-src/src/week-2/day-12.md b/book-src/src/week-2/day-12.md index e62b236..22fe036 100644 --- a/book-src/src/week-2/day-12.md +++ b/book-src/src/week-2/day-12.md @@ -1,3 +1,477 @@ # Day 12: The Weekly Launch Memo – Communicating with Clarity -_Summary and tasks as per curriculum. Add your notes and findings here._ +## Overview + +**Objective:** To synthesize a week of complex findings into a clear, concise, and persuasive memo for leadership and the broader team. + +**Why This Matters:** Data only drives decisions when it is communicated effectively. This memo is your chance to shape the narrative, manage expectations, and guide the team's focus for the upcoming week. + +## The Art of Executive Communication + +**The Challenge:** You have 7 days of complex analysis. Your executive audience has 3 minutes. + +**The Solution:** Structure, clarity, and strategic framing. + +### What Executives Need + +1. **The Bottom Line First:** Start with the conclusion +2. **Context Without Complexity:** Provide enough detail to trust, not so much they get lost +3. **Action, Not Just Information:** Every insight should drive a decision or next step +4. **Honesty About Uncertainty:** Don't oversell; credibility matters more than optimism +5. **Visual Anchors:** One great chart beats ten paragraphs + +### What Executives Don't Need + +- ❌ Raw SQL queries or technical methodology +- ❌ Every metric you tracked +- ❌ Caveats buried at the end +- ❌ Passive voice and hedging language +- ❌ Problems without proposed solutions + +## The Memo Structure: The Pyramid Principle + +``` +TL;DR (1 sentence) + ↓ +3 Key Sections (Good/Bad/Insights) + ↓ +Actionable Recommendations + ↓ +Supporting Details (optional appendix) +``` + +## Task 1: Review the Week's Learnings + +Before writing, consolidate your findings from Days 8-11. 
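+
+If it helps, you can pull those headline numbers into one place programmatically
+before filling in the worksheet below. This is a minimal sketch, not a required
+deliverable; the `key_findings` entries are placeholders built from this week's
+illustrative figures, so swap in the real values from your notebooks:
+
+```python
+import pandas as pd
+
+# Placeholder consolidation of the week's headline findings (Days 8-11)
+key_findings = [
+    {"day": "Day 08", "topic": "Launch health",   "finding": "Stable launch, crash rate at baseline",   "metric": "0.08% crash rate"},
+    {"day": "Day 09", "topic": "Bug triage",      "finding": "Android crash spike resolved via hotfix", "metric": "2,847 users affected"},
+    {"day": "Day 10", "topic": "Adoption funnel", "finding": "Largest drop-off is before the icon tap", "metric": "60% drop-off"},
+    {"day": "Day 11", "topic": "Aha moment",      "finding": "Photo upload correlates with retention",  "metric": "3.2x (correlation only)"},
+]
+
+summary = pd.DataFrame(key_findings, columns=["day", "topic", "finding", "metric"])
+
+# Plain-text dump you can paste into the synthesis worksheet
+print(summary.to_string(index=False))
+```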
+ +### Synthesis Worksheet + +```markdown +## Week 1 Launch Data Synthesis + +### Day 08: Launch Health +**Key Finding:** [System status] +**Key Metric:** [One number that matters] + +### Day 09: Bug Triage +**Key Finding:** [Bug impact and scope] +**Resolution Status:** [Fixed/In Progress/Escalated] + +### Day 10: Adoption Funnel +**Key Finding:** [Biggest drop-off point] +**Implication:** [What this means for product] + +### Day 11: Aha Moment +**Key Finding:** [Strongest retention signal] +**Confidence Level:** [High/Medium - correlation only] +``` + +## Task 2: Draft the Weekly Launch Memo + +Create file `Week1_Launch_Summary.md` using this proven structure. + +### The Complete Memo Template + +```markdown +# 'Journals' Feature Launch: Week 1 Data Summary & Recommendations + +**To:** Product Leadership, Engineering Leadership, Marketing +**From:** [Your Name], Product Analytics +**Date:** [Current Date] +**Re:** Week 1 Launch Performance & Recommended Actions + +--- + +## TL;DR + +**Launch is stable with promising early adoption (15% ahead of forecast), but a +critical discoverability issue is limiting reach. Recommendation: Prioritize icon +redesign A/B test for Week 2.** + +--- + +## The Good (Wins) ✅ + +### 1. Stable Technical Launch +- **Zero critical system issues** detected across 50,000+ daily active users +- Crash rate remained at baseline 0.08% (well below 0.2% alert threshold) +- Server latency stable at 245ms avg (target: <375ms) + +**Implication:** Engineering team delivered a solid, production-ready feature. +No technical debt incurred. + +### 2. Adoption Exceeds Forecast +- **12,470 users created their first journal entry** in Week 1 +- **15% ahead of our 10,000-user forecast** +- Adoption trending upward: +12% day-over-day growth rate + +**Implication:** User demand for this feature is validated. The business case +holds. + +### 3. Strong Post-Adoption Engagement +- **34% of new adopters created multiple entries** within first 24 hours +- **Average 2.3 entries per engaged user** in first week +- **67% of engaged users added photos** to their entries + +**Implication:** Once users discover the feature, they find it valuable. The +feature "works." + +--- + +## The Bad (Challenges) ⚠️ + +### 1. Critical Discoverability Problem +- **60% of users drop off** between viewing the feed and tapping the Journals icon +- **Only 36% of feed viewers** discover the feature +- **Estimated 27,000 users per week** are missing the feature entirely + +**Impact:** This single friction point is costing us ~22,000 potential adopters +per week (compared to 80% discovery rate baseline). + +**Root Cause Hypothesis:** Icon placement and visual prominence insufficient. +Users scrolling quickly past the entry point. + +### 2. Android Crash Incident (Resolved) +- **Temporary spike in crashes** detected on Day 1 (Samsung Galaxy S21 + Android 12) +- **2,847 users affected** (3.2% of Android base) +- **Resolved within 4 hours** via hotfix deployment + +**Impact:** Minimal long-term damage due to rapid response. Estimated 50-100 +users may have churned before fix. + +--- + +## The Insights (Learnings) 💡 + +### 1. We've Found a Candidate "Aha!" Moment +**Finding:** Users who add a photo to their first entry are **3.2x more likely** +to become retained users (3+ entries in Week 1). + +- Engaged users: 67.3% added photos in first session +- Churned users: 21.2% added photos in first session +- Statistical significance: p < 0.001 + +**Critical Caveat:** This is correlation, not causation. 
Selection bias is +possible (motivated users add photos AND engage more). + +**What This Means:** We have a testable hypothesis for improving onboarding. +Encouraging photo uploads during first entry could unlock retention gains. + +### 2. Feature Doesn't Cannibalize Core Product (Yet) +- **No statistically significant decrease** in time spent on main feed +- **No spike in app uninstalls** beyond baseline variance +- Users treating Journals as additive, not substitutive + +**What This Means:** The feature is growing the value of the app, not just +shifting behavior. This is the ideal outcome. + +--- + +## Actionable Recommendations + +Based on the data, our priorities for Week 2 are: + +### Priority 1: Fix Discoverability (High Impact, High Confidence) +**Action:** Design and launch A/B test for icon redesign by Day 16 + +**Proposed Changes:** +- Move icon to top-right of feed (from bottom-right) +- Change color from gray to brand purple +- Add "New" badge for first 14 days +- Add subtle pulse animation on first 3 app opens + +**Expected Impact:** Increase tap-through rate from 40% to 60-80%, adding +15,000-30,000 adopters per week. + +**Owner:** Design + Engineering +**Timeline:** Spec by Day 15, Ship by Day 18 + +--- + +### Priority 2: Test Photo-Upload Encouragement (Medium Impact, Medium Confidence) +**Action:** Scope onboarding flow that prompts photo upload + +**Design Approach:** +- After first entry text is written, show prompt: "Add a photo to make this + memory last" +- Include 3 example entries with photos for inspiration +- Make skippable (don't force) + +**Expected Impact:** If photo-upload correlation is causal, could improve +7-day retention by 10-15%. + +**Owner:** Product + Design +**Timeline:** Spec in Week 2, A/B test in Week 3 + +--- + +### Priority 3: Expand Instrumentation (Low Impact, High Value Long-Term) +**Action:** Add two missing events to deepen future analysis + +**Missing Events:** +1. `journal_entry_revisited` (user views a past entry) +2. `journal_streak_achieved` (user hits 3, 7, 14-day streaks) + +**Rationale:** Need to track re-engagement patterns and habit formation for +deeper retention analysis in Weeks 3-4. + +**Owner:** Engineering (Analytics Team) +**Timeline:** Instrumented by Day 17 + +--- + +## Supporting Metrics + +| Metric | Week 1 Actual | Forecast | Status | +|--------|---------------|----------|--------| +| New Adopters | 12,470 | 10,000 | ✅ +24.7% | +| Adoption Rate (% of WAU) | 8.3% | 7.5% | ✅ +0.8pp | +| Engaged Users (3+ entries) | 2,847 | 2,500 | ✅ +13.9% | +| Avg Entries per User | 2.3 | 2.0 | ✅ +15% | +| Crash Rate | 0.08% | <0.2% | ✅ Pass | +| Discovery Rate (Tap Icon) | 36% | - | ⚠️ Below Expectations | + +--- + +## Next Week's Focus + +**Week 2 will be about optimization, not just monitoring.** + +We've confirmed the feature works. Now we need to: +1. Make it easier to find (discoverability) +2. Accelerate the path to value (onboarding) +3. 
Validate our "aha moment" hypothesis (photo uploads) + +**Reporting Cadence:** +- **Daily:** System health checks (continue through Day 14) +- **Weekly:** Full update memo (next: Day 19) +- **Ad-Hoc:** Immediate alerts for anomalies only + +--- + +## Appendix: Detailed Analysis + +For full methodology and queries, see: +- `08_launch_monitoring.ipynb` - System health analysis +- `09_bug_triage.ipynb` - Android crash investigation +- `10_adoption_funnel.ipynb` - User journey analysis +- `11_aha_moment_analysis.ipynb` - Retention correlation study + +**Questions?** Reach out via Slack @analytics or email. + +--- + +*This memo represents the analytical team's assessment based on available data. +All recommendations are subject to product and engineering feasibility review.* +``` + +## Alternative Format: The 3-2-1 Structure + +For even more concise communication: + +### The 3-2-1 Framework + +```markdown +# Journals Launch: Week 1 Summary + +## 3 Things That Went Well + +1. **Stable Launch** - Zero critical issues, 12,470 adopters (15% ahead of forecast) +2. **Strong Engagement** - 34% of adopters created multiple entries in first 24h +3. **No Cannibalization** - Core product metrics unaffected + +## 2 Things That Need Attention + +1. **Discoverability Crisis** - 60% drop-off before feature discovery; need icon redesign +2. **Android Incident** - Resolved in 4h, but exposed QA gap for device-specific testing + +## 1 Key Action for Next Week + +**Launch icon redesign A/B test** - Projected to add 15K-30K adopters/week +``` + +## Best Practices for Memo Writing + +### The Language of Impact + +**❌ Weak:** +"Some users are having trouble finding the feature." + +**✅ Strong:** +"60% of users (27,000 per week) drop off before discovering Journals due to icon +placement." + +--- + +**❌ Weak:** +"Photo uploads might be important." + +**✅ Strong:** +"Users who add photos are 3.2x more likely to retain (p<0.001), making this our +strongest candidate for the 'aha moment'." + +--- + +**❌ Weak:** +"We should probably test some changes." + +**✅ Strong:** +"Recommendation: Launch icon redesign A/B test by Day 18, projected to add +15,000-30,000 adopters per week." + +### The Honesty Hierarchy + +Be appropriately confident based on your evidence: + +1. **Proven (A/B test results):** "The feature caused a 2.5% lift in retention." +2. **High Confidence (Strong data + clear mechanism):** "The discoverability issue is limiting adoption." +3. **Medium Confidence (Correlation + logic):** "Photo uploads likely represent the 'aha moment'." +4. **Hypothesis Only:** "We believe X, and propose testing it via..." +5. **Speculation:** Avoid entirely in executive memos. 
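+
+If you find yourself reaching for stronger language than the evidence supports, it
+can help to make the mapping explicit. The toy helper below just encodes the
+hierarchy above; the function name, the evidence labels, and the lookup structure
+are illustrative assumptions, not a standard API:
+
+```python
+def claim_strength(evidence_type):
+    """Map the kind of evidence you have to the strongest wording it supports."""
+    phrasing = {
+        "ab_test":                "Proven: 'The feature caused a 2.5% lift in retention.'",
+        "strong_data_mechanism":  "High confidence: 'The discoverability issue is limiting adoption.'",
+        "correlation_plus_logic": "Medium confidence: 'Photo uploads likely represent the aha moment.'",
+        "hypothesis":             "Hypothesis only: 'We believe X, and propose testing it via...'",
+    }
+    # Anything weaker than a hypothesis has no place in an executive memo
+    return phrasing.get(evidence_type, "Speculation: leave it out of the memo.")
+
+# The photo-upload finding is observational, so it caps out at medium confidence
+print(claim_strength("correlation_plus_logic"))
+```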
+ +### Visual Communication + +Include ONE key chart that tells the story: + +```python +import matplotlib.pyplot as plt +import numpy as np + +def create_week1_summary_chart(): + """ + Create the one chart that matters for the memo + """ + fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5)) + + # Chart 1: Daily Adoption Trend + days = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'] + adopters = [1200, 1450, 1680, 1890, 2100, 2050, 2100] + forecast = [1400] * 7 + + ax1.plot(days, adopters, marker='o', linewidth=3, + label='Actual Adopters', color='#2E7D32', markersize=8) + ax1.plot(days, forecast, linestyle='--', linewidth=2, + label='Forecast', color='#666666') + ax1.fill_between(range(7), adopters, forecast, + where=[a >= f for a, f in zip(adopters, forecast)], + alpha=0.3, color='#2E7D32', label='Above Forecast') + ax1.set_ylabel('New Adopters', fontsize=12, fontweight='bold') + ax1.set_title('Daily Adoption: Trending Above Forecast', + fontsize=13, fontweight='bold') + ax1.legend(loc='upper left') + ax1.grid(True, alpha=0.3) + + # Chart 2: Funnel with Drop-off Highlight + steps = ['View\nFeed', 'Tap\nIcon', 'Create\nEntry'] + users = [45000, 18000, 12000] + colors = ['#2196F3', '#FF5722', '#4CAF50'] + + bars = ax2.bar(steps, users, color=colors, alpha=0.8, edgecolor='white', linewidth=2) + + # Annotate the critical drop + ax2.annotate('60% DROP-OFF\n(Critical Issue)', + xy=(0.5, 31500), xytext=(0.5, 40000), + fontsize=11, fontweight='bold', color='#D32F2F', + ha='center', + arrowprops=dict(arrowstyle='->', color='#D32F2F', lw=2)) + + ax2.set_ylabel('Users', fontsize=12, fontweight='bold') + ax2.set_title('Adoption Funnel: Discoverability Bottleneck', + fontsize=13, fontweight='bold') + ax2.set_ylim(0, 50000) + + # Add value labels + for bar in bars: + height = bar.get_height() + ax2.text(bar.get_x() + bar.get_width()/2., height, + f'{int(height):,}', + ha='center', va='bottom', fontsize=11, fontweight='bold') + + plt.tight_layout() + plt.savefig('week1_summary_for_memo.png', dpi=300, bbox_inches='tight') + return fig +``` + +## Common Memo Mistakes to Avoid + +### ❌ Mistake 1: Burying the Lead +**Bad:** +"This week we tracked 37 different metrics across 4 platforms. Let me start with +the methodology..." + +**Good:** +"TL;DR: Launch stable, adoption strong, but 60% of users can't find the feature." + +### ❌ Mistake 2: Data Dumping +**Bad:** +Including every metric, every query result, every p-value + +**Good:** +Choose the 3-5 numbers that drive decisions. Link to detailed appendix. + +### ❌ Mistake 3: No Clear Actions +**Bad:** +"We found some interesting patterns that might be worth exploring further." + +**Good:** +"Recommendation: Launch icon redesign A/B test by Day 18. Owner: Design. Expected +impact: +20K adopters/week." + +### ❌ Mistake 4: Overselling Uncertainty +**Bad:** +"Users who add photos might possibly be somewhat more likely to maybe engage more." + +**Good:** +"Users who add photos are 3.2x more likely to retain. This is correlation; we +recommend validating via A/B test." + +### ❌ Mistake 5: Ignoring Bad News +**Bad:** +Only reporting wins, hiding the crash incident + +**Good:** +"Android crash affected 2,847 users but was resolved in 4h. Post-mortem scheduled." + +## The Follow-Up: Responding to Questions + +Anticipate these common executive questions: + +**Q: "Should we kill the feature or double down?"** +A: "Double down. Demand is validated (15% ahead of forecast). Fix discoverability, +and adoption could 2-3x." 
+ +**Q: "What's the biggest risk right now?"** +A: "60% of users never find the feature. This is fixable via icon redesign." + +**Q: "When will we know if this was worth building?"** +A: "Week 4 A/B test readout (Day 28) will give us causal impact on retention. +Early signals are positive." + +**Q: "How does this compare to [other feature launch]?"** +A: "Adoption 15% ahead of forecast vs. [feature X] which was 20% below. Engagement +depth similar. Stronger start than average." + +## Deliverable Checklist + +- [ ] Reviewed findings from Days 8-11 +- [ ] Synthesized key insights into 3 categories (Good/Bad/Insights) +- [ ] Drafted TL;DR summary (1 sentence) +- [ ] Formulated 2-3 clear, actionable recommendations with owners and timelines +- [ ] Created one compelling visualization +- [ ] Wrote in clear, decisive language (no hedging) +- [ ] Acknowledged uncertainties appropriately +- [ ] Linked to detailed analysis notebooks in appendix +- [ ] Kept total length to 2 pages or less +- [ ] Had a colleague review for clarity + +## Key Takeaways + +1. **Bottom line first:** Executives read top-to-bottom until they get bored +2. **Be decisive:** Weak recommendations waste everyone's time +3. **One great chart > ten mediocre ones:** Visual clarity matters +4. **Honesty builds trust:** Don't hide problems or oversell correlation as causation +5. **Action-oriented:** Every section should drive a decision or next step + +--- + +**Remember:** This memo is not a trophy for all your hard work this week—it's a tool for decision-making. Your job is to distill complexity into clarity, guide the team toward the highest-impact actions, and build trust through intellectual honesty. Write for your reader, not for yourself. diff --git a/book-src/src/week-2/day-13.md b/book-src/src/week-2/day-13.md index a94175a..1e5b9d3 100644 --- a/book-src/src/week-2/day-13.md +++ b/book-src/src/week-2/day-13.md @@ -1,3 +1,558 @@ # Day 13: The Early A/B Test Readout – Resisting Pressure -_Summary and tasks as per curriculum. Add your notes and findings here._ +## Overview + +**Objective:** To analyze preliminary A/B test data while masterfully managing stakeholder expectations and preventing premature decision-making. + +**Why This Matters:** The single fastest way to lose credibility as an analyst is to endorse a decision based on noisy, statistically insignificant early data. Your role is to be the voice of statistical integrity. + +## The Pressure Cooker + +**The Scenario:** + +It's Day 7 of your 28-day A/B test. You're in a meeting with the Product Manager who championed the Journals feature. They're under pressure from their VP to show early wins. + +**PM:** "Hey, it's been a week. How's the A/B test looking? Are we winning? Can we ship this to 100%?" + +**You (internally):** *The data is noisy. We designed this for 28 days. The p-value is 0.25. The confidence interval includes zero. This is exactly the scenario where bad decisions get made.* + +**Your challenge:** Provide transparency while preventing a premature, statistically unjustified decision. + +## The Statistical Integrity Framework + +### The Three Pillars of Responsible Test Readouts + +1. **Statistical Significance:** Did we reach the threshold we agreed on? (Usually p < 0.05) +2. **Practical Significance:** Is the effect size large enough to matter for the business? +3. **Temporal Stability:** Is this result likely to hold, or is it a novelty effect? + +**All three must be satisfied** before making a ship decision. 
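+
+One way to make the three pillars concrete in your notebook is a small "readout
+gate" that refuses a ship decision unless all three hold. This is a sketch under
+assumed thresholds (the class name, field names, and the 5% practical-significance
+bar are illustrative, not an agreed standard):
+
+```python
+from dataclasses import dataclass
+
+@dataclass
+class ReadoutGate:
+    alpha: float = 0.05               # statistical-significance threshold
+    min_relative_lift: float = 0.05   # practical-significance bar (>5% relative lift)
+    planned_days: int = 28            # temporal stability: wait out the full design
+
+    def ship_decision_allowed(self, p_value, relative_lift, days_elapsed):
+        """Return (allowed, failed_pillars); all three pillars must be satisfied."""
+        checks = {
+            "statistical_significance": p_value < self.alpha,
+            "practical_significance": relative_lift >= self.min_relative_lift,
+            "temporal_stability": days_elapsed >= self.planned_days,
+        }
+        failed = [name for name, ok in checks.items() if not ok]
+        return len(failed) == 0, failed
+
+# Day 7 of the Journals test: +7% observed relative lift, p ~= 0.25
+allowed, failed = ReadoutGate().ship_decision_allowed(
+    p_value=0.25, relative_lift=0.07, days_elapsed=7
+)
+print(allowed)  # False
+print(failed)   # ['statistical_significance', 'temporal_stability']
+```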
+ +## Task 1: Run the Preliminary Analysis + +Using the first 7 days of data, calculate metrics and statistical tests. + +### Day-7 Retention as Proxy Metric + +```sql +-- Early A/B Test Analysis: Day-7 Retention +WITH test_assignments AS ( + -- Get user assignments to control/treatment + SELECT DISTINCT + user_id, + experiment_group, -- 'control' or 'treatment' + assignment_timestamp + FROM ab_test_assignments + WHERE experiment_name = 'journals_launch_test' + AND assignment_timestamp >= CURRENT_DATE - INTERVAL '7 days' +), +user_activity AS ( + -- Track if users were active on Day 7 post-assignment + SELECT + ta.user_id, + ta.experiment_group, + ta.assignment_timestamp, + MAX(CASE + WHEN e.event_timestamp >= ta.assignment_timestamp + INTERVAL '7 days' + AND e.event_timestamp < ta.assignment_timestamp + INTERVAL '8 days' + THEN 1 ELSE 0 + END) AS active_day_7 + FROM test_assignments ta + LEFT JOIN events e ON ta.user_id = e.user_id + WHERE e.event_timestamp >= ta.assignment_timestamp + AND e.event_timestamp < ta.assignment_timestamp + INTERVAL '8 days' + GROUP BY ta.user_id, ta.experiment_group, ta.assignment_timestamp +) +SELECT + experiment_group, + COUNT(*) AS total_users, + SUM(active_day_7) AS retained_users, + ROUND(100.0 * SUM(active_day_7) / COUNT(*), 3) AS retention_rate_pct +FROM user_activity +GROUP BY experiment_group +ORDER BY experiment_group; +``` + +### Calculate Statistical Significance + +```python +import numpy as np +from scipy import stats +import pandas as pd + +def calculate_ab_test_statistics(control_conversions, control_total, + treatment_conversions, treatment_total, + alpha=0.05): + """ + Calculate comprehensive A/B test statistics + + Parameters: + ----------- + control_conversions : int + Number of successes in control group + control_total : int + Total users in control group + treatment_conversions : int + Number of successes in treatment group + treatment_total : int + Total users in treatment group + alpha : float + Significance level (default 0.05) + + Returns: + -------- + dict : Statistical test results + """ + # Calculate rates + p_control = control_conversions / control_total + p_treatment = treatment_conversions / treatment_total + + # Absolute lift + absolute_lift = p_treatment - p_control + + # Relative lift + relative_lift = (p_treatment - p_control) / p_control if p_control > 0 else 0 + + # Pooled standard error for two-proportion z-test + p_pooled = (control_conversions + treatment_conversions) / (control_total + treatment_total) + se_pooled = np.sqrt(p_pooled * (1 - p_pooled) * (1/control_total + 1/treatment_total)) + + # Z-statistic + z_stat = (p_treatment - p_control) / se_pooled if se_pooled > 0 else 0 + + # P-value (two-tailed) + p_value = 2 * (1 - stats.norm.cdf(abs(z_stat))) + + # Confidence interval for the difference + se_diff = np.sqrt((p_control * (1 - p_control) / control_total) + + (p_treatment * (1 - p_treatment) / treatment_total)) + + z_critical = stats.norm.ppf(1 - alpha/2) + ci_lower = absolute_lift - z_critical * se_diff + ci_upper = absolute_lift + z_critical * se_diff + + # Relative CI + ci_lower_rel = ci_lower / p_control if p_control > 0 else 0 + ci_upper_rel = ci_upper / p_control if p_control > 0 else 0 + + # Statistical power (post-hoc) + effect_size = (p_treatment - p_control) / np.sqrt(p_pooled * (1 - p_pooled)) + n_harmonic = 2 / (1/control_total + 1/treatment_total) + power = stats.norm.cdf(abs(effect_size) * np.sqrt(n_harmonic/2) - z_critical) + + return { + 'control_rate': p_control, + 'treatment_rate': 
p_treatment,
+        'absolute_lift': absolute_lift,
+        'relative_lift': relative_lift,
+        'ci_95_lower': ci_lower,
+        'ci_95_upper': ci_upper,
+        'ci_95_lower_rel': ci_lower_rel,
+        'ci_95_upper_rel': ci_upper_rel,
+        'z_statistic': z_stat,
+        'p_value': p_value,
+        'is_significant': p_value < alpha,
+        'statistical_power': power,
+        'sample_size_control': control_total,
+        'sample_size_treatment': treatment_total
+    }
+
+# Example usage (the Day-7 numbers reported in the memo below: directionally
+# positive, but not statistically significant)
+results = calculate_ab_test_statistics(
+    control_conversions=428,    # 21.4% retention
+    control_total=2000,
+    treatment_conversions=458,  # 22.9% retention
+    treatment_total=2000
+)
+
+print(f"Treatment Retention: {results['treatment_rate']:.3%}")
+print(f"Control Retention: {results['control_rate']:.3%}")
+print(f"Relative Lift: {results['relative_lift']:.3%}")
+print(f"95% CI: [{results['ci_95_lower_rel']:.3%}, {results['ci_95_upper_rel']:.3%}]")
+print(f"P-value: {results['p_value']:.4f}")
+print(f"Statistically Significant: {results['is_significant']}")
+```
+
+### Check Guardrail Metrics
+
+```sql
+-- Guardrail Metrics Check: Ensure No Negative Side Effects
+WITH test_users AS (
+    SELECT
+        user_id,
+        experiment_group
+    FROM ab_test_assignments
+    WHERE experiment_name = 'journals_launch_test'
+),
+guardrail_metrics AS (
+    SELECT
+        tu.experiment_group,
+        -- Time on Feed (cannibalization check)
+        AVG(CASE WHEN e.event_name = 'feed_session' THEN
+            EXTRACT(EPOCH FROM (e.session_end_time - e.session_start_time))
+        END) AS avg_feed_time_seconds,
+        -- App Uninstalls (user frustration check)
+        COUNT(DISTINCT CASE WHEN e.event_name = 'app_uninstall' THEN e.user_id END) AS uninstalls,
+        COUNT(DISTINCT tu.user_id) AS total_users
+    FROM test_users tu
+    -- Keep the time filter inside the join condition: a WHERE clause on the
+    -- left-joined table would silently turn this into an inner join and drop
+    -- users with no events from the denominators
+    LEFT JOIN events e
+        ON tu.user_id = e.user_id
+        AND e.event_timestamp >= (SELECT MIN(assignment_timestamp) FROM ab_test_assignments
+                                  WHERE experiment_name = 'journals_launch_test')
+    GROUP BY tu.experiment_group
+)
+SELECT
+    experiment_group,
+    total_users,
+    ROUND(avg_feed_time_seconds, 1) AS avg_feed_time_sec,
+    uninstalls,
+    ROUND(100.0 * uninstalls / total_users, 3) AS uninstall_rate_pct
+FROM guardrail_metrics
+ORDER BY experiment_group;
+```
+
+## Task 2: Draft the Update Memo
+
+Create file `AB_Test_Week1_Update.md` that provides data while reinforcing statistical discipline.
+
+### The Responsible Early Readout Template
+
+```markdown
+# A/B Test Early Update: Journals Feature (Week 1 of 4)
+
+**Test Name:** journals_launch_test
+**Date:** Day 7 of 28
+**Analyst:** [Your Name]
+**Status:** 🟡 IN PROGRESS - NOT READY FOR DECISION
+
+---
+
+## ⚠️ CRITICAL CAVEAT: THIS IS PRELIMINARY DATA ONLY
+
+**This readout is for monitoring purposes only and should NOT be used to make
+shipping decisions. Per our experimental design, we committed to a 28-day test
+duration before making a go/no-go decision.**
+
+---
+
+## Preliminary Results (Day 7 Only)
+
+### Primary Metric: Day-7 Retention
+
+| Group | Sample Size | Retained Users | Retention Rate |
+|-------|-------------|----------------|----------------|
+| Control | 2,000 | 428 | 21.4% |
+| Treatment | 2,000 | 458 | 22.9% |
+
+**Observed Lift:** +1.5 percentage points (+7.0% relative lift)
+
+### Statistical Analysis
+
+- **95% Confidence Interval:** [-5.0%, +19.0%] (relative lift)
+- **P-value:** 0.253
+- **Statistical Significance:** ❌ **NOT SIGNIFICANT** (p > 0.05)
+
+---
+
+## What This Means (And Doesn't Mean)
+
+### ❌ What We CANNOT Conclude:
+
+1. **We cannot say the feature "works"** - The result is not statistically significant
+2. **We cannot ship based on this data** - The confidence interval includes zero (no effect)
+3. **We cannot claim a 7% lift is real** - This could easily be random noise
+
+### ✅ What We CAN Say:
+
+1. **The feature is not obviously harmful** - No significant negative signals
+2. **Early trend is in the right direction** - Numerically positive (but not proven)
+3. **The test is progressing as planned** - Sample sizes on track, no data quality issues
+
+---
+
+## Mandatory Analyst Caveats
+
+### 1. Statistical Significance
+
+**The result is NOT statistically significant.**
+
+With a p-value of 0.253, a difference at least this large would show up roughly
+**25% of the time even if the true effect were zero**. That is **5x higher** than
+our 5% threshold for decision-making.
+
+**Translation:** We have insufficient evidence to claim this is a real effect.
+
+### 2. Novelty Effect Bias
+
+**Early lift is often inflated by user curiosity.**
+
+Day-7 retention measures short-term curiosity ("What is this new thing?"), not
+long-term habit formation ("I can't live without this feature"). Historical data
+shows that:
+
+- **50% of features** with positive Day-7 signals show neutral or negative Day-28 results
+- **Novelty effects decay** after the first week as curiosity wanes
+- **Habit formation takes time** to manifest in retention metrics
+
+**Translation:** Even if this lift were significant, it might not hold over 28 days.
+
+### 3. Confidence Interval Includes Zero
+
+**The 95% CI is [-5.0%, +19.0%] relative lift.**
+
+This means we are 95% confident the true effect lies somewhere in this range.
+Critically, this range **includes negative values** (feature could hurt retention)
+and **includes zero** (feature could have no effect).
+
+**Translation:** The data is consistent with many possible realities, including
+"the feature does nothing."
+
+### 4. Insufficient Statistical Power
+
+**With only 7 days of data, our statistical power is low.**
+
+We designed this test for 80% power at the full 28-day sample (~15,000 users per
+group). With roughly 2,000 users per group so far, our power against an effect of
+the size the test was designed to detect is only on the order of **20%**, meaning
+we would most likely miss such an effect even if it exists.
+
+**Translation:** Lack of significance now doesn't mean lack of effect; we haven't
+collected enough data yet.
+
+---
+
+## Guardrail Metrics (Early Check)
+
+| Metric | Control | Treatment | Difference | Status |
+|--------|---------|-----------|------------|--------|
+| Avg Time on Feed | 847 sec | 842 sec | -0.6% | ✅ No concern |
+| Uninstall Rate | 0.23% | 0.21% | -0.02pp | ✅ No concern |
+
+**Assessment:** No early warning signs of negative side effects.
+
+---
+
+## Recommendation
+
+### DO NOT make a shipping decision based on this data.
+
+**Rationale:**
+1. Result is not statistically significant (p = 0.25)
+2. Confidence interval too wide to be actionable
+3. High risk of novelty effect bias
+4. Test was explicitly designed for 28-day duration
+
+### Continue monitoring the test through Day 28.
+
+**Next Checkpoints:**
+- **Day 14 (Informal):** Quick health check, no decision expected
+- **Day 28 (Formal):** Final readout with full statistical analysis
+
+**Decision Criteria (Unchanged):**
+- Primary metric must show p < 0.05 significance
+- Confidence interval must exclude zero
+- No significant negative movement in guardrail metrics
+- Observed effect must be practically meaningful (>5% relative lift preferred)
+
+---
+
+## If Asked: "Can We Make an Exception and Ship Early?"
+
+**No. Here's why:**
+
+1. **Credibility:** If we ship without significance, what stops us from doing it again?
+   Statistical discipline is either a standard or it's not.
+
+2. **Risk:** With p = 0.25, a lift this large would appear about one time in four even
+   if the feature did nothing at all. Would you ship on evidence that weak?
+
+3. **Precedent:** Early shipping based on promising-but-insignificant data is how
+   organizations build features that don't actually work.
+
+4. **Alternative:** If speed is critical, we can reduce the test to 14 days, but only
+   if we simultaneously reduce our power expectations and accept higher risk of Type II
+   error.
+
+---
+
+## Appendix: Technical Details
+
+**Sample Size Achieved:** 2,000 per group (target: 15,000 per group at Day 28)
+**Data Quality:** No SRM violations (Sample Ratio Mismatch), assignment balanced
+**Instrumentation:** All events firing correctly
+**Analysis Method:** Two-proportion z-test (standard for binary outcomes)
+
+**Full Analysis Notebook:** `13_ab_test_week1_analysis.ipynb`
+
+---
+
+**Next Update:** Day 14 (informal health check) or immediately if critical anomaly detected.
+```
+
+## The Art of Saying "No" to Stakeholders
+
+### Pressure Scenarios & Responses
+
+**Scenario 1: "The VP wants good news for the board meeting tomorrow"**
+
+**Bad Response:** "Okay, I guess we can say it's trending positive..."
+
+**Good Response:**
+"I understand the timing pressure. Here's what I can honestly say: 'Early signals
+are directionally positive but not yet statistically significant. Full results in
+3 weeks.' Shipping now based on p=0.25 data would set a dangerous precedent and
+risk building a feature that doesn't actually work."
+
+---
+
+**Scenario 2: "But you said there was a lift!"**
+
+**Bad Response:** "Well, technically yes, but..."
+
+**Good Response:**
+"There's an *observed* lift of 7%, but a lift that size would show up about one
+time in four even if the feature did nothing. Would you ship on that? Our standard
+is 95% confidence for good reason—it protects us from expensive mistakes."
+
+---
+
+**Scenario 3: "Can't we just ship to 10% of users as a compromise?"**
+
+**Bad Response:** "Sure, that sounds safe."
+
+**Good Response:**
+"Partial rollouts can make sense for risk mitigation, but they don't solve the
+statistical problem. We'd still be making a decision without sufficient evidence.
+Let's stick to the plan and ship with confidence at Day 28, or openly decide to
+change our decision criteria (which I'd document as increased risk acceptance)."
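+
+If stakeholders do push for a shorter test (the "reduce to 14 days" option above),
+you can at least quantify the power you would be giving up. The sketch below uses
+the standard normal approximation for a two-proportion test; the baseline rate and
+lift are this test's illustrative numbers, and the per-group sample sizes are
+assumptions for comparison, not the actual accrual plan:
+
+```python
+from scipy.stats import norm
+
+def approx_power(n_per_group, p_control, absolute_lift, alpha=0.05):
+    """Approximate power of a two-sided two-proportion z-test."""
+    p_treatment = p_control + absolute_lift
+    p_bar = (p_control + p_treatment) / 2
+    se = (2 * p_bar * (1 - p_bar) / n_per_group) ** 0.5
+    z_alpha = norm.ppf(1 - alpha / 2)
+    return float(norm.cdf(abs(absolute_lift) / se - z_alpha))
+
+# Power to detect a 1.5pp lift on a 21.4% baseline at various per-group sizes
+for n in (2_000, 7_500, 15_000):
+    print(f"n={n:>6}: power ~= {approx_power(n, 0.214, 0.015):.2f}")
+# Roughly 0.21 at 2,000 users, 0.60 at 7,500, and 0.88 at 15,000 per group
+```
+
+Cutting the test short keeps the same decision threshold but slashes the chance of
+detecting a real effect, which is exactly the Type II risk trade-off described above.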
+
+## Visualization: Showing Uncertainty Honestly
+
+```python
+import matplotlib.pyplot as plt
+import numpy as np
+from scipy import stats
+
+def visualize_early_ab_test_results(control_rate, treatment_rate, ci_lower, ci_upper):
+    """
+    Create a visualization that emphasizes uncertainty.
+
+    ci_lower / ci_upper are the 95% CI bounds for the *absolute* lift
+    (e.g. `ci_95_lower` / `ci_95_upper` from calculate_ab_test_statistics).
+    """
+    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))
+
+    # Chart 1: Point Estimate with Confidence Interval
+    groups = ['Control', 'Treatment']
+    rates = [control_rate * 100, treatment_rate * 100]
+    colors = ['#1976D2', '#388E3C']
+
+    bars = ax1.bar(groups, rates, color=colors, alpha=0.7, edgecolor='black', linewidth=2)
+
+    # Add an approximate error bar on the treatment bar, using the half-width
+    # of the lift CI as a rough uncertainty band
+    half_width_pp = (ci_upper - ci_lower) / 2 * 100
+    ax1.errorbar(1, treatment_rate * 100, yerr=half_width_pp,
+                 fmt='none', ecolor='black', capsize=10, capthick=2, linewidth=2)
+
+    # Add significance annotation
+    ax1.text(0.5, max(rates) + 2, 'NOT SIGNIFICANT\n(p = 0.25)',
+             ha='center', fontsize=12, fontweight='bold',
+             color='#D32F2F', bbox=dict(boxstyle='round', facecolor='#FFCDD2'))
+
+    ax1.set_ylabel('Retention Rate (%)', fontsize=12, fontweight='bold')
+    ax1.set_title('Day-7 Retention (Preliminary)', fontsize=13, fontweight='bold')
+    ax1.set_ylim(0, max(rates) + 5)
+
+    # Add value labels
+    for i, bar in enumerate(bars):
+        height = bar.get_height()
+        ax1.text(bar.get_x() + bar.get_width()/2., height,
+                 f'{height:.1f}%',
+                 ha='center', va='bottom', fontsize=11, fontweight='bold')
+
+    # Chart 2: Confidence Interval Visualization (in percentage points)
+    possible_lifts = np.linspace(ci_lower * 100, ci_upper * 100, 100)
+    # CI width / (2 * 1.96) * 100 is roughly the standard error of the lift in points
+    likelihood = stats.norm.pdf(possible_lifts, loc=(treatment_rate - control_rate)*100,
+                                scale=(ci_upper - ci_lower)*25)
+
+    ax2.fill_between(possible_lifts, likelihood, alpha=0.3, color='#1976D2')
+    ax2.plot(possible_lifts, likelihood, color='#1976D2', linewidth=2)
+
+    # Highlight zero
+    ax2.axvline(x=0, color='#D32F2F', linestyle='--', linewidth=2,
+                label='Zero Effect (No Difference)')
+
+    # Show observed lift
+    ax2.axvline(x=(treatment_rate - control_rate)*100, color='#388E3C',
+                linestyle='-', linewidth=2, label='Observed Lift')
+
+    ax2.set_xlabel('Possible True Lift (percentage points)', fontsize=12, fontweight='bold')
+    ax2.set_ylabel('Likelihood', fontsize=12, fontweight='bold')
+    ax2.set_title('95% Confidence Interval (Includes Zero)', fontsize=13, fontweight='bold')
+    ax2.legend(loc='upper right')
+    ax2.grid(True, alpha=0.3, axis='x')
+
+    # Annotate CI range
+    ax2.annotate(f'95% CI\n[{ci_lower*100:.1f}pp, {ci_upper*100:.1f}pp]',
+                 xy=(0, max(likelihood)*0.5), fontsize=10,
+                 bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5))
+
+    plt.tight_layout()
+    plt.savefig('ab_test_week1_uncertainty.png', dpi=300, bbox_inches='tight')
+    return fig
+```
+
+## Advanced Topic: Sequential Testing
+
+For situations where early stopping is necessary:
+
+```python
+import numpy as np
+import pandas as pd
+
+def calculate_sequential_boundary(alpha=0.05, looks=4):
+    """
+    Calculate an O'Brien-Fleming-style boundary for sequential testing
+
+    This allows for "peeking" at results while controlling Type I error
+    """
+    from scipy.stats import norm
+
+    # Information fractions (how much data at each look); assumes 4 evenly
+    # spaced weekly looks (Days 7, 14, 21, 28)
+    info_fractions = np.array([0.25, 0.50, 0.75, 1.0])
+
+    # Nominal two-sided alpha at each look under an O'Brien-Fleming-type
+    # boundary z_k = z_{alpha/2} / sqrt(t_k) (a simplified approximation,
+    # not the exact Lan-DeMets spending function)
+    spending = 2 * (1 - norm.cdf(norm.ppf(1 - alpha/2) / np.sqrt(info_fractions)))
+
+    # Critical z-values
+    z_boundaries = norm.ppf(1 - spending/2)
+
+    return pd.DataFrame({
+        'look': range(1, looks+1),
+        'day': [7, 14, 21, 28],
+        'info_fraction': info_fractions,
+        'alpha_spent': spending,
+        'z_critical': z_boundaries,
'p_critical': spending + }) + +# Usage: Only claim significance if p-value < p_critical for that look +seq_boundaries = calculate_sequential_boundary() +print(seq_boundaries) +``` + +## Deliverable Checklist + +- [ ] `13_ab_test_week1_analysis.ipynb` notebook created +- [ ] Primary metric calculated (Day-7 retention) +- [ ] Statistical significance test performed +- [ ] 95% confidence interval calculated +- [ ] Guardrail metrics checked +- [ ] `AB_Test_Week1_Update.md` memo drafted +- [ ] Caveats section completed with all 4 key points +- [ ] Visualization showing uncertainty created +- [ ] Recommendation clearly states "Do not ship" +- [ ] Prepared responses to stakeholder pressure documented + +## Key Takeaways + +1. **P-values are not suggestions:** p < 0.05 is the threshold; p = 0.25 is not "close enough" +2. **Novelty effects are real:** Early lifts often don't hold; Day 7 ≠ Day 28 +3. **Confidence intervals matter:** If CI includes zero, you haven't proven anything +4. **Statistical discipline is credibility:** Saying "no" to bad decisions builds trust +5. **Document everything:** Future you (and future stakeholders) will thank you + +--- + +**Remember:** Your job is not to give stakeholders the answer they want—it's to give them the answer the data supports. In this moment, when a PM is pressuring you for good news, your statistical integrity is being tested. The analyst who holds the line and says "We need to wait" is the analyst who earns trust and prevents expensive mistakes. Be that analyst. diff --git a/book-src/src/week-2/day-14.md b/book-src/src/week-2/day-14.md index 8a4249a..1f0ed23 100644 --- a/book-src/src/week-2/day-14.md +++ b/book-src/src/week-2/day-14.md @@ -1,3 +1,605 @@ # Day 14: Weaving the Narrative – Quant + Qual -_Summary and tasks as per curriculum. Add your notes and findings here._ +## Overview + +**Objective:** To combine quantitative metrics with qualitative user feedback to create a holistic and deeply empathetic understanding of the user experience. + +**Why This Matters:** Numbers tell you *what* users do; words tell you *how they feel*. The most powerful insights lie at the intersection of both. This skill separates a data reporter from a true product strategist. + +## The Power of Integration + +### What Quantitative Data Tells You + +- **What happened:** "60% of users dropped off at Step 3" +- **How many:** "27,000 users affected" +- **Statistical significance:** "p < 0.001, highly significant" +- **Patterns:** "iOS users convert 15% better than Android" + +### What Qualitative Data Tells You + +- **Why it happened:** "Users didn't see the icon because..." +- **How users feel:** "Frustrated," "Delighted," "Confused" +- **Unexpected insights:** "I use it for grief journaling, not what you designed for" +- **Language users use:** "Private space," "My thoughts," "Safe place" + +### The Magic of Synthesis + +When combined, quantitative + qualitative answers: +- **What's broken AND why it's broken** +- **What's working AND why users love it** +- **What to build next AND how to describe it** + +## The Scenario + +You have two data sources: + +1. **Quantitative (from Day 10):** 60% funnel drop-off between viewing feed and tapping icon +2. **Qualitative:** 500 user comments about the Journals feature from in-app feedback and app store reviews + +Your job: Find the story that connects them. 
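+
+Before starting Task 1, it can help to stage both inputs in a single cell so the
+rest of the notebook reads top-down. A minimal sketch, assuming the Day-10 funnel
+counts used in this chapter and a `feedback.csv` export with a `comment` column
+(the exact schema is an assumption):
+
+```python
+import pandas as pd
+
+# Quantitative input: the adoption funnel from Day 10 (illustrative counts)
+funnel = pd.DataFrame({
+    "step": ["view_feed", "tap_journals_icon", "create_journal_entry"],
+    "users": [45_000, 18_000, 12_000],
+})
+funnel["drop_off_pct"] = (1 - funnel["users"] / funnel["users"].shift()).mul(100).round(1)
+
+# Qualitative input: the 500 raw comments exported for this analysis
+feedback = pd.read_csv("feedback.csv")
+
+print(funnel)                       # shows the 60% drop-off at the icon-tap step
+print(len(feedback), "comments")    # sanity-check the qualitative sample size
+```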
+ +## Task 1: Load and Categorize Qualitative Data + +### Simple NLP-Based Categorization + +```python +import pandas as pd +import re +from collections import Counter + +def load_and_clean_feedback(filepath='feedback.csv'): + """ + Load user feedback and perform basic cleaning + """ + df = pd.read_csv(filepath) + + # Clean text + df['comment_clean'] = df['comment'].str.lower() + df['comment_clean'] = df['comment_clean'].str.replace(r'[^\w\s]', '', regex=True) + + return df + +def categorize_feedback(comment): + """ + Categorize a comment into predefined themes + + Returns: list of applicable categories + """ + comment = comment.lower() + categories = [] + + # Define keyword patterns for each category + bug_keywords = ['crash', 'broken', 'error', 'bug', 'doesnt work', "doesn't work", + 'not working', 'freeze', 'slow'] + feature_request_keywords = ['wish', 'would be nice', 'please add', 'need', 'want', + 'should have', 'missing', 'add feature'] + praise_keywords = ['love', 'amazing', 'great', 'awesome', 'fantastic', 'excellent', + 'thank you', 'finally', 'perfect'] + privacy_keywords = ['privacy', 'private', 'secure', 'security', 'safe', 'encryption', + 'data', 'share', 'access'] + discovery_keywords = ['find', 'found', 'discover', 'hidden', 'didnt know', "didn't know", + 'where is', 'how to', 'cant find', "can't find"] + + if any(word in comment for word in bug_keywords): + categories.append('Bug Report') + if any(word in comment for word in feature_request_keywords): + categories.append('Feature Request') + if any(word in comment for word in praise_keywords): + categories.append('Praise') + if any(word in comment for word in privacy_keywords): + categories.append('Privacy Concern') + if any(word in comment for word in discovery_keywords): + categories.append('Discovery Issue') + + # Default to "Other" if no categories matched + if not categories: + categories.append('Other') + + return categories + +def analyze_feedback_themes(df): + """ + Analyze feedback and return theme counts + """ + all_categories = [] + + for comment in df['comment_clean']: + categories = categorize_feedback(comment) + all_categories.extend(categories) + + theme_counts = Counter(all_categories) + + return pd.DataFrame({ + 'theme': list(theme_counts.keys()), + 'count': list(theme_counts.values()) + }).sort_values('count', ascending=False) + +# Usage +df_feedback = load_and_clean_feedback('feedback.csv') +theme_summary = analyze_feedback_themes(df_feedback) +print(theme_summary) +``` + +### Advanced: Sentiment Analysis + +```python +from textblob import TextBlob + +def analyze_sentiment(comment): + """ + Perform sentiment analysis on a comment + + Returns: sentiment score (-1 to 1) and classification + """ + blob = TextBlob(comment) + sentiment_score = blob.sentiment.polarity + + if sentiment_score > 0.1: + sentiment_class = 'Positive' + elif sentiment_score < -0.1: + sentiment_class = 'Negative' + else: + sentiment_class = 'Neutral' + + return sentiment_score, sentiment_class + +def add_sentiment_to_feedback(df): + """ + Add sentiment columns to feedback dataframe + """ + df['sentiment_score'], df['sentiment_class'] = zip(*df['comment'].apply(analyze_sentiment)) + return df + +# Analyze sentiment by theme +df_feedback = add_sentiment_to_feedback(df_feedback) + +sentiment_by_theme = df_feedback.groupby('primary_theme').agg({ + 'sentiment_score': 'mean', + 'comment': 'count' +}).rename(columns={'comment': 'count'}) + +print(sentiment_by_theme) +``` + +## Task 2: Quantify Themes + +Create a visualization showing the 
distribution of feedback themes. + +```python +import matplotlib.pyplot as plt +import seaborn as sns + +def visualize_feedback_themes(theme_summary): + """ + Create a bar chart of feedback themes + """ + fig, ax = plt.subplots(figsize=(12, 6)) + + # Sort by count + theme_summary = theme_summary.sort_values('count', ascending=True) + + # Define colors for each theme + color_map = { + 'Praise': '#4CAF50', + 'Feature Request': '#2196F3', + 'Bug Report': '#F44336', + 'Privacy Concern': '#FF9800', + 'Discovery Issue': '#9C27B0', + 'Other': '#9E9E9E' + } + + colors = [color_map.get(theme, '#9E9E9E') for theme in theme_summary['theme']] + + bars = ax.barh(theme_summary['theme'], theme_summary['count'], color=colors, alpha=0.8) + + # Add value labels + for i, bar in enumerate(bars): + width = bar.get_width() + ax.text(width, bar.get_y() + bar.get_height()/2, + f' {int(width)} ({int(width)/theme_summary["count"].sum()*100:.1f}%)', + ha='left', va='center', fontsize=10, fontweight='bold') + + ax.set_xlabel('Number of Comments', fontsize=12, fontweight='bold') + ax.set_title('User Feedback Themes: Journals Feature (Week 1)\nn=500 comments', + fontsize=14, fontweight='bold') + ax.grid(axis='x', alpha=0.3) + + plt.tight_layout() + plt.savefig('feedback_themes.png', dpi=300, bbox_inches='tight') + return fig + +# Create visualization +visualize_feedback_themes(theme_summary) +``` + +### Extract Representative Quotes + +```python +def extract_representative_quotes(df, theme, n=5, min_length=50): + """ + Extract the most representative quotes for a given theme + + Uses sentiment score and length as proxy for quality + """ + # Filter to theme + theme_comments = df[df['comment_clean'].apply( + lambda x: theme.lower() in ' '.join(categorize_feedback(x)).lower() + )].copy() + + # Add comment length + theme_comments['length'] = theme_comments['comment'].str.len() + + # Filter by minimum length + theme_comments = theme_comments[theme_comments['length'] >= min_length] + + # Sort by absolute sentiment (strongest opinions) + theme_comments['abs_sentiment'] = theme_comments['sentiment_score'].abs() + theme_comments = theme_comments.sort_values('abs_sentiment', ascending=False) + + return theme_comments[['comment', 'sentiment_score', 'sentiment_class']].head(n) + +# Example: Get representative "Discovery Issue" quotes +discovery_quotes = extract_representative_quotes(df_feedback, 'Discovery Issue') +for idx, row in discovery_quotes.iterrows(): + print(f"[{row['sentiment_class']}] {row['comment']}\n") +``` + +## Task 3: Find the Connection + +Look for patterns where qualitative data explains, contradicts, or adds nuance to quantitative findings. + +### The Synthesis Framework + +**Step 1: State the Quantitative Finding** +"Our funnel analysis showed a 60% drop-off between viewing the main feed and tapping the Journals icon." + +**Step 2: Ask "Why?" and Look to Qualitative** +Search feedback for keywords: "find," "discover," "didn't know," "where is" + +**Step 3: Find Supporting Evidence** +Extract quotes that explain the quantitative pattern + +**Step 4: Synthesize into Insight** +Combine both into a unified narrative + +## Task 4: Write the Synthesized Insight + +In notebook `14_qualitative_analysis.ipynb`, create the synthesis. + +### The Synthesis Template + +```markdown +# Qualitative-Quantitative Synthesis: The Discovery Problem + +## Executive Summary + +Our quantitative funnel analysis revealed a critical 60% drop-off point. 

## Task 4: Write the Synthesized Insight

In notebook `14_qualitative_analysis.ipynb`, create the synthesis.

### The Synthesis Template

```markdown
# Qualitative-Quantitative Synthesis: The Discovery Problem

## Executive Summary

Our quantitative funnel analysis revealed a critical 60% drop-off point. Our
qualitative feedback analysis explains *why*: users want the feature but
literally can't find it. This is not a value problem—it's a design problem.

---

## The Quantitative Signal

**Finding (from Day 10 Funnel Analysis):**
- 60% of users (27,000 per week) drop off between viewing the main feed and
  tapping the Journals icon
- This represents the single largest friction point in our adoption journey
- Users who DO find the icon convert at 67% (strong value signal)

**What This Told Us:**
Something is preventing the majority of interested users from taking the next step.
But the data couldn't tell us *what* or *why*.

---

## The Qualitative Context

**Finding (from Week 1 User Feedback Analysis):**

**Theme Distribution (n=500 comments):**
- Praise: 28% (140 comments)
- Discovery Issues: 22% (110 comments) ⚠️
- Feature Requests: 20% (100 comments)
- Privacy Concerns: 15% (75 comments)
- Bug Reports: 10% (50 comments)
- Other: 5% (25 comments)

**Key Insight:** "Discovery Issues" is the 2nd largest feedback category, despite
the feature being only 1 week old.

---

## The Synthesis: What the Words Reveal

### Representative Quotes (Discovery Issues Theme)

**1. The "I Finally Found It" Pattern**

> "I love this feature but it took me 3 days to even find it! The icon blends
> into the background. Please make it more visible!"
> — User A, iOS, Sentiment: Positive

> "This is exactly what I've been wanting, but I only discovered it because my
> friend told me about it. I would have never found it on my own."
> — User B, Android, Sentiment: Neutral

**What This Tells Us:** Users who find the feature love it (validates value), but
many rely on external discovery (friends, social media, luck) rather than in-app UX.

---

**2. The "I Wish I Knew Sooner" Pattern**

> "I've been using this app for 2 years and just found out about Journals today.
> Why wasn't this more prominent??"
> — User C, iOS, Sentiment: Frustrated

> "Love it now that I found it, but wish I'd known about it from day 1. Feels
> hidden."
> — User D, Android, Sentiment: Positive (but frustrated)

**What This Tells Us:** The feature's value proposition is strong (users use words
like "love," "exactly what I needed"), but the pain point is discoverability, not
desirability.

---

**3. The "Where Is It?" Pattern**

> "I saw a screenshot on Twitter but can't figure out how to access this feature.
> Is it only for premium users?"
> — User E, iOS, Sentiment: Confused

> "Spent 10 minutes looking for the 'journal' feature everyone's talking about.
> Finally found the tiny icon. Make it bigger!"
> — User F, Android, Sentiment: Negative

**What This Tells Us:** Users are actively searching for the feature (high intent)
but failing to locate it through normal app navigation.

---

## The Integrated Insight

### What Quantitative Data Alone Would Have Told Us:
"There's a 60% drop-off in the funnel at Step 3. Users aren't tapping the icon."

**Possible Interpretations:**
- Users don't want the feature (value problem) ❌
- The icon is confusing (comprehension problem) ❌
- Users can't see the icon (visibility problem) ✅

### What Qualitative Data Adds:
"Of the users who DO find the feature, 80%+ express phrases like 'I love this,'
'exactly what I needed,' or 'wish I found it sooner.'"

**This rules out the first two interpretations.** Users want the feature. The
icon makes sense once clicked. The problem is **visibility**.

---

## The Unified Story

**The Complete Narrative:**

Our quantitative funnel analysis identified a 60% drop-off between viewing the
feed and tapping the Journals icon—a massive leak affecting 27,000 users per week.

**Qualitative feedback reveals the "why" behind this number:** The feature has
strong product-market fit (28% of comments are praise), but a critical UX failure
is hiding it from users who actively want it.

**Evidence:**
- **22% of all feedback** is about discovery problems (2nd largest category)
- **80%+ of "Discovery Issue" comments** contain positive sentiment about the
  feature itself
- Users use phrases like "finally found," "wish I knew sooner," and "why is this
  hidden?"

**This isn't a feature failure—it's a design failure.** The icon's current
placement, size, and color are failing to capture user attention in the busy
feed environment.

---

## The Power of This Synthesis

### What We Now Know (That We Couldn't Know from Either Data Source Alone):

1. **The feature has strong PMF** (qualitative: "love," "exactly what I needed")
2. **The bottleneck is discoverability** (quantitative: 60% funnel drop)
3. **Users WANT to find it** (qualitative: "spent 10 minutes looking")
4. **The icon design is the problem** (synthesis of both)

### What This Means for Product Strategy:

**Priority 1:** Fix icon visibility (high confidence, high impact)
- Not a feature problem, so no roadmap change needed
- Not a value problem, so messaging is working
- Pure UX fix: make it bigger, brighter, more prominent

**Expected Impact:**
- If icon redesign improves tap-through rate from 40% to 80% (plausible based on
  benchmark data), we add 18,000 adopters per week
- Qualitative feedback suggests this won't cannibalize feed time (users see
  journaling and the feed as complementary, not competitive)

---

## Quantitative-Qualitative Cross-Validation

### Finding: Praise Sentiment Despite Low Discovery

**Quantitative:** Only 40% of users who view the feed tap the icon (low discovery rate)

**Qualitative:** 28% of comments are praise (highest category)

**Synthesis:** The small subset of users who discover the feature become passionate
advocates. This is a strong signal that solving the discovery problem will unlock
significant value.

### Finding: No Privacy Backlash (Guardrail Check)

**Quantitative:** Uninstall rate flat at 0.23% (no spike)

**Qualitative:** Privacy concerns represent only 15% of feedback (and are mostly
questions, not complaints)

**Synthesis:** Our privacy messaging and data handling are adequate. No need to
deprioritize feature work to address privacy concerns.

---

## Conclusion

This analysis demonstrates the irreplaceable value of combining quantitative and
qualitative data:

- **Quantitative** showed us WHERE the problem is (60% funnel drop at Step 3)
- **Qualitative** showed us WHY it's happening (users can't find the icon)
- **Synthesis** showed us WHAT TO DO (fix icon visibility, not feature value)

**Without qualitative:** We might have concluded users don't want the feature and
killed it.

**Without quantitative:** We might have missed the scale of the problem (27K users
affected).

**Together:** We have a clear, actionable, high-confidence recommendation that
addresses root cause, not symptoms.

---

## Recommended Action

**Design an A/B test for icon redesign** (already prioritized in Week 1 memo) with
confidence that:
1. The feature has proven value (qualitative validation)
2. The bottleneck is specific and fixable (quantitative isolation)
3. Users actively want better access (qualitative demand signal)

**This is the power of synthesis.**
```

## Advanced Technique: Topic Modeling

For larger datasets, use machine learning to discover themes:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

def perform_topic_modeling(comments, n_topics=5, n_words=10):
    """
    Use LDA to discover hidden topics in feedback
    """
    # Vectorize comments (LDA works on raw term counts, so use CountVectorizer)
    vectorizer = CountVectorizer(max_features=1000, stop_words='english')
    doc_term_matrix = vectorizer.fit_transform(comments)

    # Fit LDA model
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=42)
    lda.fit(doc_term_matrix)

    # Extract top words for each topic
    feature_names = vectorizer.get_feature_names_out()
    topics = []

    for topic_idx, topic in enumerate(lda.components_):
        top_words_idx = topic.argsort()[-n_words:][::-1]
        top_words = [feature_names[i] for i in top_words_idx]
        topics.append({
            'topic_num': topic_idx + 1,
            'top_words': top_words
        })

    return pd.DataFrame(topics)

# Usage
topics_df = perform_topic_modeling(df_feedback['comment_clean'], n_topics=5)
print(topics_df)
```

## Visualization: The Synthesis Dashboard

```python
def create_synthesis_dashboard(funnel_data, theme_data, quotes):
    """
    Create a unified dashboard showing quant + qual insights
    """
    fig = plt.figure(figsize=(16, 10))
    gs = fig.add_gridspec(3, 2, hspace=0.3, wspace=0.3)

    # Top: Funnel (quantitative)
    ax1 = fig.add_subplot(gs[0, :])
    # ... funnel chart code from Day 10 ...

    # Middle Left: Theme breakdown (qualitative)
    ax2 = fig.add_subplot(gs[1, 0])
    # ... theme chart code ...

    # Middle Right: Sentiment by theme
    ax3 = fig.add_subplot(gs[1, 1])
    # ... sentiment chart code ...

    # Bottom: Quote callouts
    ax4 = fig.add_subplot(gs[2, :])
    ax4.axis('off')

    # Add representative quotes
    quote_text = "Representative User Voices:\n\n"
    for i, quote in enumerate(quotes[:3], 1):
        quote_text += f'{i}. "{quote}"\n\n'

    ax4.text(0.05, 0.95, quote_text, transform=ax4.transAxes,
             fontsize=10, verticalalignment='top', style='italic',
             bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5))

    plt.suptitle('The Discoverability Problem: Quantitative + Qualitative Evidence',
                 fontsize=16, fontweight='bold')

    plt.savefig('synthesis_dashboard.png', dpi=300, bbox_inches='tight')
    return fig
```

## Best Practices for Qual-Quant Synthesis

### DO:
- ✅ Use qualitative to explain quantitative patterns
- ✅ Use quantitative to validate qualitative hypotheses at scale
- ✅ Include direct quotes (the user's voice matters)
- ✅ Look for contradictions (they reveal hidden complexity)
- ✅ Acknowledge limitations of both data sources

### DON'T:
- ❌ Cherry-pick quotes to fit your narrative
- ❌ Ignore quantitative evidence that contradicts qualitative
- ❌ Treat a few loud voices as representative
- ❌ Use qualitative as a replacement for statistical rigor
- ❌ Forget to quantify qualitative themes (counts matter)

## Deliverable Checklist

- [ ] `14_qualitative_analysis.ipynb` notebook created
- [ ] Feedback data loaded and cleaned
- [ ] Categorization function implemented
- [ ] Theme distribution calculated and visualized
- [ ] Representative quotes extracted for each theme
- [ ] Sentiment analysis performed (optional but recommended)
- [ ] Connection found between quantitative finding and qualitative themes
- [ ] Synthesis paragraph written with integrated narrative
- [ ] Supporting evidence (quotes + numbers) included
- [ ] Actionable insight derived from synthesis

## Key Takeaways

1. **Numbers without context are incomplete:** "60% drop-off" needs the "why" that only users can provide
2. **Quotes without scale are anecdotes:** Individual voices must be quantified to understand prevalence
3. **Look for explanations, not just confirmations:** Qualitative data is most powerful when it explains quantitative patterns
4. **Contradictions are insights:** When qual and quant disagree, dig deeper—there's a hidden variable
5. **The synthesis is the product:** Your job isn't to present two separate analyses; it's to weave them into one story

---

**Remember:** The best product analysts are translators. You translate numbers into stories and stories into numbers. You make data speak with the voice of real users, and you give user voices the weight of statistical evidence. This synthesis—this ability to move fluidly between the quantitative and qualitative—is what makes you indispensable to a product team.