Skip to content

[Demo] Potential Race Condition in StreamMessageLabViewModel Diagnostics #64

@leogdion

Description

@leogdion

Component

Demo Applications (Stream)

Location

Examples/Sundial/Sources/SundialDemoStream/ViewModels/StreamMessageLabViewModel.swift:163-189

Problem

The diagnostic logging code contains a potential race condition with multiple sequential await calls to the connectivityObserver actor, and includes an unexplained 500ms delay.

Code in Question

// StreamMessageLabViewModel.swift:163-172
Task { @MainActor in
  try? await Task.sleep(for: .milliseconds(500))
  let reachable = await connectivityObserver.isReachable()
  let activationState = await connectivityObserver.getCurrentActivationState()
  let pairedAppInstalled = await connectivityObserver.isPairedAppInstalled()
  #if os(iOS)
    let paired = await connectivityObserver.isPaired()
  #else
    let paired = true
  #endif
  // ... logging follows
}

Issues Identified

1. Arbitrary 500ms Delay

Problem:

  • try? await Task.sleep(for: .milliseconds(500)) appears arbitrary
  • No code comment explaining why 500ms specifically
  • Comment says "INITIAL STATE (500ms after activation)" but doesn't explain the rationale

Questions:

  • Why 500ms? Is this based on WatchConnectivity framework behavior?
  • What happens if activation takes longer than 500ms?
  • Why not use a proper activation state callback instead?

2. Silent Error Swallowing

Problem:

  • try? silently discards any error from Task.sleep
  • While unlikely, this could mask unexpected errors

Recommendation:

// Be explicit about intentionally ignoring errors
do {
  try await Task.sleep(for: .milliseconds(500))
} catch {
  // Intentionally ignoring sleep cancellation
}

3. Potential Race Condition

Problem:
Multiple sequential await calls to the actor may retrieve inconsistent state if changes occur between calls:

let reachable = await connectivityObserver.isReachable()          // State snapshot 1
let activationState = await connectivityObserver.getCurrentActivationState()  // State snapshot 2
let pairedAppInstalled = await connectivityObserver.isPairedAppInstalled()   // State snapshot 3

If connectivity state changes between these calls, the logged state may never have actually existed as a cohesive state.

Example Scenario:

  1. First await: isReachable = true
  2. Watch goes out of range
  3. Second await: activationState = .activated
  4. Third await: isReachable = false (now inconsistent with step 1)

4. Logged State May Not Match Actual State

The diagnostic logs claim to show state at a specific point in time, but actually show state across multiple async operations.

Recommended Solutions

Option 1: Atomic State Snapshot (Recommended)

Add a method to ConnectivityObserver that returns all state atomically:

// In ConnectivityObserver actor
func getDiagnosticSnapshot() async -> DiagnosticState {
  DiagnosticState(
    isReachable: session.isReachable,
    activationState: session.activationState,
    isPairedAppInstalled: session.isPairedAppInstalled,
    isPaired: session.isPaired  // iOS only
  )
}

Then use:

Task { @MainActor in
  try await Task.sleep(for: .milliseconds(500))
  let snapshot = await connectivityObserver.getDiagnosticSnapshot()
  // Log snapshot (guaranteed consistent state)
}

Option 2: Document the Race Condition

If the race condition is acceptable for diagnostic purposes, add a comment:

// Note: These await calls may return inconsistent state if connectivity
// changes between calls. This is acceptable for diagnostic logging purposes.
// For production code, use getDiagnosticSnapshot() to get atomic state.

Option 3: Explain the 500ms Delay

Add a comprehensive comment explaining the delay:

// Wait 500ms after activation to allow WatchConnectivity framework to settle.
// The activation state may transition through multiple states quickly:
// .inactive -> .notActivated -> .activated
// This delay ensures we log the settled state rather than a transient state.
//
// Note: This is for diagnostic purposes only. Production code should use
// activation state callbacks rather than arbitrary delays.
try await Task.sleep(for: .milliseconds(500))

Testing Recommendations

To verify the race condition impact:

  1. Add instrumentation:

    let start = Date()
    let reachable = await connectivityObserver.isReachable()
    let t1 = Date().timeIntervalSince(start)
    let activationState = await connectivityObserver.getCurrentActivationState()
    let t2 = Date().timeIntervalSince(start)
    // ... log timestamps to see if they span significant time
  2. Simulate state changes:

    • Put device to sleep during the 500ms window
    • Toggle airplane mode during diagnostic capture
    • Move watch in/out of range during capture
  3. Compare logged vs actual state:

    • Verify logs match observed behavior
    • Check if race condition causes misleading diagnostics

Priority

Medium - This affects diagnostic logging quality but not core functionality. However, it may lead to confusing debug output if the race condition manifests.

Related Issues

Files Affected

  • Examples/Sundial/Sources/SundialDemoStream/ViewModels/StreamMessageLabViewModel.swift:163-189

Additional Context

The WATCHCONNECTIVITY_DIAGNOSTICS.md document added in PR #59 explains some of these behaviors, but doesn't address the potential race condition in the diagnostic capture code itself.

Metadata

Metadata

Assignees

No one assigned

    Labels

    component:demoDemo applications and examplestype:bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions