Race condition in variant lookup: fetch() completes after UI queries in Release builds #74

@kwontaewan


Summary

In Release builds, our app occasionally renders the legacy MyPage instead of the new MyPage V5.
Root cause: variant("my_page_v5_enabled") is evaluated before the first fetch() finishes populating the in-memory cache.
Since variant() reads only from memory (and never re-reads from UserDefaults), it may return an empty/default value.
This issue does not occur in Debug builds (-Onone) but is reproducible in optimized Release builds (-O).


Steps to Reproduce (Simplified)

  1. Cold-install the app.
  2. Launch sequence:
    • LaunchViewModel calls experiment.fetch() (async).
    • Navigation immediately proceeds to MyPageNavigationViewController.
    • The VC calls featureFlagService.flagValue(for: .myPageV5Enabled) → internally calls experiment.variant(...).
  3. In Release, the UI thread may outrun the fetch thread, causing variant() to run before the cache is ready.

Expected: new MyPage V5 is shown.
Actual: legacy MyPage is rendered.
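
The ordering above can be reproduced deterministically outside the app with a toy model. This is only a sketch: `ToyExperimentClient`, its `fetch(serverResponse:completion:)` signature, and the variant value "on" are hypothetical stand-ins, not the SDK's API. A semaphore plays the role of the server response so the "UI" read is guaranteed to happen first.

```swift
import Foundation

/// Hypothetical stand-in for the SDK client (not the real API).
final class ToyExperimentClient: @unchecked Sendable {
    private var cache: [String: String] = [:]
    private let cacheQueue = DispatchQueue(label: "toy.cache", attributes: .concurrent)
    private let networkQueue = DispatchQueue(label: "toy.network")

    /// Simulated fetch(): the in-memory cache is filled only after
    /// `serverResponse` is signalled, i.e. when the "response" arrives.
    func fetch(serverResponse: DispatchSemaphore, completion: @escaping () -> Void) {
        networkQueue.async {
            serverResponse.wait()
            self.cacheQueue.sync(flags: .barrier) {
                self.cache["my_page_v5_enabled"] = "on"
            }
            completion()
        }
    }

    /// Simulated variant(): reads memory only, never waits for a fetch.
    func variant(_ key: String) -> String? {
        cacheQueue.sync { cache[key] }
    }
}

let client = ToyExperimentClient()
let serverResponse = DispatchSemaphore(value: 0)
let fetchDone = DispatchSemaphore(value: 0)

client.fetch(serverResponse: serverResponse) { fetchDone.signal() }
let early = client.variant("my_page_v5_enabled")  // UI queries before the response
serverResponse.signal()                            // response arrives
fetchDone.wait()
let late = client.variant("my_page_v5_enabled")   // cache is now populated

print(early ?? "nil", late ?? "nil")
```

In the real app the ordering is decided by thread scheduling rather than a semaphore, which would explain why the miss only shows up intermittently.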


Root-Cause Analysis

SDK Flow Diagram

network thread                          UI thread

fetch() ───────────► (async)
                                        variant() ───► cache miss
server response
└─► storeVariants()
    ├─ cache.put() (barrier)            ← happens after variant()
    └─ cache.store() → async UserDefaults write

Key Points

  • variant() reads only from the in-memory cache (LoadStoreCache.cache).
  • UserDefaults is only loaded once on init (load()), and never re-read.
  • store() to disk happens asynchronously via storageQueue, and there is no completion callback.
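
To illustrate the load-once behavior described in these points, here is a minimal sketch. `ToyLoadStoreCache` and `ToyDisk` are hypothetical; `ToyDisk` stands in for UserDefaults so the example has no side effects, and the real LoadStoreCache may differ in its details.

```swift
import Foundation

/// Hypothetical stand-in for UserDefaults.
final class ToyDisk: @unchecked Sendable {
    private var values: [String: String] = [:]
    private let lock = NSLock()
    func read() -> [String: String] { lock.lock(); defer { lock.unlock() }; return values }
    func write(_ new: [String: String]) { lock.lock(); defer { lock.unlock() }; values = new }
}

/// Hypothetical cache modelled on the behavior described above.
final class ToyLoadStoreCache: @unchecked Sendable {
    private var cache: [String: String]
    private let disk: ToyDisk
    private let cacheQueue = DispatchQueue(label: "toy.cache", attributes: .concurrent)
    private let storageQueue = DispatchQueue(label: "toy.storage")

    init(disk: ToyDisk) {
        self.disk = disk
        self.cache = disk.read()  // load(): the only time disk is read
    }

    /// Reads memory only.
    func get(_ key: String) -> String? { cacheQueue.sync { cache[key] } }

    /// Barrier write to memory.
    func put(_ key: String, _ value: String) {
        cacheQueue.sync(flags: .barrier) { cache[key] = value }
    }

    /// Fire-and-forget persistence: callers cannot tell when it finishes.
    func store() {
        let snapshot = cacheQueue.sync { cache }
        storageQueue.async { self.disk.write(snapshot) }
    }
}

let disk = ToyDisk()
let first = ToyLoadStoreCache(disk: disk)         // loads an empty disk
disk.write(["my_page_v5_enabled": "on"])          // disk changes after init
let staleRead = first.get("my_page_v5_enabled")   // memory was never refreshed
let second = ToyLoadStoreCache(disk: disk)        // a new instance loads at init
let freshRead = second.get("my_page_v5_enabled")
```

Under these assumptions, a value that reaches disk after init is visible only to a cache created later, which matches what we observe on the second launch.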

Fix Applied in Our App

We moved the initial fetch() call earlier in the launch pipeline, before any view controller can request a variant.
We also added a bounded blocking wait (short timeout) so that the variant data is ready before proceeding.
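
For other integrators, the bounded wait could look roughly like the sketch below. It is not our production code: `waitForInitialFetch` and the `startFetch` parameter are hypothetical names, and `startFetch` stands in for whatever wraps experiment.fetch() and reports completion in your app.

```swift
import Foundation

/// Hypothetical launch gate: starts a fetch and waits up to `timeout` seconds.
/// Returns true if the fetch reported completion in time, false on timeout.
func waitForInitialFetch(timeout: TimeInterval,
                         startFetch: (@escaping () -> Void) -> Void) -> Bool {
    let done = DispatchSemaphore(value: 0)
    startFetch { done.signal() }
    return done.wait(timeout: .now() + timeout) == .success
}

let background = DispatchQueue(label: "toy.network")

// A fetch that completes promptly: the gate opens.
let fastReady = waitForInitialFetch(timeout: 2.0) { completion in
    background.async { completion() }
}

// A fetch that never completes: the gate gives up after the timeout,
// and the caller falls back to the default variant.
let slowReady = waitForInitialFetch(timeout: 0.05) { _ in }
```

Blocking the main thread during launch has its own risks, so the timeout should stay short, and the fallback path (legacy MyPage in our case) still has to work.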


SDK-Level Suggestions

  1. Early-Fetch Helper
    Provide a utility (e.g., Experiment.preloadVariants(...)) intended for application(_:didFinishLaunchingWithOptions:).

  2. Ready Callback
    Expose a callback/future that fires only when both in-memory and disk persistence are completed.

  3. Blocking Read Option
    Offer variant(key, waitIfMissing: Bool = false) that suspends until fetch() completes.
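
As a rough illustration of suggestions 2 and 3, the sketch below gates reads on the first fetch with a DispatchGroup. Everything here is hypothetical (`ToyGatedClient`, `completeFetch(with:)`, `onReady`, and the `timeout` parameter are not SDK APIs), and it only tracks in-memory readiness, not disk persistence.

```swift
import Foundation

/// Hypothetical wrapper; none of these names exist in the SDK.
final class ToyGatedClient: @unchecked Sendable {
    private var cache: [String: String] = [:]
    private var didFetch = false
    private let lock = NSLock()
    private let firstFetch = DispatchGroup()

    init() { firstFetch.enter() }  // balanced by leave() on the first fetch

    /// Called when a fetch response has been written to memory.
    func completeFetch(with variants: [String: String]) {
        lock.lock()
        cache = variants
        let isFirst = !didFetch
        didFetch = true
        lock.unlock()
        if isFirst { firstFetch.leave() }
    }

    /// Ready callback: runs once the first fetch has completed.
    func onReady(queue: DispatchQueue, _ body: @escaping () -> Void) {
        firstFetch.notify(queue: queue) { body() }
    }

    /// Blocking read: optionally waits (bounded) for the first fetch.
    func variant(_ key: String, waitIfMissing: Bool = false,
                 timeout: TimeInterval = 1.0) -> String? {
        if waitIfMissing { _ = firstFetch.wait(timeout: .now() + timeout) }
        lock.lock(); defer { lock.unlock() }
        return cache[key]
    }
}

let gated = ToyGatedClient()
let readyFired = DispatchSemaphore(value: 0)
gated.onReady(queue: DispatchQueue.global()) { readyFired.signal() }

let immediate = gated.variant("my_page_v5_enabled")  // no wait: cache miss

// Simulated fetch response arriving 0.1 s later.
DispatchQueue.global().asyncAfter(deadline: .now() + 0.1) {
    gated.completeFetch(with: ["my_page_v5_enabled": "on"])
}
let waited = gated.variant("my_page_v5_enabled", waitIfMissing: true, timeout: 2.0)
let readyResult = readyFired.wait(timeout: .now() + 2.0)
```

A bounded wait seems safer than an unbounded one here, since a failed or slow fetch would otherwise stall the caller indefinitely.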


Why This Matters

  • Release builds are heavily optimized; UI flows may outpace async network/storage tasks, leading to inconsistent feature flag reads.
  • Documenting the expected lifecycle more clearly, or offering synchronous utilities, would help all integrators avoid this race condition.

Thanks for reviewing! Let us know if the SDK team has preferred guidance.
