Desktop: use dev Firebase config for dev builds#5537

Open
beastoin wants to merge 40 commits into collab/5396-integration from collab/5396-ren-focus
beastoin (Collaborator) commented Mar 10, 2026

Summary

  • Swap GoogleService-Info-Dev.plist from prod (based-hardware) to dev (based-hardware-dev) Firebase project
  • Read API_KEY from the active plist at runtime instead of hardcoding the prod key in AuthService.swift
  • Fix dev.sh and reset-and-run.sh to also copy the dev plist (they were copying prod)

Changes

| File | Change |
| --- | --- |
| GoogleService-Info-Dev.plist | API_KEY, PROJECT_ID, STORAGE_BUCKET, GCM_SENDER_ID, GOOGLE_APP_ID → dev values |
| AuthService.swift | Read API_KEY from bundle GoogleService-Info.plist at runtime (fallback: prod key) |
| dev.sh | Copy dev plist instead of prod (matches run.sh logic) |
| reset-and-run.sh | Copy dev plist instead of prod (matches run.sh logic) |

How it works

  • run.sh, dev.sh, reset-and-run.sh (dev builds) copy GoogleService-Info-Dev.plist as GoogleService-Info.plist into the app bundle
  • build.sh (prod builds) copies GoogleService-Info.plist (prod) into the bundle
  • AuthService reads whichever plist is in the bundle: dev key for dev builds, prod key for prod builds
  • Zero impact on prod: GoogleService-Info.plist (prod) is unchanged
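
The runtime lookup described above can be sketched as follows. This is a minimal Python illustration of the logic, not the Swift code in AuthService.swift; the function name `read_api_key` and the placeholder fallback value are assumptions made for the example.

```python
import plistlib
from pathlib import Path

# Placeholder only: the real prod key is intentionally not reproduced here.
PROD_FALLBACK_KEY = "PROD_KEY_PLACEHOLDER"


def read_api_key(plist_path: Path, fallback: str = PROD_FALLBACK_KEY) -> str:
    """Return API_KEY from the plist at plist_path, or fallback if unavailable."""
    try:
        with plist_path.open("rb") as f:
            info = plistlib.load(f)
    except Exception:
        # Missing or unreadable file, or a malformed plist: use the fallback.
        return fallback
    if not isinstance(info, dict):
        return fallback
    key = info.get("API_KEY")
    # Treat a missing, empty, or non-string value as "no key".
    return key if isinstance(key, str) and key else fallback
```

Because the build scripts decide which plist is copied into the bundle, the lookup itself needs no dev/prod switch: it returns whatever key the bundled file carries.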

CODEx review findings addressed

Test plan

  • ./run.sh builds Omi Dev with based-hardware-dev Firebase
  • ./dev.sh builds Omi Dev with based-hardware-dev Firebase
  • Auth flow works against dev Firebase
  • Prod plist unchanged, build.sh still uses prod Firebase

Fixes #5536

by AI for @beastoin

beastoin and others added 30 commits March 7, 2026 05:13
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…5396)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
WebSocket client that connects to /v4/listen with Bearer auth and
sends screen_frame JSON messages. Routes focus_result responses back
to callers via async continuations with frame_id correlation.
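
As a rough illustration of the frame_id correlation this commit describes, the sketch below uses Python `asyncio` futures in place of Swift continuations. The class `FrameRouter`, its method names, and the `"focused"` status value are hypothetical; only the `frame_id` field and the `focus_result` message name come from the PR.

```python
import asyncio
import json
import uuid


class FrameRouter:
    """Routes responses to the caller that sent the matching frame_id."""

    def __init__(self) -> None:
        self._pending: dict[str, asyncio.Future] = {}

    def register(self) -> tuple[str, asyncio.Future]:
        # Allocate a fresh frame_id and a future the caller can await.
        frame_id = str(uuid.uuid4())
        fut = asyncio.get_running_loop().create_future()
        self._pending[frame_id] = fut
        return frame_id, fut

    def handle_message(self, raw: str) -> bool:
        # Look up the waiting caller by frame_id; ignore unknown ids.
        msg = json.loads(raw)
        fut = self._pending.pop(msg.get("frame_id"), None)
        if fut is None or fut.done():
            return False
        fut.set_result(msg)
        return True


async def demo() -> dict:
    router = FrameRouter()
    frame_id, fut = router.register()
    # Simulate the server reply arriving on the socket.
    reply = {"type": "focus_result", "frame_id": frame_id, "status": "focused"}
    router.handle_message(json.dumps(reply))
    return await fut
```

Removing the entry from the pending table before resuming the caller means a duplicate or late reply for the same frame_id is dropped rather than resuming twice.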

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
#5396)

Replace direct Gemini API calls with backend WebSocket screen_frame messages.
Context building (goals, tasks, memories, AI profile) moves server-side.
Client becomes thin: encode JPEG→base64, send screen_frame, receive focus_result.
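
A thin client of this kind might build its outgoing message roughly as below. The field names `frame_id`, `image_b64`, and `analyze` follow the sequence diagram in the review; the `"type"` discriminator key and the helper name `build_screen_frame` are assumptions, since the PR does not show the wire format in full.

```python
import base64
import json
import uuid


def build_screen_frame(jpeg_bytes: bytes, analyze: list[str]) -> str:
    """Serialize one screenshot as a screen_frame JSON message."""
    return json.dumps({
        "type": "screen_frame",  # assumed discriminator key
        "frame_id": str(uuid.uuid4()),  # used to match the reply
        "image_b64": base64.b64encode(jpeg_bytes).decode("ascii"),
        "analyze": analyze,  # e.g. ["focus"]
    })
```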

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…#5396)

Start WS connection when monitoring starts, disconnect on stop.
Pass service to FocusAssistant (shared for future assistant types).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…5396)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Vision handlers: analyzeFocus, extractTasks, extractMemories, generateAdvice
(send screen_frame with analyze type, receive typed result via frame_id)

Text handlers: generateLiveNote, requestProfile, rerankTasks, deduplicateTasks
(send typed JSON message, receive result via single-slot continuation)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace GeminiClient tool-calling loop with backendService.extractTasks().
Remove extractTaskSingleStage, refreshContext, vector/keyword search,
validateTaskTitle — all LLM logic now server-side. -550 lines.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace GeminiClient.sendRequest with backendService.extractMemories().
Remove prompt/schema building — all LLM logic now server-side.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace 2-phase Gemini tool-calling loop (execute_sql + vision) with
backendService.generateAdvice(). Remove compressForGemini, getUserLanguage,
buildActivitySummary, buildPhase1/2Tools — all LLM logic server-side. -560 lines.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace GeminiClient with backendService.deduplicateTasks(). Remove
prompt/schema building, local dedup logic — server handles everything.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace GeminiClient with backendService.rerankTasks(). Remove prompt/
schema building, context fetching — server handles reranking.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
beastoin and others added 6 commits March 8, 2026 10:36
Replace 2-stage Gemini profile generation with backendService.requestProfile().
Remove fetchDataSources, buildPrompt, buildConsolidationPrompt — server
fetches user data from Firestore and generates profile server-side.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ts (#5396)

Pass shared BackendProactiveService to all 4 assistants and 3 text-only
services. Remove do/catch since inits no longer throw. Update
AdviceTestRunnerWindow fallback creation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace direct GeminiClient usage with BackendProactiveService.
Uses configure(backendService:) singleton pattern matching other
text-based services. Prompt logic moves server-side.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add configure(backendService:) call for LiveNotesMonitor alongside
other singleton text-based services.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update GoogleService-Info-Dev.plist with dev Firebase values:
API_KEY, PROJECT_ID, STORAGE_BUCKET, GCM_SENDER_ID, GOOGLE_APP_ID.

Fixes #5536

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Dev builds load GoogleService-Info-Dev.plist (via run.sh), prod builds
load GoogleService-Info.plist. AuthService now reads API_KEY from
whichever plist is in the bundle, with prod key as fallback.

Fixes #5536

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
greptile-apps bot (Contributor) commented Mar 10, 2026

Greptile Summary

This PR bundles two distinct sets of changes: the headline fix (swapping GoogleService-Info-Dev.plist to the correct based-hardware-dev Firebase project and making AuthService read the API key from the active plist at runtime), plus a large Phase-2 feature that replaces direct on-device Gemini calls in all four proactive assistants with a new BackendProactiveService WebSocket client that sends screenshots to the Python backend for server-side LLM processing.

Key changes:

  • GoogleService-Info-Dev.plist now correctly points to the based-hardware-dev Firebase project (API key, project ID, storage bucket, sender ID, app ID).
  • AuthService.swift reads the Firebase API key from the bundle's GoogleService-Info.plist at runtime, with the prod key as a fallback — zero impact on prod builds.
  • Seven new Pydantic event models added to backend/models/message_event.py for the desktop proactive AI event protocol.
  • New backend/utils/desktop/ modules implement server-side focus analysis, task extraction, memory extraction, advice generation, live note generation, profile generation, and task ops (rerank/dedup).
  • BackendProactiveService.swift is a new WebSocket client that routes typed responses back to callers via CheckedContinuation keyed on frame_id. Timeout tasks spawned by sendAndTimeout/sendAndTimeoutSingle are never cancelled after a response is received, causing a steady accumulation of sleeping tasks.
  • FocusAssistant.analyzeScreenshot is missing autoreleasepool around the base64 encoding of screenshot data, inconsistent with the other two assistants.
  • In AdviceAssistant, the adviceDict["has_advice"] key lookup in extractAdvice and testAnalyze is always nil — the backend event only sends the nested advice content dict (not the outer response dict that contains has_advice). The fallback !adviceDict.isEmpty is always the effective logic, which works correctly but the dead check is misleading.

Confidence Score: 3/5

  • Safe to merge for the Firebase config fix; the Phase-2 backend migration is functionally correct but has minor resource and dead-code issues worth addressing.
  • The headline Firebase change is clean and low-risk. The larger Phase-2 migration is well-structured, with tests and a clear protocol. The main concerns are: timeout tasks that accumulate indefinitely in BackendProactiveService, a missing autoreleasepool in FocusAssistant for large screenshot data, and dead-code has_advice key checks in AdviceAssistant that, while not causing incorrect behavior, indicate the backend contract is not fully understood client-side. None of these are blocking bugs, but they should be addressed before shipping to a wider audience.
  • BackendProactiveService.swift (timeout task accumulation) and AdviceAssistant.swift (dead has_advice check) need the most attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| desktop/Desktop/Sources/AuthService.swift | Replaces the hardcoded prod Firebase API key with a runtime lookup from the active GoogleService-Info.plist bundle; falls back to the prod key if the plist is missing. Clean, safe change with zero prod impact. |
| desktop/Desktop/Sources/GoogleService-Info-Dev.plist | Swaps all Firebase project identifiers (API_KEY, PROJECT_ID, STORAGE_BUCKET, GCM_SENDER_ID, GOOGLE_APP_ID) from the prod based-hardware project to the dev based-hardware-dev project. This is the core change the PR title describes. |
| desktop/Desktop/Sources/ProactiveAssistants/Core/BackendProactiveService.swift | New WebSocket client that replaces direct Gemini calls with server-side processing. Timeout tasks (sendAndTimeout / sendAndTimeoutSingle) are never cancelled after a successful response, leading to a steady accumulation of sleeping tasks over time. |
| desktop/Desktop/Sources/ProactiveAssistants/Assistants/Focus/FocusAssistant.swift | Migrated from direct GeminiClient to BackendProactiveService. The analyzeScreenshot method is missing autoreleasepool around jpegData.base64EncodedString(), unlike the parallel implementations in MemoryAssistant and AdviceAssistant. |
| desktop/Desktop/Sources/ProactiveAssistants/Assistants/Advice/AdviceAssistant.swift | Migrated from a complex two-phase GeminiClient agentic loop to a thin backend client. The "has_advice" key lookup in extractAdvice and testAnalyze is dead code: the backend sends the nested advice dict without this key, so the fallback !adviceDict.isEmpty is always the effective logic. |
| backend/routers/transcribe.py | Adds WebSocket message handlers for screen_frame, live_notes_text, profile_request, task_rerank, and task_dedup desktop message types, fanning out to new desktop utility modules. Logic is clean; uid/session_id are correctly captured by closure for all spawned handlers. |
| backend/models/message_event.py | Adds seven new Pydantic event models for desktop proactive AI responses (FocusResultEvent, TasksExtractedEvent, MemoriesExtractedEvent, AdviceExtractedEvent, LiveNoteEvent, ProfileUpdatedEvent, RerankCompleteEvent, DedupCompleteEvent). All follow existing patterns correctly. |
| backend/utils/desktop/advice.py | New module implementing vision-LLM-based advice generation. Uses llm_gemini_flash with structured output (AdviceResult), fetches user goals/tasks for context, and returns a typed dict. Clean implementation. |
| backend/utils/desktop/task_ops.py | New module providing task reranking and deduplication via LLM. Both operations gracefully fall back to empty results on DB errors and use llm_mini for cost efficiency. IDs are typed as str throughout, matching the Swift client's [String] expectation. |
| desktop/Desktop/Sources/ProactiveAssistants/Assistants/MemoryExtraction/MemoryAssistant.swift | Migrated from GeminiClient to BackendProactiveService. Correctly uses autoreleasepool for base64 encoding. Backend response dict parsing is thorough with fallbacks for both Double and Int confidence values. |

Sequence Diagram

sequenceDiagram
    participant Desktop as Desktop App (Swift)
    participant WS as WebSocket /v4/listen
    participant Transcribe as transcribe.py
    participant LLM as Gemini Flash (LLM)

    Note over Desktop,WS: Auth flow uses key from active GoogleService-Info.plist

    Desktop->>WS: screen_frame {frame_id, image_b64, analyze: ["focus","tasks","memories","advice"]}
    WS->>Transcribe: message handler

    par Focus analysis
        Transcribe->>LLM: analyze_focus(uid, image_b64, app_name)
        LLM-->>Transcribe: FocusResult
        Transcribe-->>Desktop: focus_result {frame_id, status, app_or_site, description, message}
    and Task extraction
        Transcribe->>LLM: extract_tasks(uid, image_b64, app_name)
        LLM-->>Transcribe: TaskExtractionResult
        Transcribe-->>Desktop: tasks_extracted {frame_id, tasks[]}
    and Memory extraction
        Transcribe->>LLM: extract_memories(uid, image_b64, app_name)
        LLM-->>Transcribe: MemoryExtractionResult
        Transcribe-->>Desktop: memories_extracted {frame_id, memories[]}
    and Advice generation
        Transcribe->>LLM: generate_advice(uid, image_b64, app_name)
        LLM-->>Transcribe: AdviceResult
        Transcribe-->>Desktop: advice_extracted {frame_id, advice}
    end

    Desktop->>WS: live_notes_text {text, session_context}
    Transcribe->>LLM: generate_live_note(text, session_context)
    LLM-->>Transcribe: LiveNote
    Transcribe-->>Desktop: live_note {text}

    Desktop->>WS: task_rerank
    Transcribe->>LLM: rerank_tasks(uid)
    LLM-->>Transcribe: RerankResult
    Transcribe-->>Desktop: rerank_complete {updated_tasks[]}

    Desktop->>WS: task_dedup
    Transcribe->>LLM: dedup_tasks(uid)
    LLM-->>Transcribe: DedupResult
    Transcribe-->>Desktop: dedup_complete {deleted_ids[], reason}

Last reviewed commit: 7796471

],
required: ["status", "app_or_site", "description"]
private func analyzeScreenshot(jpegData: Data, appName: String, windowTitle: String?) async throws -> ScreenAnalysis? {
let base64 = jpegData.base64EncodedString()

Missing autoreleasepool for screenshot base64 encoding

base64EncodedString() on large JPEG data creates a temporary String (33% larger than the input) that won't be freed until the next autorelease pool drain. The parallel implementations in MemoryAssistant.extractMemories and AdviceAssistant.extractAdvice both wrap this call in autoreleasepool { } — this call should match for consistency and to avoid holding onto a large temporary string longer than necessary.

Suggested change (first line is the current code, second line is the replacement):
let base64 = jpegData.base64EncodedString()
let base64 = autoreleasepool { jpegData.base64EncodedString() }
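
The "33% larger" figure follows from Base64 mapping every 3 input bytes to 4 output characters. A short Python check of that arithmetic is below; the 300,000-byte buffer is an arbitrary stand-in for a screenshot, not a size taken from the PR.

```python
import base64


def base64_length(n_bytes: int) -> int:
    """Length of standard padded Base64 output for n_bytes of input."""
    return 4 * ((n_bytes + 2) // 3)


raw = bytes(300_000)  # stand-in for a JPEG; the size is an assumption
encoded = base64.b64encode(raw)
overhead = len(encoded) / len(raw) - 1  # fraction of extra bytes vs. the input
```

For this input, `len(encoded)` is 400,000, so the temporary string costs about one third more than the JPEG data it was built from.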

Rule Used: Memory management - free large objects immediately... (source)

Comment on lines +240 to +246
Task { [weak self] in
try? await Task.sleep(nanoseconds: UInt64(timeout * 1_000_000_000))
self?.requestLock.lock()
let cont = remove(frameId)
self?.requestLock.unlock()
cont?.resume(throwing: ServiceError.timeout)
}

Timeout tasks are never cancelled, accumulating over time

Every call to sendAndTimeout (and sendAndTimeoutSingle) spawns a Task that unconditionally sleeps for the full timeout duration (30 s for vision, 60 s for text) even after the continuation has already been resumed by a successful response. With screen frames arriving every 10–30 s, there will be a steady accumulation of sleeping tasks.

The fix is to return the Task handle and cancel it when the continuation is resolved. For example:

// In sendAndTimeout, return the task handle
let timeoutTask = Task { [weak self] in
    try? await Task.sleep(nanoseconds: UInt64(timeout * 1_000_000_000))
    guard !Task.isCancelled else { return }
    self?.requestLock.lock()
    let cont = remove(frameId)
    self?.requestLock.unlock()
    cont?.resume(throwing: ServiceError.timeout)
}
// Then cancel timeoutTask when the response is received in handleFocusResult, etc.

The same pattern applies to sendAndTimeoutSingle at lines 261–267.
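
The cancel-on-response idea can also be shown outside Swift. The sketch below is a Python `asyncio` analogue, not the service's code: `PendingRequests` and its methods are hypothetical names, and a `call_later` timer handle stands in for the Swift `Task`.

```python
import asyncio


class PendingRequests:
    """Tracks in-flight requests, each paired with a cancellable timeout."""

    def __init__(self) -> None:
        self._pending = {}  # frame_id -> (future, timer handle)

    def register(self, frame_id: str, timeout: float) -> asyncio.Future:
        loop = asyncio.get_running_loop()
        fut = loop.create_future()
        handle = loop.call_later(timeout, self._expire, frame_id)
        self._pending[frame_id] = (fut, handle)
        return fut

    def _expire(self, frame_id: str) -> None:
        # Timer fired first: fail the waiting caller.
        entry = self._pending.pop(frame_id, None)
        if entry and not entry[0].done():
            entry[0].set_exception(TimeoutError(f"no response for {frame_id}"))

    def resolve(self, frame_id: str, result) -> bool:
        entry = self._pending.pop(frame_id, None)
        if entry is None:
            return False
        fut, handle = entry
        handle.cancel()  # key step: the timer does not outlive the request
        if not fut.done():
            fut.set_result(result)
        return True

    def active_timers(self) -> int:
        return len(self._pending)
```

Storing the timer next to the pending request keeps both lifetimes tied together, so whichever path wins (response or timeout) removes the other.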

// Fall back to transcription language setting
let fallback = await MainActor.run { AssistantSettings.shared.transcriptionLanguage }
return fallback.isEmpty || fallback == "en" ? nil : fallback
let hasAdvice = adviceDict["has_advice"] as? Bool ?? !adviceDict.isEmpty

"has_advice" key lookup is always nil — dead code

The Python backend sends the nested advice content dict in the advice_extracted event:

# _handle_advice in transcribe.py
_send_message_event(AdviceExtractedEvent(
    frame_id=fid, advice=result.get('advice'),  # ← this is {"content":…, "category":…, "confidence":…}
))

So adviceDict has keys "content", "category", "confidence" — never "has_advice". The lookup adviceDict["has_advice"] as? Bool is therefore always nil, and the actual logic is always the fallback !adviceDict.isEmpty.

The same dead check appears in testAnalyze at line 416.

Consider replacing with:

Suggested change (first line is the current code, second line is the replacement):
let hasAdvice = adviceDict["has_advice"] as? Bool ?? !adviceDict.isEmpty
let hasAdvice = !adviceDict.isEmpty
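
To make the equivalence concrete, here is a small Python model of the two expressions. The payload values are invented for illustration; only the key names `content`, `category`, and `confidence` come from the review comment.

```python
def has_advice_original(advice: dict) -> bool:
    """Mirrors: adviceDict["has_advice"] as? Bool ?? !adviceDict.isEmpty"""
    value = advice.get("has_advice")
    return value if isinstance(value, bool) else bool(advice)


def has_advice_simplified(advice: dict) -> bool:
    """Mirrors: !adviceDict.isEmpty"""
    return bool(advice)


# Hypothetical payload with the keys the backend is said to send.
payload = {"content": "example", "category": "example", "confidence": 0.9}
```

For any dict without a `has_advice` key, both functions return the same value, which is why the simplified form is behavior-preserving under the backend contract described above.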

beastoin and others added 4 commits March 10, 2026 07:21
dev.sh builds Omi Dev (com.omi.desktop-dev) but was copying the prod
GoogleService-Info.plist. Now uses the same dev plist logic as run.sh.

Fixes #5536

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
reset-and-run.sh builds Omi Dev (com.omi.desktop-dev) but was copying
the prod GoogleService-Info.plist. Now uses the same dev plist logic
as run.sh.

Fixes #5536

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CODEx review: dev builds should not silently use prod credentials.
Now logs a FATAL warning if GoogleService-Info.plist is missing or
has no API_KEY in a dev build (bundle ID ending in -dev).

Fixes #5536

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ck to prod

CODEx review round 2: logging is not fail-fast. Dev builds now crash
with fatalError if GoogleService-Info.plist has no API_KEY, preventing
silent use of prod credentials. Prod builds still fall back safely.

Fixes #5536

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@beastoin (Collaborator, Author)

Independent Verification — PR #5537

Verifier: noa (independent)
Branch: verify/noa-combined-5374-5395-5413-5537 (e3cab73)
SHA verified: 7796471
⚠️ HEAD moved: Remote is now 6d8b57e (4 new commits since verification). This verdict covers 7796471 only.

Scope

Dev Firebase config: remove hardcoded prod Firebase from dev builds. AuthService reads API_KEY from bundle plist at runtime with prod fallback. run.sh copies GoogleService-Info-Dev.plist for dev builds.

Results

| Check | Result |
| --- | --- |
| Backend tests | 905 pass, 0 regressions vs main |
| Swift build | PASS (30.58s) |
| Code review | PASS: runtime plist read with prod fallback |
| Dev plist values | Verified: PROJECT_ID=based-hardware-dev, correct API_KEY/GCM_SENDER_ID/GOOGLE_APP_ID |
| Auth E2E (prod plist) | PASS: prod plist works with api.omi.me |
| test.sh merge | Resolved: kept HEAD entries (no new test entries from this PR) |

New Commits Not Covered

4 commits after 7796471:

  • 9e8c3a0b4 Fix dev.sh to copy dev Firebase plist
  • e3cfbf160 Fix reset-and-run.sh to copy dev Firebase plist
  • 00ad7c8c8 Log fatal warning when dev build falls back to prod key
  • 6d8b57e8e Crash dev builds when Firebase plist is missing (behavior change from soft fallback)

The crash-on-missing-plist commit (6d8b57e) changes the fallback behavior from what I verified. Re-verification recommended for the new HEAD.

Verdict: CONDITIONAL PASS (for SHA 7796471)

Verified code works correctly. New commits need re-verification — especially the crash-instead-of-fallback change.

@beastoin (Collaborator, Author)

Re-verification — PR #5537 (updated HEAD)

Verifier: noa (independent)
Branch: verify/noa-combined-5374-5395-5413-5537 (1bb7195)
SHA verified: 6d8b57e (current HEAD, matches remote)

New Commits Reviewed

  • 9e8c3a0b4 Fix dev.sh to copy dev Firebase plist instead of prod
  • e3cfbf160 Fix reset-and-run.sh to copy dev Firebase plist instead of prod
  • 00ad7c8c8 Log fatal warning when dev build falls back to prod Firebase key
  • 6d8b57e8e Crash dev builds when Firebase plist is missing instead of falling back to prod

Analysis

  1. AuthService.swift: fatalError only triggers for -dev bundle ID builds when plist is missing. Prod builds still get the hardcoded fallback. Safe — dev.sh/reset-and-run.sh now copy the dev plist, so the guard should never fire in normal operation.
  2. dev.sh / reset-and-run.sh: Both now copy GoogleService-Info-Dev.plist as GoogleService-Info.plist for dev builds, with fallback to prod plist if dev is missing. Correct.
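
The scoping rule in point 1 can be summarized as a small decision function. This is a Python sketch of the described behavior, not the Swift source: `resolve_api_key` and `MissingFirebaseConfig` are hypothetical names, an exception stands in for `fatalError`, and the prod bundle ID used in the usage note is assumed.

```python
from typing import Optional


class MissingFirebaseConfig(RuntimeError):
    """Stand-in for the fatalError raised in dev builds."""


def resolve_api_key(bundle_id: str, plist_key: Optional[str], prod_fallback: str) -> str:
    """Pick the Firebase API key, failing fast only for dev builds."""
    if plist_key:
        return plist_key  # normal path for both dev and prod builds
    if bundle_id.endswith("-dev"):
        # Dev builds must never silently use prod credentials.
        raise MissingFirebaseConfig("GoogleService-Info.plist has no API_KEY")
    return prod_fallback  # prod builds keep the safe fallback
```

With a dev bundle ID such as `com.omi.desktop-dev` and no key in the plist, the function raises; with an assumed prod ID such as `com.omi.desktop`, it returns the fallback.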

Results

| Check | Result |
| --- | --- |
| Code review | PASS: fatalError scoped to dev builds only, build scripts copy correct plist |
| Swift build | PASS (30.63s) with updated AuthService |
| Ancestry check | PASS: 6d8b57e is ancestor of combined branch |
| Remote sync | PASS: pushed to verify/noa-combined-5374-5395-5413-5537 |

Verdict: PASS (upgraded from CONDITIONAL PASS)

All 4 new commits reviewed and verified. The crash-on-missing-plist is correctly scoped to dev builds only. Combined with previous E2E results, this PR is clear to merge.
