Verify: Combined desktop PRs #5374 #5395 #5413 #5537 #5539
Conversation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…5396)
WebSocket client that connects to /v4/listen with Bearer auth and sends screen_frame JSON messages. Routes focus_result responses back to callers via async continuations with frame_id correlation.
#5396) Replace direct Gemini API calls with backend WebSocket screen_frame messages. Context building (goals, tasks, memories, AI profile) moves server-side. Client becomes thin: encode JPEG→base64, send screen_frame, receive focus_result.
…#5396) Start WS connection when monitoring starts, disconnect on stop. Pass service to FocusAssistant (shared for future assistant types).
…5396)
Vision handlers: analyzeFocus, extractTasks, extractMemories, generateAdvice (send screen_frame with analyze type, receive typed result via frame_id). Text handlers: generateLiveNote, requestProfile, rerankTasks, deduplicateTasks (send typed JSON message, receive result via single-slot continuation).
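As an illustration of the frame_id correlation these commits describe, here is a minimal Python analogue of the continuation routing (the actual client is Swift; all names and message fields beyond type/frame_id are assumptions, not the PR's code):

```python
import asyncio
import itertools
import json


class FocusResultRouter:
    """Route focus_result responses back to awaiting callers by frame_id."""

    def __init__(self):
        self._pending: dict[int, asyncio.Future] = {}
        self._ids = itertools.count(1)

    async def send_frame(self, ws_send, image_b64, timeout=30.0):
        """Send one screen_frame and await the matching focus_result."""
        frame_id = next(self._ids)
        fut = asyncio.get_running_loop().create_future()
        self._pending[frame_id] = fut
        try:
            await ws_send(json.dumps({
                "type": "screen_frame",
                "frame_id": frame_id,
                "image_b64": image_b64,
            }))
            return await asyncio.wait_for(fut, timeout)
        finally:
            # Drop the continuation whether it resolved, timed out, or failed.
            self._pending.pop(frame_id, None)

    def on_message(self, raw):
        """Called for every inbound WS text message; resolves the waiter."""
        msg = json.loads(raw)
        if msg.get("type") == "focus_result":
            fut = self._pending.get(msg.get("frame_id"))
            if fut is not None and not fut.done():
                fut.set_result(msg)
```

The map-of-futures shape mirrors Swift's `withCheckedContinuation` pattern: each sender parks until a response with its frame_id arrives, so multiple frames can be in flight on one socket.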
Replace GeminiClient tool-calling loop with backendService.extractTasks(). Remove extractTaskSingleStage, refreshContext, vector/keyword search, validateTaskTitle — all LLM logic now server-side. -550 lines.
Replace GeminiClient.sendRequest with backendService.extractMemories(). Remove prompt/schema building — all LLM logic now server-side.
Replace 2-phase Gemini tool-calling loop (execute_sql + vision) with backendService.generateAdvice(). Remove compressForGemini, getUserLanguage, buildActivitySummary, buildPhase1/2Tools — all LLM logic server-side. -560 lines.
Replace GeminiClient with backendService.deduplicateTasks(). Remove prompt/schema building, local dedup logic — server handles everything.
Replace GeminiClient with backendService.rerankTasks(). Remove prompt/schema building, context fetching — server handles reranking.
Replace 2-stage Gemini profile generation with backendService.requestProfile(). Remove fetchDataSources, buildPrompt, buildConsolidationPrompt — server fetches user data from Firestore and generates profile server-side.
…ts (#5396) Pass shared BackendProactiveService to all 4 assistants and 3 text-only services. Remove do/catch since inits no longer throw. Update AdviceTestRunnerWindow fallback creation.
Replace direct GeminiClient usage with BackendProactiveService. Uses configure(backendService:) singleton pattern matching other text-based services. Prompt logic moves server-side.
Add configure(backendService:) call for LiveNotesMonitor alongside other singleton text-based services.
Update GoogleService-Info-Dev.plist with dev Firebase values: API_KEY, PROJECT_ID, STORAGE_BUCKET, GCM_SENDER_ID, GOOGLE_APP_ID. Fixes #5536
Dev builds load GoogleService-Info-Dev.plist (via run.sh), prod builds load GoogleService-Info.plist. AuthService now reads API_KEY from whichever plist is in the bundle, with prod key as fallback. Fixes #5536
dev.sh builds Omi Dev (com.omi.desktop-dev) but was copying the prod GoogleService-Info.plist. Now uses the same dev plist logic as run.sh. Fixes #5536
reset-and-run.sh builds Omi Dev (com.omi.desktop-dev) but was copying the prod GoogleService-Info.plist. Now uses the same dev plist logic as run.sh. Fixes #5536
Codex review: dev builds should not silently use prod credentials. Now logs a FATAL warning if GoogleService-Info.plist is missing or has no API_KEY in a dev build (bundle ID ending in -dev). Fixes #5536
…ck to prod Codex review round 2: logging is not fail-fast. Dev builds now crash with fatalError if GoogleService-Info.plist has no API_KEY, preventing silent use of prod credentials. Prod builds still fall back safely. Fixes #5536
Greptile Summary
This combined verification branch merges four desktop PRs: Google OAuth desktop auth (#5374), Deepgram →
Key findings:
Confidence Score: 3/5
Important Files Changed
Sequence Diagram

```mermaid
sequenceDiagram
    participant Desktop as Desktop App
    participant WS as /v4/listen WebSocket
    participant Backend as FastAPI Backend
    participant LLM as Vision LLM (Gemini Flash)
    participant DB as Firestore
    Note over Desktop,WS: BackendTranscriptionService (audio stream)
    Desktop->>WS: PCM audio frames (binary)
    WS->>Backend: _stream_handler
    Backend-->>Desktop: [TranscriptSegment] JSON
    Note over Desktop,WS: BackendProactiveService (proactive AI)
    Desktop->>WS: {type: "screen_frame", image_b64, analyze: ["focus","tasks"]}
    WS->>Backend: _stream_handler (same connection)
    par focus analysis
        Backend->>LLM: analyze_focus(uid, image_b64)
        LLM-->>Backend: FocusResult
        Backend-->>Desktop: {type: "focus_result", status, app_or_site}
    and task extraction
        Backend->>DB: get_action_items(uid)
        Backend->>LLM: extract_tasks(uid, image_b64)
        LLM-->>Backend: TaskExtractionResult
        Backend-->>Desktop: {type: "tasks_extracted", tasks:[...]}
    end
    Desktop->>WS: {type: "task_dedup"}
    Backend->>DB: get_action_items(uid)
    Backend->>LLM: dedup_tasks(uid)
    LLM-->>Backend: DedupResult
    Backend-->>Desktop: {type: "dedup_complete", deleted_ids:[...]}
    Desktop->>Backend: DELETE /v1/staged-tasks/{id} (per deleted_id)
```
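The parallel fan-out shown in the diagram can be sketched as illustrative asyncio code (handler names and message shapes beyond the diagram's types are assumptions, not the backend's actual implementation):

```python
import asyncio
import json


async def handle_screen_frame(msg, send, analyze_focus, extract_tasks):
    """Fan one screen_frame out to the requested analyses in parallel,
    emitting a typed result event for each (sketch, not the real handler)."""
    image_b64 = msg["image_b64"]
    jobs = []
    if "focus" in msg.get("analyze", []):
        async def _focus():
            result = await analyze_focus(image_b64)
            await send(json.dumps({"type": "focus_result", **result}))
        jobs.append(_focus())
    if "tasks" in msg.get("analyze", []):
        async def _tasks():
            result = await extract_tasks(image_b64)
            await send(json.dumps({"type": "tasks_extracted", **result}))
        jobs.append(_tasks())
    # Run the analyses concurrently over the same WebSocket connection.
    await asyncio.gather(*jobs)
```

Because each analysis sends its own typed event, the desktop client can route them independently without waiting for the slowest one.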
Last reviewed commit: 1bb7195
```python
        key=lambda d: d.total_seconds,
        reverse=True,
    )[:5]

    return FocusStatsResponse(
        date=date,
        focused_minutes=focused_count,
        distracted_minutes=distracted_count,
```
focused_minutes/distracted_minutes hold session counts, not duration
focused_minutes and distracted_minutes are being populated with session counts (focused_count / distracted_count), not actual durations in minutes. The duration in seconds is only tracked for distracted sessions (in distraction_map), but focused duration is never computed. Clients relying on focused_minutes to display time-based stats (e.g., "you were focused for 45 minutes") will show incorrect data — the count of focus sessions instead.
To fix, either compute actual minutes from duration_seconds like distractions do, or rename the fields to focused_count and distracted_count to match their actual semantics (the model already has separate focused_count/distracted_count fields that are also set to the same values):
```diff
         key=lambda d: d.total_seconds,
         reverse=True,
     )[:5]
     return FocusStatsResponse(
         date=date,
-        focused_minutes=focused_count,
-        distracted_minutes=distracted_count,
+        focused_minutes=sum(
+            (s.get('duration_seconds') or 60) // 60
+            for s in sessions if s.get('status') == 'focused'
+        ),
+        distracted_minutes=sum(
+            (s.get('duration_seconds') or 60) // 60
+            for s in sessions if s.get('status') == 'distracted'
+        ),
         session_count=focused_count + distracted_count,
         focused_count=focused_count,
         distracted_count=distracted_count,
         top_distractions=top_distractions,
     )
```
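The suggested duration computation can be sanity-checked in isolation. A minimal sketch (the session dict shape is inferred from the snippet above; the 60-second fallback for a missing duration is the suggestion's own assumption):

```python
def minutes_by_status(sessions, status):
    """Whole minutes spent in sessions of the given status; a missing
    duration_seconds falls back to 60 (one minute), as in the suggestion."""
    return sum(
        (s.get('duration_seconds') or 60) // 60
        for s in sessions
        if s.get('status') == status
    )


sessions = [
    {'status': 'focused', 'duration_seconds': 1500},   # 25 minutes
    {'status': 'focused'},                             # no duration -> 1 minute
    {'status': 'distracted', 'duration_seconds': 90},  # floors to 1 minute
]
```

Note the integer division floors partial minutes, so a 90-second distraction counts as 1 minute; that may or may not match what the client wants to display.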
```swift
        } catch {
            logError("BackendTranscriptionService: Failed to get auth token", error: error)
            self.onError?(BackendTranscriptionError.notSignedIn)
        }
    }
}

private func connectWithToken(_ token: String, baseURL: String) {
    // Convert http(s) to ws(s)
    let wsBaseURL: String
    if baseURL.hasPrefix("https://") {
```
There was a problem hiding this comment.
isConnected flagged after a fixed 0.5 s delay, not on actual handshake completion
The 0.5 s asyncAfter is an approximation — the backend's /v4/listen WebSocket handshake can take longer on slow or congested connections, meaning onConnected fires and audio capture starts before the server has actually acknowledged the session. Conversely, if the TCP-level connect fails, the webSocketTask?.state == .running check can still pass at the 0.5 s mark: the task counts as "running" as far as the URL session knows, regardless of whether the HTTP 101 upgrade ever completed.
The PR audit already flagged this as W-3. A more reliable approach is to treat the first successfully received message (any transcript or service_status event) as the real connection confirmation, or — if the backend sends a connected preamble — wait for that before setting isConnected = true and invoking onConnected.
As-is, under slow networks users will experience a brief window where audio is buffered client-side but not yet accepted server-side, and the keepalive / watchdog timers start from the wrong origin.
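One way to sketch the "first message confirms the connection" approach suggested above (illustrative Python; the real service is Swift on URLSessionWebSocketTask, and all names here are hypothetical):

```python
import asyncio


class ConnectionGate:
    """Mark a WebSocket session connected only on the first server
    message, rather than after a fixed delay (sketch of the W-3 fix)."""

    def __init__(self, on_connected):
        self._connected = asyncio.Event()
        self._on_connected = on_connected

    @property
    def is_connected(self):
        return self._connected.is_set()

    def on_server_message(self, _msg):
        # Any transcript / service_status / connected preamble proves the
        # handshake actually completed server-side; fire the callback once.
        if not self._connected.is_set():
            self._connected.set()
            self._on_connected()

    async def wait_connected(self, timeout):
        """Callers that must not buffer audio early can await this instead
        of trusting a timer."""
        await asyncio.wait_for(self._connected.wait(), timeout)
```

Starting keepalive and watchdog timers from this event, instead of from the 0.5 s timer, also fixes the "wrong origin" issue the comment mentions.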
```python
                logger.warning(f"screen_frame missing image_b64 {uid} {session_id}")
            else:
                # Fan out to parallel handlers per analyze type
                if 'focus' in analyze_types:
                    async def _handle_focus(fid, img, app, wtitle):
                        try:
                            result = await analyze_focus(uid=uid, image_b64=img, app_name=app, window_title=wtitle)
                            _send_message_event(FocusResultEvent(
```
image_b64 passed to vision LLMs without any size guard
A single base64-encoded JPEG screenshot can easily be 200–500 KB (or larger for high-DPI displays). With no validation on image_b64 length here, a malformed or deliberately oversized payload will be forwarded verbatim to analyze_focus, extract_tasks, extract_memories, and generate_advice — all of which inline the data in a Gemini Flash image_url message. This can:
- Cause unexpected LLM API errors / quota burn on very large images.
- Allow any authenticated user to send arbitrarily large payloads that stall async workers.
The PR audit flagged this as W-12. Add a size cap before fanning out:
```python
MAX_IMAGE_B64_BYTES = 1_500_000  # ~1.1 MB decoded ≈ 4 MP JPEG at high quality

if len(image_b64) > MAX_IMAGE_B64_BYTES:
    logger.warning(f"screen_frame image_b64 too large ({len(image_b64)} bytes), dropping {uid} {session_id}")
else:
    # fan out handlers
```
```python
    # Dedup: check for existing task with same description (case-insensitive)
    normalized = description.lower()
    for doc in ref.stream():
        existing = doc.to_dict()
        if existing.get('deleted'):
            continue
        if existing.get('description', '').strip().lower() == normalized:
            existing['id'] = doc.id
            return _prepare_for_read(existing)
```
Full collection scan for description deduplication on every create_staged_task
ref.stream() fetches every non-deleted document in the user's staged_tasks collection to do a case-insensitive description match. For a user with a large backlog (hundreds of staged items accumulated over time) this results in an O(n) Firestore read on every creation call. The screen_frame handler fans out up to four parallel analyses per frame, each of which could call create_staged_task — making this a hot path.
Consider storing a lowercase-normalised description hash as a dedicated indexed field and querying with where('desc_hash', '==', normalized_hash) instead of streaming the entire collection. Alternatively, accept that exact-dedup at creation time is best-effort and rely on the existing promote-time dedup in the router, which only reads the first 20 staged tasks.
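A sketch of the indexed-hash variant (the desc_hash field name and the Firestore query usage are assumptions for illustration, not the PR's code):

```python
import hashlib


def desc_hash(description):
    """Stable hash of the case/whitespace-normalized description.
    Written on every create so equality queries can hit an index."""
    normalized = description.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicate(ref, description):
    """One indexed equality query instead of streaming the whole
    staged_tasks collection (ref is the user's collection reference;
    assumes a google-cloud-firestore CollectionReference)."""
    docs = list(
        ref.where("desc_hash", "==", desc_hash(description))
           .where("deleted", "==", False)
           .limit(1)
           .stream()
    )
    return docs[0] if docs else None
```

Existing documents would need a one-off backfill of desc_hash before the query can replace the scan; until then the stream-based fallback has to stay.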
```python
    }

    response = requests.post(sign_in_url, json=payload)

    if response.status_code == 200:
        result = response.json()
        firebase_uid = result.get('localId')
        if firebase_uid:
            logger.info(f"Firebase sign-in successful for {provider}, UID: {firebase_uid}")
    else:
        logger.warning(
            f"Firebase REST API sign-in failed (status={response.status_code}), falling back to Admin SDK"
        )
```
Admin SDK fallback will always fail for OAuth provider tokens
firebase_admin.auth.verify_id_token(id_token) expects a Firebase ID token — a JWT signed by Firebase Auth with iss pointing to securetoken.google.com/{project}. However, the id_token here is the raw Google (or Apple) OAuth ID token from the provider's code-exchange response (iss = accounts.google.com). The Admin SDK will reject it because the signing keys and iss claim don't match Firebase Auth's expectations.
This means the fallback path — intended as a resilience measure when FIREBASE_API_KEY is unset or the REST API returns non-200 — will itself raise an exception, turning a potentially recoverable failure into a hard auth error for the user.
Fix options:
- Remove the fallback entirely and require FIREBASE_API_KEY to always be configured (fail with a clear message if missing).
- Extract the email from the provider token via Google's public tokeninfo endpoint instead of calling the Firebase Admin verifier, then proceed with the existing user-lookup / create logic.
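A sketch of the tokeninfo option (the endpoint and claim names come from Google's documented tokeninfo response; the function names and surrounding flow are illustrative, not this codebase's API):

```python
GOOGLE_TOKENINFO_URL = "https://oauth2.googleapis.com/tokeninfo"


def validate_google_claims(claims, expected_client_id):
    """Accept only verified-email tokens minted for our OAuth client by Google."""
    if claims.get("iss") not in ("accounts.google.com", "https://accounts.google.com"):
        return None
    if claims.get("aud") != expected_client_id:
        return None
    # tokeninfo returns email_verified as the string "true"
    if claims.get("email_verified") not in (True, "true"):
        return None
    return claims.get("email")


def email_from_google_id_token(id_token, expected_client_id):
    """Resolve the email from a raw Google OAuth ID token without the
    Firebase Admin verifier (which expects a Firebase-issued token)."""
    import requests  # third-party; only needed for the network path

    resp = requests.get(GOOGLE_TOKENINFO_URL, params={"id_token": id_token}, timeout=10)
    if resp.status_code != 200:
        return None
    return validate_google_claims(resp.json(), expected_client_id)
```

For production volume, verifying the JWT signature locally against Google's published certs avoids a network round-trip per sign-in; tokeninfo is simplest for a low-traffic fallback path.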
Update: PR #5374 VM endpoint fix merged (40ae983)
New #5374 HEAD: 40ae983 (was 78d15d2)
What changed
8 new commits adding agent VM creation + status fields:
Test Results (updated)
All 21 new VM tests pass. Zero new failures. test.sh conflict resolved (keep all entries).
Ancestry
All 4 sub-PR HEADs verified as ancestors:
Local Backend E2E Verification Update
Date: 2026-03-10
Setup
Live Transcription Test
Pipeline: Desktop app →
App log evidence:
Declarative E2E Flows (4/4 PASS)
Overall Verdict
PASS — All 3 PRs (#5374 + #5395 + #5413) verified end-to-end with local backend running combined branch code. Desktop app correctly routes audio through Python backend
E2E Video Evidence (5 videos, signed URLs valid 7 days)
Flow 1: Auth & Session Bootstrap
Kill app → relaunch → session restored (no re-auth)
Flow 2: Live Audio Transcription
Desktop app → local backend ws://localhost:8000/v4/listen → Deepgram → 35 segments
Flow 3: Screen Analysis Settings
Settings > General, Rewind, Privacy — all rendered
Flow 4: Navigation
Dashboard → Chat → Memories → Tasks → Apps → Settings → back
Local Backend Transcription Pipeline
App wired to
Combined Verification Branch
Independent verification of 4 coupled desktop PRs merged in dependency order.
Verifier: noa (independent — not the author of any sub-PR)
Branch:
verify/noa-combined-5374-5395-5413-5537
Merge order: #5374 → #5395 → #5413 → #5537
Sub-PRs
| Branch | HEAD |
| --- | --- |
| collab/5302-integration | 40ae983af |
| fix/desktop-stt-backend-5393 | e2a885734 |
| collab/5396-integration | 15bf1ec6a |
| collab/5396-ren-focus | 6d8b57e8e |

All sub-PR HEADs verified. kai confirmed no changes since last check (2026-03-10).
Verification Results
Backend Tests
Zero new failures. All 16 failing tests and 40 errors were verified as pre-existing on main.
Local Backend E2E (2026-03-10)
Set up local Python backend from combined branch on Mac Mini, wired desktop app to http://localhost:8000.
Pipeline verified end-to-end: /v4/listen
Declarative E2E Flows (4/4 PASS)
Swift Build
Mac Mini E2E
Codex Audit
0 CRITICAL, 12 WARNING (all non-blocking)
Key warnings: W-1 (dual WebSocket to /v4/listen), W-2 (fatalError on missing OMI_API_URL), W-3 (isConnected timing), W-9 (no token refresh in long sessions), W-12 (screen_frame image_b64 not size-limited)
Cross-PR Interaction
Verdict
PASS — all 4 PRs. Zero test regressions, local backend E2E confirmed, clean Codex audit (0 CRITICAL). Ready for merge.