Background agent improvements: retry, env scrub, tasks, budget, reactive triggers #108
Merged
priyanshujain merged 42 commits into master on Mar 20, 2026
NextRetryAt was dead code — River's Worker interface expects NextRetry, so the custom 15-minute retry backoff was never active.
Auth/context-window errors delay retry by 1 year (effectively never). 429 rate limits retry after 30min, 5xx after 10min, default stays 15min.
Auth and context-window errors are now immediately cancelled instead of retrying. Uses job.MaxAttempts instead of hardcoded 2 for retry check.
Gives scheduled tasks one more retry attempt before permanent failure notification, working alongside the new error-classified retry backoff.
Adds scrubEnv function that strips API keys, tokens, secrets, and passwords from environment variables passed to external agent processes. Uses allowlist for safe vars and pattern matching for sensitive ones.
Previously only Claude had CLAUDECODE stripped. Now all agents (Gemini, Codex, Claude) get sensitive env vars removed via scrubEnv.
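As a rough illustration of the allowlist-plus-pattern approach, a scrubber might look like the sketch below. The function name `scrubEnvSketch`, the allowlist contents, and the exact pattern strings are assumptions; only the categories (API keys, tokens, secrets, passwords) come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// safeVars is an illustrative allowlist; the PR's real list is not shown.
var safeVars = map[string]bool{"PATH": true, "HOME": true, "LANG": true, "TERM": true}

// sensitivePatterns are substrings that mark a variable name as sensitive.
// The exact strings are assumptions made for this sketch.
var sensitivePatterns = []string{"API_KEY", "TOKEN", "SECRET", "PASSWORD"}

// isSensitive reports whether a variable name contains a sensitive pattern.
func isSensitive(name string) bool {
	upper := strings.ToUpper(name)
	for _, p := range sensitivePatterns {
		if strings.Contains(upper, p) {
			return true
		}
	}
	return false
}

// scrubEnvSketch filters "KEY=value" entries: allowlisted names always pass,
// names matching a sensitive pattern are dropped, everything else is kept.
func scrubEnvSketch(env []string) []string {
	out := make([]string, 0, len(env))
	for _, kv := range env {
		name, _, ok := strings.Cut(kv, "=")
		if !ok {
			continue // malformed entry, skip it
		}
		if safeVars[name] || !isSensitive(name) {
			out = append(out, kv)
		}
	}
	return out
}

func main() {
	env := []string{"PATH=/usr/bin", "OPENAI_API_KEY=sk-123", "GITHUB_TOKEN=abc", "EDITOR=vim"}
	fmt.Println(scrubEnvSketch(env)) // prints [PATH=/usr/bin EDITOR=vim]
}
```

The result would typically be assigned to `exec.Cmd.Env` before starting the external agent process.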
Dual SQLite/Postgres schema for the tasks table. Stores task ID, prompt, agent kind, status, timestamps, output, and error.
Insert, SetCompleted, SetFailed, Get, List, CountRunning, DeleteOlderThan. Get returns (nil, nil) for not-found. List orders by started_at DESC.
Tests: InsertAndGet, SetCompleted, SetFailed, GetNotFound, List ordering, CountRunning, DeleteOlderThan, Migrate idempotent.
Cleanup deletes completed/failed tasks older than 7 days. Called on tracker construction (best-effort, errors logged but not propagated).
Adds TasksConfig struct, TasksDataDSN() method (default ~/.obk/tasks/data.db), and wires into Default() and applyDefaults().
TaskTracker now optionally writes to a database. Start/Complete/Fail write-through to DB. Get falls through to DB for cross-session lookup. List reads from DB when available. DB errors are logged, never block.
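The write-through and fall-through behaviour can be sketched with a toy store. All types here (`Task`, `Store`, `Tracker`, `mapStore`) are simplified stand-ins, not the PR's definitions, and the sketch omits the locking a concurrent tracker would need.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// Task is a simplified stand-in for the PR's task record.
type Task struct {
	ID     string
	Status string
}

// Store is a hypothetical subset of the tasks DB API.
type Store interface {
	Insert(t Task) error
	Get(id string) (*Task, error) // (nil, nil) when not found
}

// Tracker keeps tasks in memory and mirrors them to an optional Store.
type Tracker struct {
	mem   map[string]*Task
	store Store // nil means in-memory only
}

func NewTracker(s Store) *Tracker { return &Tracker{mem: map[string]*Task{}, store: s} }

// Start records the task in memory, then writes through to the store.
// Store errors are logged and never block the caller.
func (tr *Tracker) Start(id string) {
	t := &Task{ID: id, Status: "running"}
	tr.mem[id] = t
	if tr.store != nil {
		if err := tr.store.Insert(*t); err != nil {
			log.Printf("tasks: insert %s: %v", id, err)
		}
	}
}

// Get checks memory first and falls through to the store, which is what
// makes tasks from earlier sessions visible.
func (tr *Tracker) Get(id string) *Task {
	if t, ok := tr.mem[id]; ok {
		return t
	}
	if tr.store == nil {
		return nil
	}
	t, err := tr.store.Get(id)
	if err != nil {
		log.Printf("tasks: get %s: %v", id, err)
		return nil
	}
	return t
}

// mapStore is a toy Store used only for this demo.
type mapStore struct {
	rows map[string]Task
	fail bool
}

func (m *mapStore) Insert(t Task) error {
	if m.fail {
		return errors.New("db unavailable")
	}
	m.rows[t.ID] = t
	return nil
}

func (m *mapStore) Get(id string) (*Task, error) {
	t, ok := m.rows[id]
	if !ok {
		return nil, nil
	}
	return &t, nil
}

func main() {
	db := &mapStore{rows: map[string]Task{}}
	NewTracker(db).Start("t1") // first "session"
	second := NewTracker(db)   // new session with empty memory
	fmt.Println(second.Get("t1") != nil, second.Get("missing") == nil) // prints true true
}
```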
Tests cross-session Get/List via DB fallthrough and write-through verification (Start writes to DB, Complete updates DB).
Adds openTaskTracker helper that opens tasks DB and creates a persistent tracker. Falls back to in-memory on DB error. Defers tracker.Close().
Tracker lives for lifetime of SessionManager, persisting tasks across session restarts. Falls back to in-memory on DB error.
DelegateTaskConfig now accepts AuditLogger. Async task completion/failure is logged with context="delegated". Scheduled task worker records results in tasks DB when TasksDB is set.
Moves model pricing data from internal/cli/usage.go to provider package. Adds EstimateCost function with prefix-matching for versioned model names.
Replaces local modelPricing map and estimateCost with thin wrapper around provider.EstimateCost. Removes duplicate pricing data.
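One plausible way to do prefix matching for versioned model names is sketched below. The model names and prices are placeholders, not real vendor pricing, and the choice of longest-prefix matching and of returning 0 for unknown models are assumptions; the PR's `provider.EstimateCost` may differ in both signature and behaviour.

```go
package main

import (
	"fmt"
	"strings"
)

// price holds USD per one million tokens. The numbers below are
// placeholders for illustration only.
type price struct{ inPerM, outPerM float64 }

var pricing = map[string]price{
	"example-fast":     {inPerM: 1, outPerM: 2},
	"example-fast-pro": {inPerM: 4, outPerM: 8},
}

// estimateCost looks up an exact match first, then falls back to the longest
// key that is a prefix of the model name, so a versioned name such as
// "example-fast-2026-01" resolves to its base entry. Unknown models cost 0.
func estimateCost(model string, inTok, outTok int) float64 {
	p, ok := pricing[model]
	if !ok {
		best := ""
		for name := range pricing {
			if strings.HasPrefix(model, name) && len(name) > len(best) {
				best = name
			}
		}
		if best == "" {
			return 0
		}
		p = pricing[best]
	}
	return float64(inTok)/1e6*p.inPerM + float64(outTok)/1e6*p.outPerM
}

func main() {
	fmt.Println(estimateCost("example-fast-2026-01", 1_000_000, 500_000)) // prints 2
	fmt.Println(estimateCost("example-fast-pro-v2", 1_000_000, 0))        // prints 4
	fmt.Println(estimateCost("unknown-model", 1_000_000, 0))              // prints 0
}
```

Preferring the longest prefix keeps "example-fast-pro-v2" from being priced as "example-fast".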
BudgetTracker wraps UsageRecorder, accumulates cost per LLM call, and returns error when max budget is exceeded. 0 means unlimited.
After each LLM call, checks the budget. On exceeded, returns partial text and error for graceful degradation.
Add model_tier and max_budget_usd to INSERT/SELECT/scan operations. Remove unused alterBudget const from schema.go.
Add ModelTier and MaxBudgetUSD to ScheduledTaskArgs. Use Router.Resolve to pick model by tier (default "fast"). Wire BudgetTracker when budget > 0.
Add model_tier and max_budget_usd params to CreateScheduleTool schema. Default tier to "fast". Show tier/budget in list output. Pass both fields through ScheduledTaskArgs in daemon/scheduler.go.
Add trigger_source, trigger_query, last_trigger_id to INSERT/SELECT/scan. Add ListEnabledReactive and UpdateLastTriggerID functions.
Pass SyncNotifier through daemon startup to applenotes, imessage, whatsapp, and gmail sync. Each notifies on successful sync. Scheduler receives the notifier for reactive trigger evaluation.
Listen on SyncNotifier channel for sync completion signals. On each signal, load reactive schedules for that source, evaluate trigger queries against source DB, enqueue matching tasks with augmented prompt, and update watermark to prevent re-firing.
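The signal, evaluate, enqueue, and watermark steps can be sketched without a database. Everything here is illustrative: `row`, `schedule`, and `checkTriggers` are hypothetical, a Go predicate stands in for the SQL trigger query, and advancing the watermark to the newest matching row ID is an assumption about how `last_trigger_id` is used.

```go
package main

import (
	"fmt"
	"strings"
)

// row is a simplified synced record; IDs are assumed to increase over time.
type row struct {
	ID      int64
	Subject string
}

// schedule is a stand-in for a reactive schedule. match replaces the SQL
// trigger query so the sketch needs no database.
type schedule struct {
	Source        string
	Prompt        string
	LastTriggerID int64
	match         func(row) bool
}

// checkTriggers evaluates one schedule against rows newer than its watermark,
// returns the augmented prompts to enqueue, and advances the watermark so the
// same rows never fire twice.
func checkTriggers(s *schedule, rows []row) []string {
	var jobs []string
	newest := s.LastTriggerID
	for _, r := range rows {
		if r.ID <= s.LastTriggerID || !s.match(r) {
			continue
		}
		jobs = append(jobs, fmt.Sprintf("%s\n\nTriggered by: subject: %s", s.Prompt, r.Subject))
		if r.ID > newest {
			newest = r.ID
		}
	}
	s.LastTriggerID = newest
	return jobs
}

func main() {
	s := &schedule{
		Source: "gmail",
		Prompt: "Summarize the new invoice",
		match:  func(r row) bool { return strings.Contains(r.Subject, "invoice") },
	}
	rows := []row{{1, "hello"}, {2, "invoice March"}}

	// A buffered channel stands in for the SyncNotifier: one signal per sync.
	synced := make(chan string, 2)
	synced <- "gmail"
	synced <- "gmail"
	close(synced)

	for src := range synced {
		if src != s.Source {
			continue
		}
		fmt.Println(len(checkTriggers(s, rows)), s.LastTriggerID) // "1 2", then "0 2"
	}
}
```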
Add reactive to type enum with trigger_source and trigger_query params. Validate source and query on creation. Show reactive info in list output.
Add TestSpec_ReactiveEmailTrigger (full pipeline: seed emails, create reactive schedule, trigger check, River job, LLM, pusher, watermark). Add TestSpec_ReactiveNoMatchDoesNotFire (negative test). Add CheckReactiveTriggersForTest helper to daemon/scheduler.go. Add "tasks" to createSourceDirs in local_fixture.go.
Cover ListEnabledReactive, UpdateLastTriggerID, MarkCompleted, tasks.Cleanup, and BudgetTracker.Total to close coverage gaps.
…ent SQL injection The denylist approach was bypassable (missing UNION, SELECT, ATTACH, etc.). Now uses an allowlist of permitted column names per source and SQL keywords. String literals are stripped before checking to avoid false positives.
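An allowlist validator along these lines could look like the sketch below. The column and keyword sets, the permitted punctuation, and the name `validateTriggerQuery` are assumptions for illustration; the PR's actual lists and function names are not shown here.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Allowed columns per source and allowed keywords. These sets are
// illustrative, not the PR's real lists.
var allowedColumns = map[string]map[string]bool{
	"gmail": {"from_addr": true, "subject": true},
}

var allowedKeywords = map[string]bool{
	"AND": true, "OR": true, "NOT": true, "LIKE": true, "IN": true, "IS": true, "NULL": true,
}

var (
	// A single-quoted SQL string, with '' as the escaped quote.
	stringLiteral = regexp.MustCompile(`'(?:[^']|'')*'`)
	identifier    = regexp.MustCompile(`[A-Za-z_][A-Za-z0-9_]*`)
	// Characters permitted outside string literals (no ';', '-', '/', '*').
	allowedChars = regexp.MustCompile(`^[A-Za-z0-9_\s=<>!(),.%]*$`)
)

// validateTriggerQuery strips string literals first, so text such as
// 'DROP it' inside a literal cannot cause a false positive, then requires
// every remaining word to be an allowed column or keyword.
func validateTriggerQuery(source, query string) error {
	cols, ok := allowedColumns[source]
	if !ok {
		return fmt.Errorf("unknown source %q", source)
	}
	stripped := stringLiteral.ReplaceAllString(query, " ")
	if strings.Contains(stripped, "'") {
		return fmt.Errorf("unterminated string literal")
	}
	if !allowedChars.MatchString(stripped) {
		return fmt.Errorf("query contains disallowed characters")
	}
	for _, word := range identifier.FindAllString(stripped, -1) {
		if cols[strings.ToLower(word)] || allowedKeywords[strings.ToUpper(word)] {
			continue
		}
		return fmt.Errorf("token %q is not allowed", word)
	}
	return nil
}

func main() {
	fmt.Println(validateTriggerQuery("gmail", "from_addr LIKE '%@example.com' AND subject = 'DROP it'"))
	fmt.Println(validateTriggerQuery("gmail", "subject = 'x' UNION SELECT password FROM users"))
}
```

Because anything not explicitly permitted is rejected, statements the old denylist missed (UNION, SELECT, ATTACH, and so on) fail without needing to be enumerated.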
WhatsApp uses streaming sync (Follow: true) that blocks until context cancellation. The notifier was only firing on exit, so reactive triggers for WhatsApp data never evaluated during normal operation.
The identical openTaskTracker function existed in both internal/cli/chat.go and channel/telegram/session.go. Moved DB-opening logic to tools package.
The %v formatting of map[string]string produced unreadable Go syntax. Now formats as sorted key-value pairs (e.g. "from_addr: x | subject: y").
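The sorted key-value formatting can be sketched as below. `formatRow` is a hypothetical name; the output shape follows the example given in the commit message.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// formatRow renders a row as "key: value" pairs joined by " | ". Keys are
// sorted because ranging over a Go map has no guaranteed order.
func formatRow(row map[string]string) string {
	keys := make([]string, 0, len(row))
	for k := range row {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, k+": "+row[k])
	}
	return strings.Join(parts, " | ")
}

func main() {
	fmt.Println(formatRow(map[string]string{"subject": "y", "from_addr": "x"}))
	// prints: from_addr: x | subject: y
}
```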
Auth and context-window errors are already cancelled via river.JobCancel in Work(), so NextRetry is never called for those. Removed unreachable switch cases and updated tests accordingly.
Connection strings (DATABASE_URL, REDIS_URL, MONGODB_URI, etc.) and private keys were not caught by the existing patterns.
Verifies the agent loop stops on budget exceeded, returns partial text with the error, and passes through normally when under budget.
Summary
Six improvements to background agent capabilities, implemented across 34 atomic commits:
- NextRetryAt → NextRetry bug in River worker. Classify errors (auth, rate-limit, transient) for adaptive retry timing. Cancel non-retryable jobs immediately with river.JobCancel.
- service/tasks/ package with SQLite/Postgres schema. TaskTracker gets write-through DB backing for cross-session task visibility. 7-day auto-cleanup.
- BudgetTracker wraps UsageRecorder to enforce per-task cost limits. Schedules carry model_tier (default: fast) and max_budget_usd. Shared pricing extracted to provider/pricing.go.
- reactive schedule type. SQL WHERE clauses evaluated against synced data (gmail, whatsapp, imessage, applenotes) with watermark tracking. SyncNotifier signals trigger evaluation after each sync cycle.

Review fixes (post-review)
Seven fixes from PR review:
- Replaced the trigger-query denylist (DROP, DELETE, etc.) with an allowlist of permitted column names per source + allowed SQL keywords. String literals stripped before validation. Blocks UNION SELECT, subqueries, ATTACH, PRAGMA, load_extension, unknown columns.
- notifier.Notify("whatsapp") was only called on goroutine exit (streaming sync). Added periodic 30s ticker to match cadence of other sync sources.
- openTaskTracker: identical function in internal/cli/chat.go and channel/telegram/session.go extracted to tools.OpenPersistentTaskTracker.
- %v on map[string]string produced Go map syntax. Now formats as sorted key: value | key: value pairs.
- NextRetry: auth/context-window error cases were unreachable (already cancelled in Work() via river.JobCancel). Removed.
- Added DATABASE_URL, REDIS_URL, MONGODB_URI, _DSN, _PRIVATE_KEY to sensitive patterns.
- Added TestLoop_BudgetExceeded and TestLoop_BudgetNotExceeded to verify agent loop stops on budget exceeded with partial text.

Test plan
- go test ./service/scheduler/ covers store CRUD, reactive queries, trigger validation (allowlist)
- go test ./service/tasks/ covers insert, complete, fail, list, cleanup
- go test ./agent/ covers budget tracker (incl. agent loop integration), persistent task tracker, env scrubbing
- go test ./daemon/jobs/ covers sync notifier, scheduled task retry classification
- go test ./provider/ covers shared pricing estimation
- OBK_TEST_PROVIDER=gemini go test ./spectest/ -run Reactive -v