Skip to content

v0.9.14: dogfooding robustness pass — budget reset, loop/security FP fixes, dead-proxy safety#9

Closed
codehippie1 wants to merge 9 commits into
release/v0.9.13from
release/v0.9.14
Closed

v0.9.14: dogfooding robustness pass — budget reset, loop/security FP fixes, dead-proxy safety#9
codehippie1 wants to merge 9 commits into
release/v0.9.13from
release/v0.9.14

Conversation

@codehippie1

Copy link
Copy Markdown
Contributor

A real-world robustness pass driven by dogfooding: a multi-agent review of every feature, focused on the failure modes that make a tool freeze, falsely block, or mislead — the kind that trigger an uninstall — followed by fixes and tests across every wave.

Fixed

  • Daily budget now resets at midnight. A long-running proxy used to accumulate spend across days and eventually 429 every request even when the day's real spend was small. The counter is now day- and month-aware (restart- and clock-change-proof), and the monthly cap is actually enforced.
  • Loop detection no longer wedges on retries. A blocked request — and a client's automatic retry of it, or a retry after a provider outage — no longer feeds the loop-detection window, so a transient blip can't trap a session in a permanent 429 loop. Blocks carry Retry-After, and the window is keyed per method/provider/path so unrelated requests don't collide.
  • Fewer false security blocks. Writing or discussing a file that merely mentions a sensitive path (e.g. ~/.ssh in a README) no longer 403s — only shell-tool arguments get command checks. Windows paths in tool arguments are no longer mistaken for network mounts, scoped deletes like rm -rf /tmp/x pass, and well-known documentation/example keys are exempt. Blocks now explain what was caught and how to proceed, and burnwall report-bug writes a sanitized local report for false positives.
  • The proxy no longer hangs on a stalled or unreachable upstream, and cancelling a request (Esc) stops the upstream instead of billing the full response.
  • Accurate cost capture for more tools. OpenAI's Responses API (used by Codex) is parsed instead of silently recording $0, unknown models warn instead of recording $0, and the cross-tool "today" total no longer double-counts traffic that went through the proxy.

Changed

  • A crashed or stopped proxy no longer breaks your terminals. Shell routing is liveness-gated: if the proxy isn't running, a new shell talks directly to the provider (unprotected but working) instead of failing to connect. Every status surface shows a clear "proxy down" warning when routing points at a dead port. PowerShell now gets persistent routing like the other shells.
  • Plan-aware budgeting: on a flat-rate subscription, the dollar cap is advisory (tracked and warned, not blocked) unless explicitly enabled.
  • Hardening across MCP (prose-safe scanning, clearer approval errors), the audit chain (lost-key detection), storage (schema versioning, race-safe migrations), and the daemon (a real log file, PID identity checks). Install/upgrade now use one canonical directory.

Validation

  • 628 Rust tests + 17 extension tests pass; clippy clean across all targets.
  • New torture-proxy suite covers streaming, stalled-upstream timeout, and client-disconnect paths the earlier tests missed; regression tests sit alongside each fix.
  • Released and installed via the official installer end-to-end.

Base branch is the previous release line.

Correct OpenAI gpt-5.x prices (were 2-7x understated), add the Gemini 3.x generation and still-billable Anthropic legacy models, fix Gemini 2.5 cache rates, and update context windows for the 1M generation. Adds a table-ordering invariant test plus per-provider coverage.
…etector death-spiral fix

BudgetTracker is now day- and month-stamped with lazy rollover (restart- and clock-change-proof) so a long-running daemon no longer accumulates across days and 429s everything. Monthly cap is now enforced. budget.enforce_on_plan (default off) keeps the dollar cap from blocking subscription traffic, which is notional. Loop detector splits into a read-only pre-forward peek plus a tee-side record-on-2xx, so blocked 429s and failed-request retries can't refill the window; hash keyed by method+provider+path, GET/body-less skipped, Retry-After added. Also filters empty deny rules at ruleset construction.
Editor/content tool args and tool_results now get data checks only (a Write or note that merely mentions ~/.ssh no longer 403s); shell tools keep full command checks, path-shaped content args still path-checked. UNC match requires a real share root (escaped Windows paths pass) with WSL/device whitelist; rm literals dropped in favor of the shape detector (scoped deletes pass), tokenizer splits JSON-glued tokens; AWS example keys exempted; match location surfaced. Responses-API input[] round scoping. /Volumes/ dropped. Adds sk-proj-/ASIA/gh[pousr]_/glpat- patterns. Fail-open scan now logs.
…essages, $0-recording guards

Shared HTTP client with connect/keepalive/read timeouts (no more hangs on VPN flips or stalled streams). Client disconnect now drops the upstream stream instead of draining the full billed generation, recording a 499 partial. Every block self-identifies as Burnwall with what/where, escalating escape hatches, Retry-After, and provider-correct error JSON; new 'burnwall report-bug' writes a sanitized local report. OpenAI Responses API now parses (Codex no longer records $0), all-zero usage treated as a parse failure, and unknown models warn once instead of silently costing $0. Cache-projection write moved off the pre-forward hot path into the tee.
…e + corpus-rescan fixes

busy_timeout set before the WAL switch and duplicate-column tolerated (first-open races); PRAGMA user_version stamped and a newer-than-supported DB refused. daily_totals off-by-one corrected to match sibling window queries (history --days 7 now shows 7), history clamps days>=1. Logscrape prunes by file mtime and streams lines instead of slurping whole multi-MB session files, so status/waste no longer re-parse the entire corpus each call. Adds total_cost_for_month for the budget cap.
…rdening

MCP watcher uses prose-safe scoping (only shell-tool args get command checks) and strips accept-encoding so gzip can't blind the firewall. Rug-pull re-pend now keys on the schema fingerprint (description-only changes warn, don't revoke); approval/deny 403s are proper JSON-RPC errors naming the remediation command; tools/list reads are timeout-bounded. Audit: a lost/changed key is detected and refuses to seal (with a rekey command) instead of forever reporting TAMPERED; seal is transactional; SARIF results carry locations; query strings stripped before persist; pack ids validated; rules fetch compares against the TOFU pin.
…ty, honest status

Env files are liveness-gated so a crashed/rebooted proxy degrades a shell to DIRECT instead of breaking every tool; the daemon child pauses routing on graceful exit. Every surface (ribbon, status, VS Code bar) shows a loud 'proxy down' when routed at a dead port. PowerShell now gets a persistent CurrentUserAllHosts profile hook (no longer a silent dead end), bash chains into login profiles. PID files carry an image-name identity check so stop can't kill an innocent process and autostart can't bail against a reused PID. config doctor prints a per-shell routing matrix; daemon writes a size-capped log file; upgrade/install use one canonical dir and a PATH-resolved restart. Statusline msg is turn-aware, watch annotates idle data, plan window suppressed past reset, warning-grade plan status no longer reads as throttled, combined-today no longer double-counts proxied traffic.
Raw-TCP fake upstream exercising the paths idealized tests missed: SSE delivered one byte per flush round-trips intact and records usage; a stalled upstream is bounded by read_timeout instead of hanging; a client disconnect mid-stream leaves the proxy responsive. Registers the torture_test and audit_test targets.
@codehippie1

Copy link
Copy Markdown
Contributor Author

Superseded: this release branch was reconciled onto main via #12 (merge commit 6b9e53e). Every commit from this branch is contained in main (verified by patch-id), so there is nothing left to merge here. Closing as landed-upstream.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant