v0.9.14: dogfooding robustness pass — budget reset, loop/security FP fixes, dead-proxy safety#9
Closed
codehippie1 wants to merge 9 commits into
Closed
v0.9.14: dogfooding robustness pass — budget reset, loop/security FP fixes, dead-proxy safety#9codehippie1 wants to merge 9 commits into
codehippie1 wants to merge 9 commits into
Conversation
Correct OpenAI gpt-5.x prices (were 2-7x understated), add the Gemini 3.x generation and still-billable Anthropic legacy models, fix Gemini 2.5 cache rates, and update context windows for the 1M generation. Adds a table-ordering invariant test plus per-provider coverage.
…etector death-spiral fix BudgetTracker is now day- and month-stamped with lazy rollover (restart- and clock-change-proof) so a long-running daemon no longer accumulates across days and 429s everything. Monthly cap is now enforced. budget.enforce_on_plan (default off) keeps the dollar cap from blocking subscription traffic, which is notional. Loop detector splits into a read-only pre-forward peek plus a tee-side record-on-2xx, so blocked 429s and failed-request retries can't refill the window; hash keyed by method+provider+path, GET/body-less skipped, Retry-After added. Also filters empty deny rules at ruleset construction.
Editor/content tool args and tool_results now get data checks only (a Write or note that merely mentions ~/.ssh no longer 403s); shell tools keep full command checks, path-shaped content args still path-checked. UNC match requires a real share root (escaped Windows paths pass) with WSL/device whitelist; rm literals dropped in favor of the shape detector (scoped deletes pass), tokenizer splits JSON-glued tokens; AWS example keys exempted; match location surfaced. Responses-API input[] round scoping. /Volumes/ dropped. Adds sk-proj-/ASIA/gh[pousr]_/glpat- patterns. Fail-open scan now logs.
…essages, $0-recording guards Shared HTTP client with connect/keepalive/read timeouts (no more hangs on VPN flips or stalled streams). Client disconnect now drops the upstream stream instead of draining the full billed generation, recording a 499 partial. Every block self-identifies as Burnwall with what/where, escalating escape hatches, Retry-After, and provider-correct error JSON; new 'burnwall report-bug' writes a sanitized local report. OpenAI Responses API now parses (Codex no longer records $0), all-zero usage treated as a parse failure, and unknown models warn once instead of silently costing $0. Cache-projection write moved off the pre-forward hot path into the tee.
…e + corpus-rescan fixes busy_timeout set before the WAL switch and duplicate-column tolerated (first-open races); PRAGMA user_version stamped and a newer-than-supported DB refused. daily_totals off-by-one corrected to match sibling window queries (history --days 7 now shows 7), history clamps days>=1. Logscrape prunes by file mtime and streams lines instead of slurping whole multi-MB session files, so status/waste no longer re-parse the entire corpus each call. Adds total_cost_for_month for the budget cap.
…rdening MCP watcher uses prose-safe scoping (only shell-tool args get command checks) and strips accept-encoding so gzip can't blind the firewall. Rug-pull re-pend now keys on the schema fingerprint (description-only changes warn, don't revoke); approval/deny 403s are proper JSON-RPC errors naming the remediation command; tools/list reads are timeout-bounded. Audit: a lost/changed key is detected and refuses to seal (with a rekey command) instead of forever reporting TAMPERED; seal is transactional; SARIF results carry locations; query strings stripped before persist; pack ids validated; rules fetch compares against the TOFU pin.
…ty, honest status Env files are liveness-gated so a crashed/rebooted proxy degrades a shell to DIRECT instead of breaking every tool; the daemon child pauses routing on graceful exit. Every surface (ribbon, status, VS Code bar) shows a loud 'proxy down' when routed at a dead port. PowerShell now gets a persistent CurrentUserAllHosts profile hook (no longer a silent dead end), bash chains into login profiles. PID files carry an image-name identity check so stop can't kill an innocent process and autostart can't bail against a reused PID. config doctor prints a per-shell routing matrix; daemon writes a size-capped log file; upgrade/install use one canonical dir and a PATH-resolved restart. Statusline msg is turn-aware, watch annotates idle data, plan window suppressed past reset, warning-grade plan status no longer reads as throttled, combined-today no longer double-counts proxied traffic.
Raw-TCP fake upstream exercising the paths idealized tests missed: SSE delivered one byte per flush round-trips intact and records usage; a stalled upstream is bounded by read_timeout instead of hanging; a client disconnect mid-stream leaves the proxy responsive. Registers the torture_test and audit_test targets.
…fixes, dead-proxy safety
Contributor
Author
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
A real-world robustness pass driven by dogfooding: a multi-agent review of every feature, focused on the failure modes that make a tool freeze, falsely block, or mislead — the kind that trigger an uninstall — followed by fixes and tests across every wave.
Fixed
Retry-After, and the window is keyed per method/provider/path so unrelated requests don't collide.~/.sshin a README) no longer 403s — only shell-tool arguments get command checks. Windows paths in tool arguments are no longer mistaken for network mounts, scoped deletes likerm -rf /tmp/xpass, and well-known documentation/example keys are exempt. Blocks now explain what was caught and how to proceed, andburnwall report-bugwrites a sanitized local report for false positives.Changed
Validation
Base branch is the previous release line.