Fix guardrails TOCTOU races and wire hourly reset daemon #88

Open
odgrim wants to merge 2 commits into main from abathur/task-05307cec

@odgrim (Owner) commented Feb 21, 2026

Summary

  • Atomic token check-and-record: Replace separate check_tokens/record_tokens with check_and_record_tokens using a CAS (compare-and-swap) loop on AtomicU64, eliminating the TOCTOU race where concurrent token recording could exceed hourly limits
  • Atomic task/agent check-and-register: Add check_and_register_task and check_and_register_agent that hold the write lock for the full check+insert, replacing the split check_task_start/register_task_start pattern
  • Hourly token reset daemon: Add HourlyResetDaemon following the MemoryDecayDaemon pattern, wired into SwarmOrchestrator.run() lifecycle with proper startup/shutdown
  • dag_executor migration: Update the primary TOCTOU sites in dag_executor.rs to use the new atomic methods
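
For reviewers unfamiliar with the pattern, here is a minimal sketch of a CAS-based check-and-record on an `AtomicU64`. It is not the code in this PR: the `TokenBudget` type, the `TokenCheck` enum, the `warn_at` threshold, and the choice to allow a request that lands exactly on the limit are all assumptions made for illustration.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical result type; the PR only mentions Allowed/Warning/Blocked outcomes.
#[derive(Debug, PartialEq, Eq)]
pub enum TokenCheck {
    Allowed,
    Warning,
    Blocked,
}

/// Hypothetical holder for the hourly counter and its limits.
pub struct TokenBudget {
    used: AtomicU64,
    hourly_limit: u64,
    warn_at: u64,
}

impl TokenBudget {
    pub fn new(hourly_limit: u64, warn_at: u64) -> Self {
        Self { used: AtomicU64::new(0), hourly_limit, warn_at }
    }

    /// Checks the limit and records the tokens as one atomic step.
    /// Assumption: a request that reaches the limit exactly is still accepted.
    pub fn check_and_record_tokens(&self, tokens: u64) -> TokenCheck {
        let mut current = self.used.load(Ordering::Acquire);
        loop {
            // Compute the would-be total; overflow or exceeding the limit blocks.
            let next = match current.checked_add(tokens) {
                Some(n) if n <= self.hourly_limit => n,
                _ => return TokenCheck::Blocked,
            };
            // Publish `next` only if nobody changed the counter since we read it.
            match self.used.compare_exchange_weak(
                current,
                next,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => {
                    return if next >= self.warn_at {
                        TokenCheck::Warning
                    } else {
                        TokenCheck::Allowed
                    };
                }
                // Another thread won the race (or the weak CAS failed spuriously):
                // retry the check against the value we just observed.
                Err(observed) => current = observed,
            }
        }
    }

    pub fn used(&self) -> u64 {
        self.used.load(Ordering::Acquire)
    }
}
```

Because the limit check is re-evaluated on every retry, two concurrent callers can never both pass the check against the same stale total, which is the race the split `check_tokens`/`record_tokens` pair allowed.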

Closes #41
Closes #42

Test plan

  • All 14 guardrails unit tests pass (including 10 new tests for atomic operations)
  • Tests cover: atomic token recording (allowed/warning/blocked/exact-limit), CAS semantics, atomic task registration, atomic agent registration, daemon reset behavior, daemon config
  • Full test suite (992 tests including 12 integration tests) passes
  • cargo check clean
  • All clippy warnings are pre-existing; none are introduced by this PR

🤖 Generated with Claude Code

odgrim and others added 2 commits February 21, 2026 00:52
…aemon

- Add atomic check_and_record_tokens using CAS loop on AtomicU64 to
  eliminate the race between check_tokens and record_tokens
- Add atomic check_and_register_task that holds the write lock for
  the entire check+insert, preventing concurrent tasks from exceeding
  the configured limit
- Add atomic check_and_register_agent with the same TOCTOU fix
- Migrate dag_executor to use check_and_register_task instead of the
  separate check/register calls
- Add HourlyResetDaemon that periodically resets the hourly token
  counter, following the MemoryDecayDaemon pattern (AtomicBool stop
  flag, tokio::time::interval tick)
- Retain old check/register methods (marked deprecated in docs) for
  backward compatibility
- Add comprehensive tests for all new atomic operations and the reset
  daemon
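
A minimal sketch of the check-and-register idea described above, assuming a `std::sync::RwLock` around a set of task IDs. The `TaskGuard` type, its fields, and the `bool` return value are hypothetical, and the real guardrails code may well use an async lock instead.

```rust
use std::collections::HashSet;
use std::sync::RwLock;

/// Hypothetical guard tracking the set of currently running task IDs.
pub struct TaskGuard {
    active: RwLock<HashSet<String>>,
    max_concurrent: usize,
}

impl TaskGuard {
    pub fn new(max_concurrent: usize) -> Self {
        Self { active: RwLock::new(HashSet::new()), max_concurrent }
    }

    /// Returns true if the task was registered, false if the limit is reached.
    /// The write lock is held across both the check and the insert, so no
    /// other caller can slip in between the two steps.
    pub fn check_and_register_task(&self, task_id: &str) -> bool {
        let mut active = self.active.write().unwrap();
        if active.len() >= self.max_concurrent {
            return false;
        }
        active.insert(task_id.to_string());
        true
    }

    pub fn active_count(&self) -> usize {
        self.active.read().unwrap().len()
    }
}
```

With the older split pattern, a read-locked check followed by a separately locked insert lets several callers pass the check before any of them inserts. Taking the write lock once removes that window at the cost of serializing registrations.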

Fixes #41, fixes #42

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…grate dag_executor

- Replace deprecated g.record_tokens() in dag_executor with atomic
  g.check_and_record_tokens(), handling Blocked/Warning results with
  tracing — this was the primary TOCTOU site the fix was meant to address
- Wire HourlyResetDaemon into SwarmOrchestrator: add field, spawn at
  startup (after decay daemon), stop during graceful shutdown
- Add #[deprecated] attributes to old check_task_start, register_task_start,
  check_agent_spawn, register_agent_spawn, check_tokens, record_tokens
- Suppress deprecation warnings in tests that intentionally cover old API
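
To illustrate the spawn-at-startup, stop-at-shutdown lifecycle, here is a self-contained sketch of a periodic reset daemon with an `AtomicBool` stop flag. It deliberately swaps `tokio::time::interval` for a plain `std::thread` with `park_timeout` so it runs without an async runtime. The `ResetDaemon` type and its methods are hypothetical and not taken from this PR.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the reset daemon, built on std threads.
pub struct ResetDaemon {
    stop: Arc<AtomicBool>,
    handle: Option<JoinHandle<()>>,
}

impl ResetDaemon {
    /// Spawns a background thread that zeroes `counter` once per `period`.
    pub fn spawn(counter: Arc<AtomicU64>, period: Duration) -> Self {
        let stop = Arc::new(AtomicBool::new(false));
        let stop_flag = Arc::clone(&stop);
        let handle = thread::spawn(move || {
            let mut deadline = Instant::now() + period;
            while !stop_flag.load(Ordering::Acquire) {
                let now = Instant::now();
                if now >= deadline {
                    // Period elapsed: reset the counter and schedule the next tick.
                    counter.store(0, Ordering::Release);
                    deadline = now + period;
                } else {
                    // Sleep until the deadline; `stop` unparks us early.
                    thread::park_timeout(deadline - now);
                }
            }
        });
        Self { stop, handle: Some(handle) }
    }

    /// Signals the thread to exit and waits for it to finish.
    pub fn stop(&mut self) {
        self.stop.store(true, Ordering::Release);
        if let Some(handle) = self.handle.take() {
            handle.thread().unpark();
            let _ = handle.join();
        }
    }
}
```

Tracking an explicit deadline means a spurious wakeup from `park_timeout` only causes another loop iteration, not an early reset. Unparking on stop keeps shutdown prompt even when the period is a full hour.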

Closes #41, closes #42

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@odgrim changed the title from "Fix TOCTOU races in guardrails and add hourly token reset" to "Fix guardrails TOCTOU races and wire hourly reset daemon" on Feb 21, 2026