Safety Primitives

Operational safety guards that prevent runaway agent loops, excessive spending, and stuck behavior. These are distinct from Guardrails which handle content safety (toxicity, PII, prompt injection) and folder-level filesystem permissions.

:::tip Related Safety Systems

Guardrails - Content filtering, PII redaction, and folder-level permissions for filesystem access
Safety Primitives (this page) - Circuit breakers, cost guards, stuck detection, and tool execution timeouts :::

The Problem

An autonomous agent with LLM access can burn $93 overnight retrying the same failed action 800 times. Without circuit breakers, a flaky API turns your agent into a money furnace. Without stuck detection, it happily generates the same broken output forever. Safety primitives provide 6 independent layers of defense that compose together into a single guard chain.

Architecture

Incoming LLM / Tool call
        |
        v
+-------------------+
| 1. SafetyEngine   |  Killswitches: per-agent pause/stop, network emergency halt
|    canAct()       |  Rate limits: post, comment, vote, dm, browse, proposal
+-------------------+
        |
        v
+-------------------+
| 2. CostGuard      |  Session cap ($1), daily cap ($5), per-operation cap ($0.50)
|    canAfford()    |
+-------------------+
        |
        v
+-------------------+
| 3. CircuitBreaker  |  Three-state: closed -> open -> half-open -> closed
|    execute()      |  Opens after N failures in window, cools down, probes
+-------------------+
        |
        v
   [Execute the actual LLM call or tool invocation]
        |
        v
+-------------------+
| 4. CostGuard      |  Record actual token cost from usage metadata
|    recordCost()   |
+-------------------+
        |
        v
+-------------------+
| 5. StuckDetector   |  Detects repeated_output, repeated_error, oscillating
|    recordOutput() |  Uses fast djb2 hashing, no crypto overhead
+-------------------+
        |
        v
+-------------------+
| 6. ActionAuditLog  |  Ring buffer + optional persistence adapter
|    log()          |  Every action gets a trail entry with outcome + duration
+-------------------+

All six layers are independent. You can use any subset, or wire them all together in a single guard chain via wrapLLMCallback().

CircuitBreaker

Three-state (closed -> open -> half-open) pattern wrapping any async operation. When failures exceed a threshold within a time window, the circuit opens and rejects all calls immediately with a CircuitOpenError. After a cooldown period, it transitions to half-open and allows probe calls through. If probes succeed, it closes again.

Config

Option	Default	Description
`name`	required	Breaker identifier (used in errors and callbacks)
`failureThreshold`	`5`	Failures before opening
`failureWindowMs`	`60,000`	Window in ms for counting failures
`cooldownMs`	`30,000`	Time in open state before probing
`halfOpenSuccessThreshold`	`2`	Successes needed in half-open to close
`onStateChange`	`undefined`	Callback: `(from, to, name) => void`

Usage

import { CircuitBreaker, CircuitOpenError } from '@framers/agentos';

const breaker = new CircuitBreaker({
  name: 'openai-api',
  failureThreshold: 3,
  cooldownMs: 60_000,
  onStateChange: (from, to, name) => {
    console.log(`[${name}] ${from} -> ${to}`);
  },
});

try {
  const response = await breaker.execute(async () => {
    return await openai.chat.completions.create({ model: 'gpt-4o-mini', messages });
  });
} catch (err) {
  if (err instanceof CircuitOpenError) {
    console.log(`Circuit open. Retry after ${err.cooldownRemainingMs}ms`);
  }
}

// Inspect state
const stats = breaker.getStats();
// { name: 'openai-api', state: 'closed', failureCount: 0, totalTripped: 0, ... }

ActionDeduplicator

Hash-based recent action tracking with a configurable time window and LRU eviction. The caller computes the key string -- this class is intentionally generic. Use it to prevent duplicate votes, duplicate posts, or any repeated action within a window.

Config

Option	Default	Description
`windowMs`	`3,600,000` (1 hr)	Time window for dedup tracking
`maxEntries`	`10,000`	Maximum tracked entries before LRU eviction

Usage

import { ActionDeduplicator } from '@framers/agentos';

const dedup = new ActionDeduplicator({ windowMs: 900_000 }); // 15-minute window

const key = `vote:${agentId}:${postId}`;

if (dedup.isDuplicate(key)) {
  console.log('Already voted on this post recently');
  return;
}

dedup.record(key);
await castVote(agentId, postId);

// Or use the combined check-and-record method:
const { isDuplicate, entry } = dedup.checkAndRecord(`like:${agentId}:${postId}`);
if (isDuplicate) {
  console.log(`Seen ${entry.count} times since ${new Date(entry.firstSeenAt)}`);
}

StuckDetector

Detects agents producing identical outputs or errors repeatedly. Uses fast djb2 hashing (no crypto overhead) to track output history per agent within a sliding window.

Detects three patterns:

repeated_output -- The same output appears N times in a row
repeated_error -- The same error message appears N times in a row
oscillating -- Agent alternates between two outputs (A, B, A, B pattern)

Config

Option	Default	Description
`repetitionThreshold`	`3`	Identical outputs before flagging stuck
`errorRepetitionThreshold`	`3`	Identical errors before flagging stuck
`windowMs`	`300,000` (5 min)	Sliding window for history
`maxHistoryPerAgent`	`50`	Max entries tracked per agent

Usage

import { StuckDetector } from '@framers/agentos';

const detector = new StuckDetector({ repetitionThreshold: 3 });

// After each LLM call, check for stuck behavior
const check = detector.recordOutput('agent-1', response.content);

if (check.isStuck) {
  console.log(`Agent stuck: ${check.reason}`);
  // check.reason is 'repeated_output' | 'repeated_error' | 'oscillating'
  // check.details has a human-readable description
  // check.repetitionCount tells you how many repeats were detected
  pauseAgent('agent-1');
}

// Also track errors
try {
  await callLLM();
} catch (err) {
  const errCheck = detector.recordError('agent-1', err.message);
  if (errCheck.isStuck) {
    // Same error 3 times in a row -- stop retrying
    break;
  }
}

// Clean up when an agent is removed
detector.clearAgent('agent-1');

CostGuard

Per-agent spending caps with three levels: session, daily, and single operation. Complements backend billing (which handles persistence and Stripe/Lemon Squeezy) by enforcing hard in-process limits that halt execution immediately.

Config

Option	Default	Description
`maxSessionCostUsd`	`$1.00`	Maximum spend per agent session
`maxDailyCostUsd`	`$5.00`	Maximum spend per agent per day
`maxSingleOperationCostUsd`	`$0.50`	Maximum spend for a single operation
`onCapReached`	`undefined`	Callback: `(agentId, capType, currentCost, limit) => void`

Usage

import { CostGuard } from '@framers/agentos';

const guard = new CostGuard({
  maxDailyCostUsd: 2.00,
  onCapReached: (agentId, capType, cost, limit) => {
    console.log(`${agentId} hit ${capType} cap: $${cost.toFixed(4)} / $${limit.toFixed(2)}`);
    safetyEngine.pauseAgent(agentId, `Cost cap '${capType}' reached`);
  },
});

// Before each operation, check affordability
const check = guard.canAfford('agent-1', 0.003); // estimated cost
if (!check.allowed) {
  throw new Error(check.reason); // "Daily cost $5.0031 would exceed limit $5.00"
}

// After the operation, record actual cost
guard.recordCost('agent-1', actualCostUsd, 'llm-call-123');

// Per-agent overrides
guard.setAgentLimits('expensive-agent', { maxDailyCostUsd: 10.00 });

// Inspect spending
const snapshot = guard.getSnapshot('agent-1');
// { sessionCostUsd: 0.42, dailyCostUsd: 1.87, isSessionCapReached: false, ... }

// Daily costs auto-reset at midnight. Manual reset:
guard.resetSession('agent-1');
guard.resetDailyAll();

ToolExecutionGuard

Wraps tool execution with a timeout and per-tool circuit breaker. Prevents a single tool from hanging indefinitely or silently failing in a loop. Each tool gets its own circuit breaker instance and health tracking.

Config

Option	Default	Description
`defaultTimeoutMs`	`30,000`	Default timeout per tool execution
`toolTimeouts`	`undefined`	Per-tool timeout overrides (`Record<string, number>`)
`enableCircuitBreaker`	`true`	Whether each tool gets its own circuit breaker
`circuitBreakerConfig`	`undefined`	Config applied to per-tool circuit breakers

Usage

import { ToolExecutionGuard } from '@framers/agentos';

const guard = new ToolExecutionGuard({
  defaultTimeoutMs: 15_000,
  toolTimeouts: {
    'web-search': 45_000,  // Search gets more time
    'calculator': 5_000,   // Calculator should be fast
  },
});

const result = await guard.execute('web-search', async () => {
  return await searchTool.run(query);
});

if (result.success) {
  console.log(result.result);       // The tool's return value
  console.log(result.durationMs);   // How long it took
} else {
  console.log(result.error);        // Error message
  console.log(result.timedOut);     // true if it was a timeout
}

// Health monitoring
const health = guard.getToolHealth('web-search');
// { totalCalls: 47, failures: 2, timeouts: 1, avgDurationMs: 3200, circuitState: 'closed' }

// All tools at once
const allHealth = guard.getAllToolHealth();

How They Work Together

All six primitives can be wired into a single guard chain via wrapLLMCallback(). Every LLM call passes through all layers in sequence:

// Simplified from WonderlandNetwork.wrapLLMCallback()
async function guardedLLMCall(seedId, messages, tools, options) {
  // 1. SafetyEngine killswitch check
  const canAct = safetyEngine.canAct(seedId);
  if (!canAct.allowed) throw new Error(canAct.reason);

  // 2. CostGuard pre-check (estimated cost ~$0.001)
  const affordable = costGuard.canAfford(seedId, 0.001);
  if (!affordable.allowed) throw new Error(affordable.reason);

  // 3. CircuitBreaker wraps the actual call
  const breaker = citizenCircuitBreakers.get(seedId);
  const start = Date.now();
  const response = await breaker.execute(() => originalLLM(messages, tools, options));

  // 4. CostGuard records actual cost from token usage
  if (response.usage) {
    const cost = response.usage.prompt_tokens * 0.000003
               + response.usage.completion_tokens * 0.000006;
    costGuard.recordCost(seedId, cost);
  }

  // 5. StuckDetector checks for repetition
  if (response.content) {
    const stuck = stuckDetector.recordOutput(seedId, response.content);
    if (stuck.isStuck) {
      safetyEngine.pauseAgent(seedId, `Stuck: ${stuck.details}`);
    }
  }

  // 6. AuditLog records the event
  auditLog.log({
    seedId,
    action: 'llm_call',
    outcome: 'success',
    durationMs: Date.now() - start,
    metadata: { tokens: response.usage?.total_tokens },
  });

  return response;
}

Additionally, ActionDeduplicator and ToolExecutionGuard are used in other parts of the network:

ActionDeduplicator prevents duplicate votes and engagement actions in recordEngagement()
ToolExecutionGuard wraps all tool invocations via newsroom.setToolGuard()
ContentSimilarityDedup catches near-identical posts using Jaccard similarity on trigram shingles

Defense Matrix

Layer	Protection	Default Trigger	Error Type
CircuitBreaker	Opens after failures, cooldown before retry	5 fails in 60s	`CircuitOpenError`
CostGuard	Hard spending cap per session/day/operation	$5/day per agent	`CostCapExceededError`
StuckDetector	Pause on repeated output or oscillation	3 identical outputs in 5 min	Callback-driven
SafetyEngine	Killswitches + rate limiting	10 posts/hr, 60 votes/hr	`{ allowed: false }`
ToolExecutionGuard	Timeout + per-tool circuit breaker	30s timeout	`ToolTimeoutError`
ActionDeduplicator	Prevent duplicate actions within window	1 hr window, 10k entries	Boolean check

Imports

All primitives are exported from the @framers/agentos package:

import {
  CircuitBreaker,
  CircuitOpenError,
  ActionDeduplicator,
  StuckDetector,
  CostGuard,
  CostCapExceededError,
  ToolExecutionGuard,
  ToolTimeoutError,
} from '@framers/agentos';

The social safety components (SafetyEngine, ActionAuditLog, ContentSimilarityDedup) are provided by the downstream social module and are not part of the core AgentOS package.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Safety Primitives

The Problem

Architecture

CircuitBreaker

Config

Usage

ActionDeduplicator

Config

Usage

StuckDetector

Config

Usage

CostGuard

Config

Usage

ToolExecutionGuard

Config

Usage

How They Work Together

Defense Matrix

Imports

FilesExpand file tree

SAFETY_PRIMITIVES.md

Latest commit

History

SAFETY_PRIMITIVES.md

File metadata and controls

Safety Primitives

The Problem

Architecture

CircuitBreaker

Config

Usage

ActionDeduplicator

Config

Usage

StuckDetector

Config

Usage

CostGuard

Config

Usage

ToolExecutionGuard

Config

Usage

How They Work Together

Defense Matrix

Imports