Performance Monitoring Guide

Comprehensive guide to monitoring, debugging, and optimizing Claude Code and oh-my-claudecode performance.

Overview
Built-in Monitoring
HUD Integration
Debugging Techniques
External Resources
Best Practices
Troubleshooting

Overview

oh-my-claudecode provides comprehensive monitoring capabilities for tracking agent performance, token usage, costs, and identifying bottlenecks in multi-agent workflows. This guide covers both built-in tools and external resources for monitoring Claude's performance.

What You Can Monitor

Metric	Tool	Granularity
Agent lifecycle	Agent Observatory	Per-agent
Tool timing	Session Replay	Per-tool call
Token usage	Analytics System	Per-session/agent
API costs	Analytics System	Per-session/daily/monthly
File ownership	Subagent Tracker	Per-file
Parallel efficiency	Observatory	Real-time

Built-in Monitoring

Agent Observatory

The Agent Observatory provides real-time visibility into all running agents, their performance metrics, and potential issues.

Accessing the Observatory

The observatory is automatically displayed in the HUD when agents are running. You can also query it programmatically:

import { getAgentObservatory } from 'oh-my-claudecode/hooks/subagent-tracker';

const obs = getAgentObservatory(process.cwd());
console.log(obs.header);  // "Agent Observatory (3 active, 85% efficiency)"
obs.lines.forEach(line => console.log(line));

Observatory Output

Agent Observatory (3 active, 85% efficiency)
🟢 [a1b2c3d] executor 45s tools:12 tokens:8k $0.15 files:3
🟢 [e4f5g6h] document-specialist 30s tools:5 tokens:3k $0.08
🟡 [i7j8k9l] architect 120s tools:8 tokens:15k $0.42
   └─ bottleneck: Grep (2.3s avg)
⚠ architect: Cost $0.42 exceeds threshold

Status Indicators

Icon	Meaning
🟢	Healthy - agent running normally
🟡	Warning - intervention suggested
🔴	Critical - stale agent (>5 min)

Key Metrics

Metric	Description
`tools:N`	Number of tool calls made
`tokens:Nk`	Approximate token usage (thousands)
`$X.XX`	Estimated cost in USD
`files:N`	Files being modified
`bottleneck`	Slowest repeated tool operation

Token & Cost Analytics

OMC automatically tracks token usage and costs across all sessions.

CLI Commands

# View everything (default dashboard)
omc

# View daily/weekly/monthly cost reports
omc cost daily
omc cost weekly
omc cost monthly

# View session history
omc sessions

# Agent observability is shown in HUD/replay logs
# (legacy standalone agent-breakdown command was removed)

# Export data
omc export cost csv ./costs.csv

Real-time HUD Display

Enable the analytics preset for detailed cost tracking in your status line:

{
  "omcHud": {
    "preset": "analytics"
  }
}

This shows:

Session cost and tokens
Cost per hour
Cache efficiency (% of tokens from cache)
Budget warnings (>$2 warning, >$5 critical)

Backfill Historical Data

Analyze historical Claude Code transcripts:

# Preview available transcripts
omc backfill --dry-run

# Backfill all transcripts
omc backfill

# Backfill specific project
omc backfill --project "*my-project*"

# Backfill recent only
omc backfill --from "2026-01-01"

Session Replay

Session replay records agent lifecycle events as JSONL for post-session analysis and timeline visualization.

Event Types

Event	Description
`agent_start`	Agent spawned with task info
`agent_stop`	Agent completed/failed with duration
`tool_start`	Tool invocation begins
`tool_end`	Tool completes with timing
`file_touch`	File modified by agent
`intervention`	System intervention triggered

Replay Files

Replay data is stored at: .omc/state/agent-replay-{sessionId}.jsonl

Each line is a JSON event:

{"t":0.0,"agent":"a1b2c3d","agent_type":"executor","event":"agent_start","task":"Implement feature","parent_mode":"ultrawork"}
{"t":5.2,"agent":"a1b2c3d","event":"tool_start","tool":"Read"}
{"t":5.4,"agent":"a1b2c3d","event":"tool_end","tool":"Read","duration_ms":200,"success":true}

Analyzing Replay Data

import { getReplaySummary } from 'oh-my-claudecode/hooks/subagent-tracker/session-replay';

const summary = getReplaySummary(process.cwd(), sessionId);

console.log(`Duration: ${summary.duration_seconds}s`);
console.log(`Agents: ${summary.agents_spawned} spawned, ${summary.agents_completed} completed`);
console.log(`Bottlenecks:`, summary.bottlenecks);
console.log(`Files touched:`, summary.files_touched);

Bottleneck Detection

The replay system automatically identifies bottlenecks:

Tools averaging >1s with 2+ calls
Per-agent tool timing analysis
Sorted by impact (highest avg time first)

HUD Integration

Presets

Preset	Focus	Elements
`minimal`	Clean status	Context bar only
`focused`	Task progress	Todos, agents, modes
`full`	Everything	All elements enabled
`analytics`	Cost tracking	Tokens, costs, efficiency
`dense`	Compact all	Compressed format

Configuration

Edit ~/.claude/settings.json:

{
  "omcHud": {
    "preset": "focused",
    "elements": {
      "agents": true,
      "todos": true,
      "contextBar": true,
      "analytics": true
    }
  }
}

Custom Elements

Element	Description
`agents`	Active agent count and status
`todos`	Todo progress (completed/total)
`ralph`	Ralph loop iteration count
`autopilot`	Autopilot phase indicator
`contextBar`	Context window usage %
`analytics`	Token/cost summary

Debugging Techniques

Identifying Slow Agents

Check the Observatory for agents running >2 minutes
Look for bottleneck indicators (tool averaging >1s)
Review tool_usage in agent state

import { getAgentPerformance } from 'oh-my-claudecode/hooks/subagent-tracker';

const perf = getAgentPerformance(process.cwd(), agentId);
console.log('Tool timings:', perf.tool_timings);
console.log('Bottleneck:', perf.bottleneck);

Detecting File Conflicts

When multiple agents modify the same file:

import { detectFileConflicts } from 'oh-my-claudecode/hooks/subagent-tracker';

const conflicts = detectFileConflicts(process.cwd());
conflicts.forEach(c => {
  console.log(`File ${c.file} touched by: ${c.agents.join(', ')}`);
});

Intervention System

OMC automatically detects problematic agents:

Intervention	Trigger	Action
`timeout`	Agent running >5 min	Kill suggested
`excessive_cost`	Cost >$1.00	Warning
`file_conflict`	Multiple agents on file	Warning

import { suggestInterventions } from 'oh-my-claudecode/hooks/subagent-tracker';

const interventions = suggestInterventions(process.cwd());
interventions.forEach(i => {
  console.log(`${i.type}: ${i.reason} → ${i.suggested_action}`);
});

Parallel Efficiency Score

Track how well your parallel agents are performing:

import { calculateParallelEfficiency } from 'oh-my-claudecode/hooks/subagent-tracker';

const eff = calculateParallelEfficiency(process.cwd());
console.log(`Efficiency: ${eff.score}%`);
console.log(`Active: ${eff.active}, Stale: ${eff.stale}, Total: ${eff.total}`);

100%: All agents actively working
<80%: Some agents stale or waiting
<50%: Significant parallelization issues

Stale Agent Cleanup

Clean up agents that exceed the timeout threshold:

import { cleanupStaleAgents } from 'oh-my-claudecode/hooks/subagent-tracker';

const cleaned = cleanupStaleAgents(process.cwd());
console.log(`Cleaned ${cleaned} stale agents`);

External Resources

Claude Performance Tracking Platforms

MarginLab.ai

MarginLab.ai provides external performance tracking for Claude models:

SWE-Bench-Pro daily tracking: Monitor Claude's performance on software engineering benchmarks
Statistical significance testing: Detect performance degradation with confidence intervals
Historical trends: Track Claude's capabilities over time
Model comparison: Compare performance across Claude model versions

Usage

Visit the platform to:

View current Claude model benchmark scores
Check historical performance trends
Set up alerts for significant performance changes
Compare across model versions (Opus, Sonnet, Haiku)

Community Resources

Resource	Description	Link
Claude Code Discord	Community support and tips	discord.gg/anthropic
OMC GitHub Issues	Bug reports and feature requests	GitHub Issues
Anthropic Documentation	Official Claude documentation	docs.anthropic.com

Model Performance Benchmarks

Track Claude's performance across standard benchmarks:

Benchmark	What It Measures	Where to Track
SWE-Bench	Software engineering tasks	MarginLab.ai
HumanEval	Code generation accuracy	Public leaderboards
MMLU	General knowledge	Anthropic blog

Best Practices

1. Monitor Token Usage Proactively

# Set up budget warnings in HUD
/oh-my-claudecode:hud
# Select "analytics" preset

2. Use Appropriate Model Tiers

Task Type	Recommended Model	Cost Impact
File lookup	Haiku	Lowest
Feature implementation	Sonnet	Medium
Architecture decisions	Opus	Highest

3. Enable Session Replay for Complex Tasks

Session replay is automatically enabled. Review replays after complex workflows:

# Find replay files
ls .omc/state/agent-replay-*.jsonl

# View recent events
tail -20 .omc/state/agent-replay-*.jsonl

4. Set Cost Limits

The default cost limit per agent is $1.00 USD. Agents exceeding this trigger warnings.

5. Review Bottlenecks Regularly

After completing complex tasks, check the replay summary:

const summary = getReplaySummary(cwd, sessionId);
if (summary.bottlenecks.length > 0) {
  console.log('Consider optimizing:', summary.bottlenecks[0]);
}

6. Clean Up Stale State

Periodically clean up old replay files and stale agent state:

import { cleanupReplayFiles } from 'oh-my-claudecode/hooks/subagent-tracker/session-replay';

cleanupReplayFiles(process.cwd()); // Keeps last 10 sessions

Troubleshooting

High Token Usage

Symptoms: Costs higher than expected, context window filling quickly

Solutions:

Use eco mode for token-efficient execution: eco fix all errors
Check for unnecessary file reads in agent prompts
Review the Agent Observatory in HUD (or replay logs) for agent-level breakdown
Enable cache - check cache efficiency in analytics

Slow Agent Execution

Symptoms: Agents running >5 minutes, low parallel efficiency

Solutions:

Check Observatory for bottleneck indicators
Review tool_usage for slow operations
Consider splitting large tasks into smaller agents
Use architect-low instead of architect for simple verifications

File Conflicts

Symptoms: Merge conflicts, unexpected file changes

Solutions:

Use ultrapilot mode for automatic file ownership
Check detectFileConflicts() before parallel execution
Review file_ownership in agent state
Use swarm mode with explicit task isolation

Missing Analytics Data

Symptoms: Empty cost reports, no session history

Solutions:

Run omc backfill to import historical transcripts
Verify HUD is running: /oh-my-claudecode:hud setup
Check .omc/state/ directory exists
Review token-tracking.jsonl for raw data

Stale Agent State

Symptoms: Observatory showing agents that aren't running

Solutions:

Run cleanupStaleAgents(cwd) programmatically
Delete .omc/state/subagent-tracking.json to reset
Check for orphaned lock files: .omc/state/subagent-tracker.lock

State Files Reference

File	Purpose	Format
`.omc/state/subagent-tracking.json`	Current agent states	JSON
`.omc/state/agent-replay-{id}.jsonl`	Session event timeline	JSONL
`.omc/state/token-tracking.jsonl`	Token usage log	JSONL
`.omc/state/analytics-summary-{id}.json`	Cached session summaries	JSON
`.omc/state/subagent-tracker.lock`	Concurrent access lock	Text

API Reference

Subagent Tracker

// Core tracking
getActiveAgentCount(directory: string): number
getRunningAgents(directory: string): SubagentInfo[]
getTrackingStats(directory: string): { running, completed, failed, total }

// Performance
getAgentPerformance(directory: string, agentId: string): AgentPerformance
getAllAgentPerformance(directory: string): AgentPerformance[]
calculateParallelEfficiency(directory: string): { score, active, stale, total }

// File ownership
recordFileOwnership(directory: string, agentId: string, filePath: string): void
detectFileConflicts(directory: string): Array<{ file, agents }>
getFileOwnershipMap(directory: string): Map<string, string>

// Interventions
suggestInterventions(directory: string): AgentIntervention[]
cleanupStaleAgents(directory: string): number

// Display
getAgentDashboard(directory: string): string
getAgentObservatory(directory: string): { header, lines, summary }

Session Replay

// Recording
recordAgentStart(directory, sessionId, agentId, agentType, task?, parentMode?, model?): void
recordAgentStop(directory, sessionId, agentId, agentType, success, durationMs?): void
recordToolEvent(directory, sessionId, agentId, toolName, eventType, durationMs?, success?): void
recordFileTouch(directory, sessionId, agentId, filePath): void

// Analysis
readReplayEvents(directory: string, sessionId: string): ReplayEvent[]
getReplaySummary(directory: string, sessionId: string): ReplaySummary

// Cleanup
cleanupReplayFiles(directory: string): number

FilesExpand file tree

PERFORMANCE-MONITORING.md

Latest commit

History

PERFORMANCE-MONITORING.md

File metadata and controls

Performance Monitoring Guide

Table of Contents

Overview

What You Can Monitor

Built-in Monitoring

Agent Observatory

Accessing the Observatory

Observatory Output

Status Indicators

Key Metrics

Token & Cost Analytics

CLI Commands

Real-time HUD Display

Backfill Historical Data

Session Replay

Event Types

Replay Files

Analyzing Replay Data

Bottleneck Detection

HUD Integration

Presets

Configuration

Custom Elements

Debugging Techniques

Identifying Slow Agents

Detecting File Conflicts

Intervention System

Parallel Efficiency Score

Stale Agent Cleanup

External Resources

Claude Performance Tracking Platforms

MarginLab.ai

Usage

Community Resources

Model Performance Benchmarks

Best Practices

1. Monitor Token Usage Proactively

2. Use Appropriate Model Tiers

3. Enable Session Replay for Complex Tasks

4. Set Cost Limits

5. Review Bottlenecks Regularly

6. Clean Up Stale State

Troubleshooting

High Token Usage

Slow Agent Execution

File Conflicts

Missing Analytics Data

Stale Agent State

State Files Reference

API Reference

Subagent Tracker

Session Replay

See Also