Strands CostGuard

A cost management library for the Strands Agents SDK with budget enforcement, adaptive model routing, and OpenTelemetry-compatible metrics.

Features

Budget Enforcement: Define budgets at tenant, strand, workflow, and run levels with configurable limits and actions
Adaptive Model Routing: Automatically route to fallback models based on budget utilization and other conditions
Cost Tracking: Track and attribute costs by tenant, strand, workflow, run, model, and tool
OpenTelemetry Metrics: Emit cost metrics compatible with OTel collectors for long-term storage and analysis
Flexible Policies: Configure via YAML files or environment variables
Persistent Budget State: Optional Valkey/Redis persistence for budget state across restarts

Requirements

Python 3.10+
Strands Agents SDK 0.1.0+

Installation

pip install strands-costguard

For persistence support:

pip install "strands-costguard[valkey]"

Quick Start

Running the Examples

# Install the package in development mode
pip install -e .

# Run the basic usage example
python examples/basic_usage.py

Basic Usage

from strands_costguard import (
    CostGuard,
    CostGuardConfig,
    FilePolicySource,
    ModelUsage,
)

# Initialize Cost Guard
config = CostGuardConfig(
    policy_source=FilePolicySource(path="./policies"),
    enable_budget_enforcement=True,
    enable_routing=True,
    enable_metrics=True,
)

guard = CostGuard(config=config)

# Start a run
decision = guard.on_run_start(
    tenant_id="prod-tenant",
    strand_id="analytics_assistant",
    workflow_id="data_analysis",
    run_id="run-123",
)

if not decision.allowed:
    print(f"Run rejected: {decision.reason}")
else:
    # Execute your agent loop...

    # Before model calls
    model_decision = guard.before_model_call(
        run_id="run-123",
        model_name="gpt-4o",
        stage="planning",
        prompt_tokens_estimate=500,
    )

    # Use the effective model (may be downgraded)
    effective_model = model_decision.effective_model

    # After model calls
    guard.after_model_call(
        run_id="run-123",
        usage=ModelUsage.from_response(
            model_name=effective_model,
            prompt_tokens=500,
            completion_tokens=200,
        ),
    )

    # End the run
    guard.on_run_end("run-123", "completed")

# Shutdown (flushes metrics)
guard.shutdown()

Configuration

Budget Policies (budgets.yaml)

budgets:
  - id: "tenant-default"
    scope: "tenant"
    match:
      tenant_id: "*"
    period: "monthly"
    max_cost: 1000.0
    soft_thresholds: [0.7, 0.9, 1.0]
    hard_limit: true
    on_soft_threshold_exceeded: "DOWNGRADE_MODEL"
    on_hard_limit_exceeded: "REJECT_NEW_RUNS"

  - id: "analytics-strand"
    scope: "strand"
    match:
      strand_id: "analytics_assistant"
    period: "daily"
    max_cost: 50.0
    max_runs_per_period: 1000
    max_concurrent_runs: 100
    constraints:
      max_iterations_per_run: 8
      max_tool_calls_per_run: 20
      max_model_tokens_per_run: 30000
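
The way `soft_thresholds`, `hard_limit`, and the two `on_*_exceeded` fields interact can be sketched in plain Python. This is only an illustration of the policy semantics, not CostGuard's implementation: `BudgetPolicy` and `resolve_action` are hypothetical names, and the sketch assumes the hard limit is reached at 100% utilization of `max_cost`.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class BudgetPolicy:
    """Hypothetical in-memory mirror of one entry in budgets.yaml."""

    max_cost: float
    soft_thresholds: list[float] = field(default_factory=list)
    hard_limit: bool = False
    on_soft_threshold_exceeded: str = "LOG_ONLY"
    on_hard_limit_exceeded: str = "REJECT_NEW_RUNS"


def resolve_action(policy: BudgetPolicy, spent: float) -> str | None:
    """Return the action name triggered by the current spend, or None."""
    utilization = spent / policy.max_cost
    # Assumption: the hard limit applies once spend reaches max_cost.
    if policy.hard_limit and utilization >= 1.0:
        return policy.on_hard_limit_exceeded
    # Crossing any soft threshold triggers the soft action.
    if any(utilization >= t for t in policy.soft_thresholds):
        return policy.on_soft_threshold_exceeded
    return None


# Values taken from the "tenant-default" budget above.
policy = BudgetPolicy(
    max_cost=1000.0,
    soft_thresholds=[0.7, 0.9, 1.0],
    hard_limit=True,
    on_soft_threshold_exceeded="DOWNGRADE_MODEL",
    on_hard_limit_exceeded="REJECT_NEW_RUNS",
)

for spent in (500.0, 750.0, 1200.0):
    print(spent, resolve_action(policy, spent))
```

Under these assumptions, a spend of 750.0 against the 1000.0 budget crosses the 0.7 threshold and yields `DOWNGRADE_MODEL`, while 1200.0 yields `REJECT_NEW_RUNS`.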

Routing Policies (routing.yaml)

routing_policies:
  - id: "default-routing"
    match:
      strand_id: "*"
    stages:
      - stage: "planning"
        default_model: "gpt-4o-mini"
        max_tokens: 2000
      - stage: "synthesis"
        default_model: "gpt-4o"
        fallback_model: "gpt-4o-mini"
        trigger_downgrade_on:
          soft_threshold_exceeded: true
          remaining_budget_below: 5.0
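
As a rough sketch of what a stage entry like the "synthesis" one expresses, the function below picks between `default_model` and `fallback_model`. The names `StageRoute` and `choose_model` are hypothetical, and the sketch assumes the two `trigger_downgrade_on` conditions are combined with OR; the library may evaluate them differently.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class StageRoute:
    """Hypothetical mirror of one stage entry in routing.yaml."""

    default_model: str
    fallback_model: str | None = None
    downgrade_on_soft_threshold: bool = False
    remaining_budget_below: float | None = None


def choose_model(
    route: StageRoute,
    *,
    soft_threshold_exceeded: bool,
    remaining_budget: float,
) -> str:
    """Return the model to use for this stage under the current budget state."""
    # Without a fallback there is nothing to downgrade to.
    if route.fallback_model is None:
        return route.default_model
    # Assumption: either trigger alone is enough to downgrade.
    if route.downgrade_on_soft_threshold and soft_threshold_exceeded:
        return route.fallback_model
    if (
        route.remaining_budget_below is not None
        and remaining_budget < route.remaining_budget_below
    ):
        return route.fallback_model
    return route.default_model


# Values taken from the "synthesis" stage above.
synthesis = StageRoute(
    default_model="gpt-4o",
    fallback_model="gpt-4o-mini",
    downgrade_on_soft_threshold=True,
    remaining_budget_below=5.0,
)

print(choose_model(synthesis, soft_threshold_exceeded=False, remaining_budget=20.0))
print(choose_model(synthesis, soft_threshold_exceeded=False, remaining_budget=4.0))
```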

Pricing Table (pricing.yaml)

# Example rates per 1,000 tokens; replace them with your provider's current prices.
pricing:
  currency: "USD"
  models:
    "gpt-4o":
      input_per_1k: 2.50
      output_per_1k: 10.00
    "gpt-4o-mini":
      input_per_1k: 0.15
      output_per_1k: 0.60
  tools:
    "web_search":
      cost_per_call: 0.01
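
To make the per-1k units concrete, the arithmetic implied by `input_per_1k` and `output_per_1k` can be written out as below. This is a standalone sketch, not the library's cost calculator: `PRICING` simply copies the example rates from the table above, and `model_call_cost` is a hypothetical helper.

```python
# Example rates copied from the pricing table above (currency units per 1,000 tokens).
PRICING = {
    "gpt-4o": {"input_per_1k": 2.50, "output_per_1k": 10.00},
    "gpt-4o-mini": {"input_per_1k": 0.15, "output_per_1k": 0.60},
}


def model_call_cost(model_name: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Cost of one model call: tokens are billed in units of 1,000."""
    rates = PRICING[model_name]
    input_cost = (prompt_tokens / 1000) * rates["input_per_1k"]
    output_cost = (completion_tokens / 1000) * rates["output_per_1k"]
    return input_cost + output_cost


# The token counts used in the Basic Usage example: 500 prompt, 200 completion.
for model in ("gpt-4o", "gpt-4o-mini"):
    print(model, round(model_call_cost(model, 500, 200), 4))
```

With these example rates, the 500/200-token call costs 0.5 × 2.50 + 0.2 × 10.00 = 3.25 on "gpt-4o" and 0.5 × 0.15 + 0.2 × 0.60 = 0.195 on "gpt-4o-mini".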

Lifecycle Hooks

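CostGuard integrates with your agent runtime through lifecycle hooks. Hooks that run before an action return a decision object; hooks that run after it return nothing: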

Hook	When Called	Returns
`on_run_start()`	Before starting a new run	`AdmissionDecision`
`on_run_end()`	After a run completes	None
`before_iteration()`	Before each agent loop iteration	`IterationDecision`
`after_iteration()`	After each iteration completes	None
`before_model_call()`	Before each model call	`ModelDecision`
`after_model_call()`	After each model call	None
`before_tool_call()`	Before each tool call	`ToolDecision`
`after_tool_call()`	After each tool call	None
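
The sketch below shows one plausible way the hooks in the table nest inside an agent loop. It does not use the real library: `RecordingGuard` is a hypothetical stand-in that only records hook names, the arguments passed to each hook are placeholders, and the order within an iteration (model call first, then tool calls) is an assumption about a typical loop.

```python
HOOK_NAMES = {
    "on_run_start", "on_run_end",
    "before_iteration", "after_iteration",
    "before_model_call", "after_model_call",
    "before_tool_call", "after_tool_call",
}


class RecordingGuard:
    """Hypothetical stand-in that records hook names instead of enforcing budgets."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def __getattr__(self, name: str):
        # Only the eight hook names from the table are accepted.
        if name not in HOOK_NAMES:
            raise AttributeError(name)

        def hook(*args, **kwargs) -> None:
            # A real guard would return a decision object from the before_* hooks.
            self.calls.append(name)

        return hook


def run_agent(guard, run_id: str, iterations: list[list[str]]) -> None:
    """Drive one run; each inner list names the tools called in that iteration."""
    guard.on_run_start(run_id=run_id)
    for tools in iterations:
        guard.before_iteration(run_id=run_id)
        guard.before_model_call(run_id=run_id)
        guard.after_model_call(run_id=run_id)
        for tool_name in tools:
            guard.before_tool_call(run_id=run_id, tool_name=tool_name)
            guard.after_tool_call(run_id=run_id, tool_name=tool_name)
        guard.after_iteration(run_id=run_id)
    guard.on_run_end(run_id=run_id)


guard = RecordingGuard()
run_agent(guard, "run-123", [["web_search"], []])
print("\n".join(guard.calls))
```

In a real integration, each decision returned by a `before_*` hook would be checked before proceeding, as the Basic Usage example does with `decision.allowed`.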

OpenTelemetry Metrics

Enabling OTLP Export

To export metrics to an OpenTelemetry collector, configure StrandsTelemetry before initializing CostGuard:

from strands.telemetry.config import StrandsTelemetry
from strands_costguard import CostGuard, CostGuardConfig, FilePolicySource

# Configure telemetry with OTLP export
telemetry = StrandsTelemetry()
telemetry.setup_otlp_exporter(endpoint="http://localhost:4317")
telemetry.setup_meter(enable_otlp_exporter=True)

# Initialize CostGuard (will use the global MeterProvider)
config = CostGuardConfig(
    policy_source=FilePolicySource(path="./policies"),
    enable_metrics=True,
)
guard = CostGuard(config=config)

Requirements:

An OpenTelemetry collector running at the specified endpoint (default: localhost:4317)

For local development, you can run a collector with Docker:

docker run -p 4317:4317 otel/opentelemetry-collector:latest

Disabling OTLP Export:

If you don't have a collector running, disable OTLP export to avoid connection errors:

telemetry.setup_meter(enable_otlp_exporter=False)

Metrics Reference

CostGuard emits the following metrics:

Metric	Type	Description
`genai.cost.total`	Counter	Total cost in currency units
`genai.cost.model`	Counter	Cost per model
`genai.cost.tool`	Counter	Cost per tool
`genai.tokens.input`	Counter	Total input tokens
`genai.tokens.output`	Counter	Total output tokens
`genai.agent.iterations`	Counter	Agent loop iterations
`genai.agent.tool_calls`	Counter	Tool calls
`genai.cost.downgrade_events`	Counter	Model downgrade events
`genai.cost.rejection_events`	Counter	Run rejection events

Metrics include resource attributes:

service.name, service.namespace, deployment.environment
strands.tenant_id, strands.strand_id, strands.workflow_id

Budget Scopes and Priority

Budgets can be defined at multiple scopes. When scopes overlap, the higher-priority scope takes precedence. From lowest to highest priority:

Global (lowest priority) - Default limits for all
Tenant - Organization-level limits
Strand - Agent definition limits
Workflow (highest priority) - Specific workflow limits

When multiple budgets match, their constraints are merged, with more specific budgets taking priority.
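
One way to picture the merge is a dictionary update applied from the least specific scope to the most specific one. This is a sketch under assumptions, not the library's merge logic: `SCOPE_PRIORITY` and `merge_constraints` are hypothetical, the tenant-level constraint values are invented for illustration, and "taking priority" is read here as a per-key override.

```python
# Scope order from the list above, lowest priority first.
SCOPE_PRIORITY = {"global": 0, "tenant": 1, "strand": 2, "workflow": 3}


def merge_constraints(budgets: list[dict]) -> dict:
    """Merge constraints so that more specific scopes override less specific ones."""
    merged: dict = {}
    # Apply the lowest-priority scope first so later updates win per key.
    for budget in sorted(budgets, key=lambda b: SCOPE_PRIORITY[b["scope"]]):
        merged.update(budget.get("constraints", {}))
    return merged


matching = [
    {
        "scope": "strand",
        "constraints": {"max_iterations_per_run": 8, "max_tool_calls_per_run": 20},
    },
    {
        # Hypothetical tenant-level constraints, for illustration only.
        "scope": "tenant",
        "constraints": {"max_iterations_per_run": 12, "max_model_tokens_per_run": 50000},
    },
]

merged = merge_constraints(matching)
print(merged)
```

Here the strand budget overrides the tenant's `max_iterations_per_run`, while the tenant's `max_model_tokens_per_run` is kept because the strand budget does not set it.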

Threshold Actions

Actions available when a soft threshold is exceeded (`on_soft_threshold_exceeded`):

Action	Effect
`LOG_ONLY`	Log warning, continue normally
`DOWNGRADE_MODEL`	Switch to fallback models
`LIMIT_CAPABILITIES`	Reduce max tokens/iterations
`HALT_NEW_RUNS`	Reject new runs

Actions available when the hard limit is exceeded (`on_hard_limit_exceeded`):

Action	Effect
`HALT_RUN`	Stop the current run
`REJECT_NEW_RUNS`	Reject new runs only

Development

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Type checking
mypy src/

# Linting
ruff check src/

License

Apache-2.0
