This guide shows how to run, compose, and extend the CUGAR Agent stack via CLI and Python. All commands assume Python >=3.10 and uv for environment management.
```shell
uv sync --all-extras --dev
uv run playwright install --with-deps chromium
cp .env.example .env
```

Set provider keys (e.g., `OPENAI_API_KEY`, `LANGFUSE_SECRET`) inside `.env` or your shell.
The Typer CLI is exposed as `uv run cuga`.
```shell
uv run cuga start demo
```

- Starts the registry and demo tool servers on the default sandbox profile.
- Uses `configs/agent.demo.yaml` for model + policy defaults.
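For orientation, a sketch of what `configs/agent.demo.yaml` might look like. The key names here are illustrative assumptions, not the shipped schema; check the file itself for the real keys:

```yaml
# Hypothetical shape -- consult the shipped configs/agent.demo.yaml for real keys.
model:
  provider: openai
  name: gpt-4o-mini
policy:
  profile: demo
  max_steps: 10
```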
```shell
uv run python examples/run_langgraph_demo.py --goal "Draft a changelog from pull request notes" \
  --profile demo_power --observability langfuse
```

- Plans with ReAct; executes via the LangChain tool runtime.
- Sends traces to Langfuse if `LANGFUSE_SECRET` is set.
```shell
uv run cuga registry list --profile demo_power
uv run cuga profile validate --profile demo_power
```

- Registries live in `config/` and `registry.yaml`.
- Profiles apply sandbox isolation and guardrail enforcement.
```shell
uv run python scripts/load_corpus.py --source rag/sources --backend chroma
uv run python examples/rag_query.py --query "How does the planner select tools?" --backend chroma
```

- Uses `memory/vector_store.py` with an in-memory fallback when no backend is reachable.
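The fallback behavior can be illustrated with a minimal in-memory store. `InMemoryVectorStore` is a hypothetical name for this sketch; the real logic lives in `memory/vector_store.py`:

```python
import math

class InMemoryVectorStore:
    """Minimal fallback: hold (text, embedding) pairs and rank by cosine similarity."""

    def __init__(self):
        self._docs = []

    def add(self, text, vector):
        self._docs.append((text, vector))

    def query(self, vector, k=1):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.hypot(*a) * math.hypot(*b)
            return dot / norm if norm else 0.0

        # Sort all stored docs by similarity to the query vector, best first.
        ranked = sorted(self._docs, key=lambda doc: cosine(doc[1], vector), reverse=True)
        return [text for text, _ in ranked[:k]]
```

A real backend would persist embeddings and use an ANN index; the point here is only that queries keep working when no backend is reachable.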
```shell
uv run python examples/multi_agent_dispatch.py --goal "Summarize docs and propose next steps"
```

- Demonstrates coordinator/worker/tool-user roles with shared memory summaries.
```python
from cuga.modular.agents import PlannerAgent, WorkerAgent
from cuga.modular.tools import ToolRegistry, ToolSpec
from cuga.modular.memory import VectorMemory

registry = ToolRegistry([
    ToolSpec(name="echo", description="echo text", handler=lambda i, c: i["text"]),
])
memory = VectorMemory()

planner = PlannerAgent(registry=registry, memory=memory)
worker = WorkerAgent(registry=registry, memory=memory)

plan = planner.plan(goal="echo hello", metadata={"profile": "demo"})
result = worker.execute(plan.steps)
print(result.output)
```

- Implement a `ToolSpec` in `tools/registry.py` or wrap an MCP server.
- Register it via `ToolRegistry([...])` or YAML in `configs/tools.yaml`.
- Add tests under `tests/` to cover handler success/failure.
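A standalone sketch of a custom tool plus the success/failure checks the testing advice above calls for. The real `ToolSpec` and `ToolRegistry` live in `tools/registry.py`; the minimal versions here only mirror their apparent shape:

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class ToolSpec:
    name: str
    description: str
    handler: Callable[[dict, Any], Any]

class ToolRegistry:
    def __init__(self, specs):
        self._specs = {spec.name: spec for spec in specs}

    def call(self, name, payload, context=None):
        # Unknown tool names fail loudly rather than silently no-op.
        if name not in self._specs:
            raise KeyError(f"unknown tool: {name}")
        return self._specs[name].handler(payload, context)

# A custom tool: upper-cases input text; a missing "text" key raises KeyError.
upper = ToolSpec(
    name="upper",
    description="upper-case the given text",
    handler=lambda payload, ctx: payload["text"].upper(),
)
registry = ToolRegistry([upper])
```

Tests should exercise both paths: the happy path returning the handler's output, and the failure paths (unknown tool, malformed payload) raising.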
- Configure `configs/rag.yaml` with a storage path or remote vector DB.
- Use `rag/loader.py` to ingest content and `rag/retriever.py` for queries.
- Toggle `RAG_ENABLED=true` in `.env` to opt in.
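A sketch of what `configs/rag.yaml` might contain. The key names are assumptions for illustration; consult the shipped file for the real schema:

```yaml
# Hypothetical keys -- verify against the real configs/rag.yaml.
backend: chroma
storage_path: .rag/chroma
remote_url: null        # set to a vector-DB URL to use a remote backend
collection: cuga_docs
```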
Configuration precedence follows the canonical order described in AGENTS.md (Configuration Policy):

- CLI arguments (highest precedence)
- Environment variables (for Dynaconf use `DYNACONF_<SECTION>__<KEY>`, or explicit envs like `AGENT_*`, `OTEL_*`)
- `.env` files (project `.env`, `ops/env/*.env`, `.env.mcp`)
- YAML configs (e.g., `registry.yaml`, `configs/*.yaml`)
- TOML configs (e.g., `settings.toml`, `eval_config.toml`)
- Configuration defaults (Dynaconf validators)
- Hardcoded defaults (lowest precedence)
Notes:
- Deep merges are used for nested dictionaries (dicts merge key-by-key), while lists are replaced (no implicit list merging).
- Use `DYNACONF_` envvar prefixes for Dynaconf-managed sections when you need to override nested values from the environment (e.g. `DYNACONF_ADVANCED_FEATURES__LITE_MODE=false`).
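The merge rules above can be sketched in plain Python. This mirrors the described semantics (dicts merge, lists replace), not Dynaconf's internal implementation:

```python
def deep_merge(base, override):
    """Nested dicts merge key-by-key; lists and scalars from `override` replace `base`."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: recurse so untouched keys in base survive.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Lists and scalars are replaced wholesale, never concatenated.
            merged[key] = value
    return merged
```

Under these rules an env override of one nested key leaves its sibling keys intact, while overriding a list discards the old list entirely.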
Example: override the message window limit via env:

```shell
export DYNACONF_ADVANCED_FEATURES__MESSAGE_WINDOW_LIMIT=50
```

Guardrail & policy configuration examples (YAML snippets)
`configs/guardrail_policy.yaml` (example):

```yaml
tool_allowlist:
  - filesystem_read
  - web_search
tool_denylist:
  - dangerous_tool
parameter_schemas:
  filesystem_read:
    path:
      type: string
      required: true
      pattern: '^[a-zA-Z0-9/_\-\.]+$'
network_egress:
  allowed_domains:
    - api.openai.com
    - example.com
  block_localhost: true
  block_private_networks: true
budget:
  AGENT_BUDGET_CEILING: 100
  AGENT_BUDGET_POLICY: warn
  AGENT_ESCALATION_MAX: 2
```

Loading precedence sanity check (local)
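To make the `network_egress` policy concrete, a minimal sketch of how such checks could be enforced. `egress_allowed` and `ALLOWED_DOMAINS` are illustrative names, not the stack's actual enforcement code:

```python
import ipaddress
from urllib.parse import urlparse

# Mirrors the allowed_domains list from the YAML example above.
ALLOWED_DOMAINS = {"api.openai.com", "example.com"}

def egress_allowed(url, block_localhost=True, block_private_networks=True):
    host = urlparse(url).hostname or ""
    if block_localhost and host == "localhost":
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Plain hostname: enforce the domain allowlist.
        return host in ALLOWED_DOMAINS
    if block_localhost and addr.is_loopback:
        return False
    if block_private_networks and addr.is_private:
        return False
    # Raw IP literals never match the domain allowlist.
    return False
```

Note that a production check would also resolve hostnames before connecting, since an allowlisted name could otherwise point at a private address.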
```shell
# 1) Make a small TOML
cat > /tmp/test_settings.toml <<'TOML'
[features]
thoughts = false
TOML

# 2) Override via env
export DYNACONF_FEATURES__THOUGHTS=true

# 3) Create a Dynaconf instance (Python)
python - <<'PY'
from dynaconf import Dynaconf
settings = Dynaconf(settings_files=['/tmp/test_settings.toml'], envvar_prefix='DYNACONF')
print('thoughts:', settings.features.thoughts)
PY
```

- Langfuse: set `LANGFUSE_SECRET` + `LANGFUSE_PUBLIC_KEY`; calls are emitted via `observability/langfuse.py`.
- OpenInference/Traceloop: enable with `OPENINFERENCE_ENABLED=true` and configure URLs in `configs/observability.yaml`.
- All emitters redact secrets and run in the current profile sandbox.
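As an illustration of the redaction the emitters perform, a minimal sketch that masks env-style secret values before a line leaves the process. The pattern and function are assumptions for this example, not the actual `observability/langfuse.py` code:

```python
import re

# Mask values of known secret keys (and anything ending in API_KEY).
SECRET_PATTERN = re.compile(r"(LANGFUSE_SECRET|LANGFUSE_PUBLIC_KEY|[A-Z_]*API_KEY)=(\S+)")

def redact(line: str) -> str:
    """Replace secret values with *** while keeping the key name for debugging."""
    return SECRET_PATTERN.sub(lambda m: f"{m.group(1)}=***", line)
```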
- Ports 7860/8000/8001/9000 must be free for demos.
- If Playwright browsers are missing, re-run `uv run playwright install --with-deps chromium`.
- Use `--verbose` flags for detailed logs; traces are stored under `logs/` when enabled.
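A quick way to check whether the demo ports are free before starting. `port_free` is a hypothetical helper, not part of the CLI:

```python
import socket

def port_free(port, host="127.0.0.1"):
    """Attempt to bind the port; success means nothing is currently listening on it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
            return True
        except OSError:
            return False

# The ports the demos expect to be available.
for port in (7860, 8000, 8001, 9000):
    print(f"port {port}: {'free' if port_free(port) else 'in use'}")
```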