docs: add competitive comparison page (vs NeMo, Guardrails AI, LiteLLM, Portkey)

Copilot · imran-siddique · web-flow · commit 091c70db43a9 · 2026-03-15T10:46:56.000-07:00
* Initial plan

* docs: add COMPARISON.md competitive comparison page and link from README

Co-authored-by: imran-siddique &lt;45405841+imran-siddique@users.noreply.github.com&gt;

---------

Co-authored-by: copilot-swe-agent[bot] &lt;198982749+Copilot@users.noreply.github.com&gt;
Co-authored-by: imran-siddique &lt;45405841+imran-siddique@users.noreply.github.com&gt;
diff --git a/README.md b/README.md
@@ -186,6 +186,7 @@ Full methodology, per-adapter breakdowns, and memory profiling: **[BENCHMARKS.md
 - **[Azure Deployment Guides](docs/deployment/README.md)** — AKS, Azure AI Foundry, Container Apps, OpenClaw sidecar
 - **[NIST RFI Mapping](docs/nist-rfi-mapping.md)** — Question-by-question mapping to NIST AI Agent Security RFI (2026-00206)
 - [OWASP Compliance Mapping](docs/OWASP-COMPLIANCE.md)
+- [Competitive Comparison](docs/COMPARISON.md) — vs. NeMo Guardrails, Guardrails AI, LiteLLM, Portkey
 - [CSA Agentic Trust Framework Mapping](docs/CSA-ATF-PROPOSAL.md)
 - [Performance Benchmarks](BENCHMARKS.md)
 - [Changelog](CHANGELOG.md)
diff --git a/docs/COMPARISON.md b/docs/COMPARISON.md
@@ -0,0 +1,172 @@
+# Competitive Comparison: Agent Governance Toolkit vs. Alternatives
+
+> **TL;DR:** They guard LLM outputs. We govern agent actions. Complementary, not competing.
+
+---
+
+## Overview
+
+When evaluating agent security tooling, developers often encounter [NeMo Guardrails](https://github.com/NVIDIA/NeMo-Guardrails), [Guardrails AI](https://github.com/guardrails-ai/guardrails), [LiteLLM](https://github.com/BerriAI/litellm), and [Portkey](https://portkey.ai/). These are widely-used, well-regarded tools — but they solve a fundamentally **different problem**.
+
+| Tool | Core Focus | Primary User |
+|------|-----------|--------------|
+| **Agent Governance Toolkit** | Agent action governance, identity, sandboxing, SRE | Platform / security teams deploying autonomous agents |
+| NeMo Guardrails | Conversational rail constraints on LLM responses | Developers building chatbots and dialog systems |
+| Guardrails AI | LLM output validation and structured data extraction | Developers needing reliable structured outputs from LLMs |
+| LiteLLM | Unified LLM API gateway / proxy | Teams managing multi-provider LLM access |
+| Portkey | LLM observability, caching, and routing gateway | Teams optimizing LLM cost, reliability, and visibility |
+
+---
+
+## Feature Comparison
+
+| Feature | Agent Governance Toolkit | NeMo Guardrails | Guardrails AI | LiteLLM | Portkey |
+|---------|:----------------------:|:---------------:|:-------------:|:-------:|:-------:|
+| **Agent action governance** | ✅ | ❌ | ❌ | ❌ | ❌ |
+| **LLM output validation** | ✅ (via [content-policy adapters](../packages/agent-os/)) | ✅ | ✅ | ✅ | ✅ |
+| **Agent identity (cryptographic)** | ✅ Ed25519 / SPIFFE | ❌ | ❌ | ❌ | ❌ |
+| **Execution sandboxing** | ✅ 4-tier rings | ❌ | ❌ | ❌ | ❌ |
+| **SRE (SLOs / error budgets)** | ✅ | ❌ | ❌ | ❌ | ❌ |
+| **Inter-agent trust mesh** | ✅ | ❌ | ❌ | ❌ | ❌ |
+| **Least-privilege capability model** | ✅ | ❌ | ❌ | ❌ | ❌ |
+| **Deterministic pre-execution enforcement** | ✅ < 0.1 ms | ❌ | ❌ | ❌ | ❌ |
+| **Chaos / replay testing** | ✅ | ❌ | ❌ | ❌ | ❌ |
+| **OWASP Agentic Top 10 coverage** | **10 / 10** | ~2 / 10 ¹ | ~1 / 10 ¹ | ~0 / 10 ¹ | ~1 / 10 ¹ |
+| **Framework integrations** | **12+** | 3 (LangChain, NeMo-based, custom) | 2 (LangChain, custom) | N/A (gateway) | N/A (gateway) |
+| **LLM provider routing / caching** | ❌ | ❌ | ❌ | ✅ | ✅ |
+| **Works alongside existing tools** | ✅ | ✅ | ✅ | ✅ | ✅ |
+
+> ¹ **OWASP scoring methodology:** Each tool was assessed against the ten [OWASP Agentic Top 10 (2026)](https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/) risk categories. A risk is counted as "covered" only when the tool provides a mitigation that addresses the root cause of that risk category (not merely partial or indirect coverage). Scores for NeMo, Guardrails AI, LiteLLM, and Portkey are approximate because none of those tools publish explicit OWASP Agentic Top 10 mappings; they are based on a good-faith review of each tool's documented capabilities as of early 2026.
+
+---
+
+## Detailed Breakdown
+
+### NeMo Guardrails (NVIDIA)
+
+**What it does:** Adds conversational guardrails to LLM-based chatbots — blocking off-topic requests, enforcing dialog flows (Colang), and filtering harmful outputs in real time.
+
+**Where it excels:**
+- Chatbot safety and topicality constraints
+- Structured dialog flow control (Colang DSL)
+- Programmable input/output filters
+
+**What it doesn't cover:**
+- Governing *what an agent does* (tool calls, sub-agent spawning, file writes, API invocations)
+- Agent identity or authentication between agents
+- Runtime privilege rings or sandboxing
+- SRE / reliability patterns (SLOs, circuit breakers)
+- OWASP Agentic Top 10 risks beyond output filtering (~ASI-05)
+
+**Best used:** Alongside the Agent Governance Toolkit when you want chatbot-level dialog safety **and** full agentic action governance.
+
+---
+
+### Guardrails AI
+
+**What it does:** Validates and coerces LLM outputs into structured formats (JSON schemas, Pydantic models) — ensuring outputs conform to expected shapes and correcting them via re-prompting when they don't.
+
+**Where it excels:**
+- Reliable structured data extraction from LLM responses
+- Output schema enforcement and type coercion
+- Re-prompting pipelines for malformed outputs
+
+**What it doesn't cover:**
+- Any form of pre-execution action governance
+- Agent identity or trust between agents
+- Execution sandboxing or privilege rings
+- SRE / error budgets
+
+**Best used:** As a companion for output parsing. The Agent Governance Toolkit handles what an agent *does*; Guardrails AI handles what an LLM *says*.
+
+---
+
+### LiteLLM
+
+**What it does:** Provides a unified API gateway that abstracts over 100+ LLM providers behind a single OpenAI-compatible interface — including routing, load balancing, spend tracking, and basic content moderation hooks.
+
+**Where it excels:**
+- Multi-provider LLM management from a single API
+- Spend tracking and budget enforcement per model/team
+- Basic content policy hooks at the LLM call level
+
+**What it doesn't cover:**
+- Agent-level governance (pre-execution policy checks on tool calls, spawns, etc.)
+- Agent identity, trust scoring, or zero-trust mesh
+- Execution sandboxing
+- SRE patterns (SLOs, chaos testing, circuit breakers)
+
+**Best used:** As a transparent LLM proxy in front of any provider while the Agent Governance Toolkit enforces what the calling agent is allowed to do.
+
+---
+
+### Portkey
+
+**What it does:** A production LLM gateway providing observability, semantic caching, routing fallbacks, and prompt management — focused on LLM operational reliability and cost optimization.
+
+**Where it excels:**
+- LLM call observability and tracing
+- Semantic caching to reduce cost
+- Routing fallbacks across providers
+- Prompt versioning and A/B testing
+
+**What it doesn't cover:**
+- Agent action governance (tool calls are invisible to Portkey)
+- Agent identity or cryptographic attestation
+- Execution sandboxing or privilege isolation
+- SRE / reliability engineering at the *agent* level
+
+**Best used:** As a telemetry and cost-optimization layer for LLM calls while the Agent Governance Toolkit enforces governance on the agent's actions.
+
+---
+
+## The Key Distinction
+
+```
+LLM Output Layer (NeMo, Guardrails AI, Portkey, LiteLLM)
+  └─ "Did the model say something safe / structured / on-topic?"
+
+Agent Action Layer (Agent Governance Toolkit)
+  └─ "Should this agent be allowed to execute this action right now?"
+```
+
+These two layers are **complementary, not competing**. A fully governed agentic system typically needs both:
+
+1. **Agent Governance Toolkit** — enforces *what agents do* before every tool call, spawn, or API invocation, with cryptographic identity, privilege rings, SRE reliability, and full OWASP Agentic Top 10 coverage.
+2. **An output validator** (Guardrails AI, NeMo) — ensures the LLM's *words* conform to the format and safety rules you need.
+3. **An LLM gateway** (LiteLLM, Portkey) — routes, caches, and observes the underlying model calls.
+
+---
+
+## OWASP Agentic Top 10 Coverage Detail
+
+| Risk | Agent Governance Toolkit | NeMo Guardrails | Guardrails AI | LiteLLM | Portkey |
+|------|:------------------------:|:---------------:|:-------------:|:-------:|:-------:|
+| ASI-01 Agent Goal Hijacking | ✅ Policy engine blocks unauthorized goal changes | ⚠️ Partial (dialog rails) | ❌ | ❌ | ❌ |
+| ASI-02 Excessive Capabilities | ✅ Capability model enforces least-privilege | ❌ | ❌ | ❌ | ❌ |
+| ASI-03 Identity & Privilege Abuse | ✅ Ed25519 / SPIFFE zero-trust identity | ❌ | ❌ | ❌ | ❌ |
+| ASI-04 Uncontrolled Code Execution | ✅ 4-tier execution rings + sandboxing | ❌ | ❌ | ❌ | ❌ |
+| ASI-05 Insecure Output Handling | ✅ Content policies validate all outputs | ✅ Output filters | ✅ Schema validation | ⚠️ Basic hooks | ❌ |
+| ASI-06 Memory Poisoning | ✅ Episodic memory with integrity checks | ❌ | ❌ | ❌ | ❌ |
+| ASI-07 Unsafe Inter-Agent Communication | ✅ Encrypted channels + trust gates | ❌ | ❌ | ❌ | ❌ |
+| ASI-08 Cascading Failures | ✅ Circuit breakers + SLO enforcement | ❌ | ❌ | ⚠️ Retries only | ⚠️ Fallback routing |
+| ASI-09 Human-Agent Trust Deficit | ✅ Full audit trails + flight recorder | ❌ | ❌ | ⚠️ Logging | ⚠️ Observability |
+| ASI-10 Rogue Agents | ✅ Kill switch + ring isolation + anomaly detection | ❌ | ❌ | ❌ | ❌ |
+
+---
+
+## Summary
+
+If your question is:
+
+- *"How do I stop my agent from calling tools it shouldn't?"* → **Agent Governance Toolkit**
+- *"How do I ensure my LLM always returns valid JSON?"* → **Guardrails AI**
+- *"How do I add topicality constraints to my chatbot?"* → **NeMo Guardrails**
+- *"How do I route across 100+ LLM providers with one API?"* → **LiteLLM**
+- *"How do I observe and cache my LLM calls?"* → **Portkey**
+
+For production agentic systems, you likely need the Agent Governance Toolkit **plus** one or more of the above tools working together.
+
+---
+
+*See also: [OWASP Compliance Mapping](OWASP-COMPLIANCE.md) · [Architecture Overview](../README.md#architecture) · [Quick Start](../QUICKSTART.md)*