
# Gates Guide

Gates are quality validation mechanisms that ensure Claude's outputs meet specific criteria before proceeding.

## Gate Types

| Type | Validation Method | Best For |
|------|-------------------|----------|
| Criteria Gates | LLM self-evaluation | Subjective quality checks |
| Shell Verification | Exit code (ground truth) | Test suites, linting, builds |
| Canonical Gates | Pre-defined standards | Reusable quality patterns |

## Criteria Gates (LLM Self-Evaluation)

Criteria gates use inline text criteria that Claude evaluates against its own output.

### Syntax

```
# Single criterion
>>prompt :: "criteria text"

# Multiple criteria
>>prompt :: "criterion 1" :: "criterion 2"

# Named gate reference
>>prompt :: code-quality
```

### Examples

```
# Conciseness check
>>summarize :: "under 200 words"

# Content requirements
>>analyze :: "include statistics" :: "cite sources"

# Style enforcement
>>write-docs :: "use active voice" :: "include code examples"
```

### How It Works

1. Claude executes the prompt
2. Gate criteria are injected into the response context
3. Claude self-evaluates: `GATE_REVIEW: PASS|FAIL - reason`
4. On FAIL, automatic retry with feedback (up to 2 attempts)
5. After max retries, the user decides via `gate_action`
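The steps above can be sketched as a small retry loop. This is an illustrative model, not the server's implementation; `execute` and `evaluate` stand in for the prompt run and the self-evaluation:

```python
import re

# Strict verdict line the gate looks for (see "Gate Responses" below).
VERDICT_RE = re.compile(r"^GATE_REVIEW: (PASS|FAIL) - (.*)$")

def run_with_gate(execute, evaluate, max_attempts=2):
    """Hypothetical sketch of the criteria-gate loop.

    execute(feedback) -> output   # runs the prompt, optionally with retry feedback
    evaluate(output)  -> verdict  # self-evaluation producing a GATE_REVIEW line
    """
    feedback = None
    output = None
    for _ in range(max_attempts):
        output = execute(feedback)            # 1. run the prompt
        verdict = evaluate(output)            # 2-3. criteria injected, self-review
        m = VERDICT_RE.match(verdict)
        if m and m.group(1) == "PASS":
            return ("PASS", output)
        feedback = m.group(2) if m else verdict  # 4. feed the FAIL reason back
    return ("ESCALATE", output)               # 5. user decides via gate_action
```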

## Shell Verification Gates (Ground Truth)

Shell verification uses actual command execution for validation: exit code 0 = PASS, non-zero = FAIL.
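The exit-code convention is simple enough to show directly. A minimal sketch (the function name is illustrative, not the server's API):

```python
import subprocess

def shell_verify(command, timeout=300):
    # Run the verification command; the exit code is the ground truth.
    result = subprocess.run(command, shell=True, capture_output=True,
                            text=True, timeout=timeout)
    # Exit code 0 = PASS, anything non-zero = FAIL.
    return "PASS" if result.returncode == 0 else "FAIL"
```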

### Syntax

```
# Basic
:: verify:"command"

# With options
:: verify:"command" max:N timeout:N

# With presets
:: verify:"command" :fast|:full|:extended
```

### Presets

| Preset | Max Attempts | Timeout | Use Case |
|--------|--------------|---------|----------|
| `:fast` | 1 | 30s | Quick feedback |
| `:full` | 5 | 300s | CI validation |
| `:extended` | 10 | 600s | Long tests |

### Examples

```
# Run tests after implementation
>>implement-feature :: verify:"npm test"

# Quick lint check
>>cleanup :: verify:"npm run lint" :fast

# Full test suite
>>refactor :: verify:"npm test" :full

# Combined with criteria
>>implement :: verify:"npm test" :: "follows coding standards"
```

### Options

| Option | Default | Description |
|--------|---------|-------------|
| `max:N` | 5 | Maximum attempts |
| `timeout:N` | 300 | Timeout in seconds |
| `loop:true` | false | Stop hook integration |

See Ralph Loops Guide for comprehensive shell verification documentation.

## Canonical Gates

Canonical gates are pre-defined gates stored in `resources/gates/` for reusable quality patterns.

### Available Gates

| Gate ID | Severity | Purpose |
|---------|----------|---------|
| `code-quality` | medium | Error handling, naming, edge cases |
| `security-awareness` | medium | No secrets, input validation |
| `test-coverage` | medium | Tests included |
| `content-structure` | low | Headers, lists, examples |
| `api-documentation` | medium | Endpoints, params, examples |
| `pr-security` | critical | No eval, parameterized queries |
| `pr-performance` | medium | Memoization, no console.log |
| `plan-quality` | high | Files, risks, assumptions |

### Usage

```
# Reference by ID
>>code_review :: code-quality :: security-awareness

# Combine with inline criteria
>>implement :: code-quality :: "under 500 lines"
```

## User Gates (Workspace Overlays)

When `MCP_WORKSPACE` points to a directory outside the package root, the server automatically discovers additional gates from the workspace. This lets users define custom gates alongside the shipped defaults.

### Directory Structure

User gates support both flat and grouped layouts:

```
${MCP_WORKSPACE}/gates/          # Workspace gates directory
├── my-custom-gate/              # Flat: directly under gates/
│   ├── gate.yaml
│   └── guidance.md
└── workflow/                    # Grouped: category → gate
    ├── pre-flight-completion/
    │   ├── gate.yaml
    │   └── guidance.md
    └── growth-capture/
        ├── gate.yaml
        └── guidance.md
```

The server also checks `${MCP_WORKSPACE}/resources/gates/` as an alternative convention.
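Each gate directory pairs a `gate.yaml` with a `guidance.md`. As a hedged sketch of what a custom `gate.yaml` might contain — only the `evaluation` block is shown elsewhere in this guide, and the other field names are assumptions; see the Gate Configuration Reference for the real schema:

```yaml
# ${MCP_WORKSPACE}/gates/my-custom-gate/gate.yaml
# NOTE: field names other than `evaluation` are illustrative assumptions.
id: my-custom-gate
severity: medium
evaluation:
  mode: self        # or `judge` for context-isolated review
  strict: false
```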

### Conflict Resolution

When a user gate has the same ID as a shipped gate, the shipped (primary) gate wins. This prevents accidental overrides of built-in quality standards.
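The precedence rule amounts to a merge where shipped gates overwrite workspace gates on ID collision. A minimal sketch (illustrative, not the server's code):

```python
def merge_gates(shipped, workspace):
    """Merge gate registries by ID; the shipped (primary) gate wins on conflict."""
    merged = dict(workspace)   # start from user-defined workspace gates
    merged.update(shipped)     # shipped gates overwrite any duplicate IDs
    return merged
```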

### Example: Claude Code Integration

When using the Claude Code plugin with `MCP_WORKSPACE=~/.claude/`:

```
~/.claude/gates/
└── workflow/
    ├── pre-flight-completion/
    │   ├── gate.yaml
    │   └── guidance.md
    └── diagnosis-card/
        ├── gate.yaml
        └── guidance.md
```

These gates appear in `system_control(action:"gates", operation:"list")` alongside shipped gates.

### Hot Reload

User gates are hot-reloaded. Editing `gate.yaml` or `guidance.md` in a workspace gates directory updates the gate without a server restart.

## Gate Responses

> [!WARNING]
> The response format is strict: `GATE_REVIEW: PASS - reason` or `GATE_REVIEW: FAIL - reason`. Omitting the prefix or using a different format causes the gate to hang while it waits for a verdict.
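The strict format is easy to model as a regex. A sketch of a parser for the verdict line (illustrative, not the server's implementation):

```python
import re

# Matches exactly "GATE_REVIEW: PASS - reason" or "GATE_REVIEW: FAIL - reason".
VERDICT = re.compile(r"^GATE_REVIEW: (PASS|FAIL) - (.+)$", re.MULTILINE)

def parse_verdict(text):
    m = VERDICT.search(text)
    if m is None:
        return None              # no prefix / wrong format: no verdict to act on
    return m.group(1), m.group(2)
```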

### Pass Response

```
GATE_REVIEW: PASS - All criteria met. Code includes error handling and follows naming conventions.
```

### Fail Response (Retry Available)

```
GATE_REVIEW: FAIL - Missing error handling for edge case X.
```

Claude then automatically retries with this feedback.

### Escalation (Max Retries)

After max attempts, the user is prompted for a `gate_action`:

- `retry` - Reset attempts and try again
- `skip` - Continue without validation
- `abort` - Stop execution

```
prompt_engine(chain_id:"chain-abc", gate_action:"retry")
```

## Combining Gates

Gates can be combined with other operators:

```
# Framework + Gate
@CAGEERF >>analyze :: "comprehensive analysis"

# Chain + Gate (gate applies to the final step)
>>research --> >>analyze :: "cite sources"

# Style + Gate
#analytical >>report :: "include data visualizations"

# Multiple gate types
>>implement :: verify:"npm test" :: code-quality :: "follows DRY principle"
```

## Assertion + Gate Composition

Gates validate content quality (subjective, LLM-evaluated). Assertions validate structure (deterministic, zero-cost). They compose orthogonally:

| Layer | Validates | Cost | Method |
|-------|-----------|------|--------|
| Assertions | Structure (sections, length, terms) | Zero | Deterministic checks |
| Gates (self) | Content quality | LLM cost | Self-review |
| Gates (judge) | Content quality | LLM cost | Context-isolated review |

When assertions pass, the gate reviewer is told: "Structure is verified — focus on content quality." When assertions fail, the LLM must fix structural issues before content quality is evaluated.

See Assertions Guide for full details.

## Judge Mode

By default, the same LLM evaluates its own gate criteria (`self` mode). Judge mode sends the output and criteria to a context-isolated sub-agent that cannot see the generation reasoning:

```yaml
# In gate.yaml
evaluation:
  mode: judge          # Context-isolated evaluation
  strict: true         # Evidence-based: list failures first
```

See the Judge Mode Guide for configuration and usage.

## Best Practices

1. Use shell verification for objective criteria (tests, linting, builds)
2. Use criteria gates for subjective quality (style, completeness)
3. Use assertions for structural compliance (methodology phases, required sections)
4. Use judge mode for high-stakes evaluation (prevents self-confirmation bias)
5. Combine layers for comprehensive validation: `>>implement :: verify:"npm test" :: "readable code" :: "documented functions"`
6. Use presets for consistent verification across projects
7. Reference canonical gates for team-wide standards

> [!TIP]
> Too many gates firing? Injection Control lets you tune how often gate guidance injects, from every step to first-step-only. For the full `gate.yaml` schema, see the Gate Configuration Reference.

## See Also