RFC: Add --help-agent flag for LLM-friendly structured help

# Agent Help Design: `--help-agent`

**Author**: Claude <claude@anthropic.com>
**Date**: 2025-11-11
**Status**: Draft - Seeking Review

## Overview

Add a `--help-agent` flag to otel-cli that outputs structured, machine-readable help specifically designed for LLM agents and other automation tools. This complements the human-friendly `--help` output with information that makes otel-cli more effective as a tool in an agent's toolkit.

## Motivation

When LLM agents use command-line tools, they need:
- **Type information** to generate valid commands
- **Validation rules** to avoid trial-and-error
- **Environment mappings** to understand configuration precedence
- **Relationships** between flags to avoid incompatible combinations
- **Common patterns** to quickly accomplish typical tasks

Traditional `--help` output is optimized for human readability, not machine parsing. Agent help provides structured data that is intended to reduce hallucinated flags and values and to improve first-attempt success rates.

## Design Principles

1. **Structured Output**: JSON format for easy parsing
2. **Complete Information**: Everything an agent needs to use the command correctly
3. **Self-Documenting**: Schema includes field descriptions
4. **Versioned**: Output includes schema version for future evolution
5. **Composable**: Works with any command/subcommand
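
As an illustration of the "Versioned" principle, here is a minimal sketch of how a consumer might gate on `schema_version` before trusting the rest of the output. It assumes the version string is read as `major.minor` and that only a major bump is breaking; neither rule is specified by this RFC, and `checkSchemaVersion` and `supportedMajor` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// supportedMajor is the schema major version this hypothetical consumer understands.
const supportedMajor = 1

// versionHeader decodes only the schema_version field and ignores everything else.
type versionHeader struct {
	SchemaVersion string `json:"schema_version"`
}

// checkSchemaVersion returns an error when the major version of the help output
// differs from supportedMajor. Reading the version as "major.minor" is an
// assumption of this sketch, not something the RFC defines.
func checkSchemaVersion(raw []byte) error {
	var h versionHeader
	if err := json.Unmarshal(raw, &h); err != nil {
		return fmt.Errorf("decoding help output: %w", err)
	}
	majorStr, _, _ := strings.Cut(h.SchemaVersion, ".")
	major, err := strconv.Atoi(majorStr)
	if err != nil {
		return fmt.Errorf("unparseable schema_version %q", h.SchemaVersion)
	}
	if major != supportedMajor {
		return fmt.Errorf("unsupported schema major version %d (want %d)", major, supportedMajor)
	}
	return nil
}

func main() {
	fmt.Println(checkSchemaVersion([]byte(`{"schema_version": "1.0"}`)))
	fmt.Println(checkSchemaVersion([]byte(`{"schema_version": "2.0"}`)))
}
```

Decoding only the version field first lets a consumer fail early without needing structs for fields it does not yet know about.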

## Output Format

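### JSON Schema

The example below shows the proposed output for the `otel-cli span` command. It is an illustrative instance of the format rather than a formal JSON Schema document, and it lists only a subset of the command's flags.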

```json
{
  "schema_version": "1.0",
  "command": {
    "name": "otel-cli span",
    "description": "create an OpenTelemetry span and send it",
    "usage_pattern": "otel-cli span [flags]"
  },
  "flags": {
    "name": {
      "short": "n",
      "type": "string",
      "default": "todo-generate-default-span-names",
      "required": false,
      "description": "set the name of the span",
      "env_var": "OTEL_CLI_SPAN_NAME",
      "validation": {
        "min_length": 1
      }
    },
    "service": {
      "short": "s",
      "type": "string",
      "default": "otel-cli",
      "required": false,
      "description": "set the name of the service/application",
      "env_var": "OTEL_SERVICE_NAME",
      "validation": {
        "min_length": 1
      }
    },
    "kind": {
      "short": "k",
      "type": "string",
      "default": "client",
      "required": false,
      "description": "set the trace kind",
      "env_var": "OTEL_CLI_SPAN_KIND",
      "validation": {
        "enum": ["internal", "server", "client", "producer", "consumer"]
      }
    },
    "endpoint": {
      "type": "string",
      "default": "",
      "required": false,
      "description": "host and port for the desired OTLP/gRPC or OTLP/HTTP endpoint",
      "env_var": "OTEL_EXPORTER_OTLP_ENDPOINT",
      "validation": {
        "pattern": "^(https?://)?[^:]+:\\d+$|^https?://.+$"
      }
    },
    "protocol": {
      "type": "string",
      "default": "",
      "required": false,
      "description": "desired OTLP protocol",
      "env_var": "OTEL_EXPORTER_OTLP_PROTOCOL",
      "validation": {
        "enum": ["grpc", "http/protobuf"]
      }
    },
    "attrs": {
      "short": "a",
      "type": "map[string]string",
      "default": {},
      "required": false,
      "description": "a comma-separated list of key=value attributes",
      "env_var": "OTEL_CLI_ATTRIBUTES",
      "validation": {
        "format": "key=value,key2=value2"
      }
    },
    "timeout": {
      "type": "duration",
      "default": "1s",
      "required": false,
      "description": "timeout for otel-cli operations",
      "env_var": "OTEL_CLI_TIMEOUT",
      "validation": {
        "format": "Go duration string (e.g., 1s, 500ms, 1m30s)"
      }
    },
    "fail": {
      "type": "bool",
      "default": false,
      "required": false,
      "description": "on failure, exit with a non-zero status",
      "env_var": "OTEL_CLI_FAIL"
    },
    "verbose": {
      "type": "bool",
      "default": false,
      "required": false,
      "description": "print errors on failure instead of always being silent",
      "env_var": "OTEL_CLI_VERBOSE"
    }
  },
  "flag_groups": {
    "connection": ["endpoint", "traces-endpoint", "logs-endpoint", "protocol", "insecure"],
    "tls": ["tls-ca-cert", "tls-client-cert", "tls-client-key", "tls-no-verify"],
    "span_identity": ["name", "kind", "service"],
    "span_timing": ["start", "end"],
    "trace_propagation": ["tp-required", "tp-carrier", "tp-ignore-env", "tp-print", "tp-export"]
  },
  "relationships": {
    "mutually_exclusive": [
      ["tp-print", "tp-export"]
    ],
    "requires": [
      {"flag": "tls-client-cert", "requires": ["tls-client-key"]},
      {"flag": "tls-client-key", "requires": ["tls-client-cert"]}
    ],
    "deprecated": [
      {"flag": "otlp-blocking", "message": "does nothing, please file an issue if you need this"},
      {"flag": "no-tls-verify", "use_instead": "tls-no-verify"}
    ]
  },
  "recipes": [
    {
      "name": "basic_span",
      "description": "Send a simple span to a local collector",
      "command": "otel-cli span --endpoint localhost:4317 --service my-app --name my-operation",
      "when_to_use": "Testing basic connectivity or adding telemetry to shell scripts"
    },
    {
      "name": "span_with_propagation",
      "description": "Create a span and export its traceparent for chaining",
      "command": "otel-cli span -s my-app -n step1 --tp-export > /tmp/trace.env && source /tmp/trace.env",
      "when_to_use": "Chaining multiple otel-cli invocations to create a trace hierarchy"
    },
    {
      "name": "span_with_timing",
      "description": "Create a span with explicit start and end times",
      "command": "START=$(date +%s.%N); sleep 1; otel-cli span -s my-app -n timed --start $START --end $(date +%s.%N)",
      "when_to_use": "Recording operations that already completed or timing shell operations"
    },
    {
      "name": "https_with_tls",
      "description": "Send span over HTTPS with custom CA certificate",
      "command": "otel-cli span --endpoint https://collector.example.com:4318 --tls-ca-cert /path/to/ca.pem -s my-app -n secure-span",
      "when_to_use": "Production deployments with TLS and custom certificate authorities"
    }
  ],
  "common_patterns": {
    "non_recording_mode": {
      "description": "otel-cli operates in non-recording mode by default - it does nothing unless endpoint is configured",
      "how_to_enable": "Set --endpoint flag or OTEL_EXPORTER_OTLP_ENDPOINT environment variable"
    },
    "error_handling": {
      "description": "By default, otel-cli fails silently with exit code 0 to avoid breaking scripts",
      "how_to_change": "Use --fail for non-zero exit codes and --verbose for error messages"
    },
    "trace_propagation": {
      "description": "Traces can be propagated across invocations using W3C traceparent format",
      "mechanisms": [
        "TRACEPARENT environment variable (read automatically)",
        "--tp-carrier file (read and write)",
        "--tp-print or --tp-export (output to stdout)"
      ]
    }
  },
  "environment": {
    "precedence": "CLI flags > Environment variables > Config file > Defaults",
    "variables": {
      "OTEL_EXPORTER_OTLP_ENDPOINT": "Sets default endpoint for all signals",
      "OTEL_EXPORTER_OTLP_TRACES_ENDPOINT": "Sets endpoint specifically for traces",
      "OTEL_EXPORTER_OTLP_LOGS_ENDPOINT": "Sets endpoint specifically for logs",
      "OTEL_EXPORTER_OTLP_PROTOCOL": "Sets protocol (grpc or http/protobuf)",
      "OTEL_EXPORTER_OTLP_HEADERS": "Sets headers for OTLP connection",
      "OTEL_SERVICE_NAME": "Sets the service name",
      "TRACEPARENT": "W3C trace context for propagation",
      "OTEL_CLI_*": "Custom otel-cli specific variables"
    }
  },
  "exit_codes": {
    "0": "Success or silent failure (default behavior)",
    "1": "Error when --fail flag is set"
  },
  "examples": {
    "minimal": "otel-cli span",
    "typical": "otel-cli span --endpoint localhost:4317 -s my-service -n my-span",
    "complete": "otel-cli span --endpoint localhost:4317 --protocol grpc -s my-service -n my-span -k server --attrs env=prod,version=1.0 --fail --verbose"
  }
}
```

## Implementation Approach

### Phase 1: Core Infrastructure

1. **Add AgentHelp type** in `otelcli/agent_help.go`:
   - Define structs matching the JSON schema
   - Implement JSON marshaling

2. **Add --help-agent flag** to root command:
   - Global flag that works with any command
   - Outputs to stdout and exits cleanly

3. **Build metadata** from existing Cobra structures:
   - Traverse command tree
   - Extract flag definitions
   - Add manually-curated validation and relationship data

### Phase 2: Enrichment

4. **Add validation metadata** to flag definitions
5. **Document flag relationships** (mutually exclusive, requires, etc.)
6. **Create recipe library** for common use cases
7. **Add environment variable mappings**

### Phase 3: Testing & Documentation

8. **Functional tests** for agent help output
9. **Documentation** explaining the feature
10. **Examples** showing how agents can use the output

## Open Questions for Review

1. **Schema evolution**: How do we version the output as we add fields?
2. **Completeness**: What other metadata would be valuable for agents?
3. **Format alternatives**: Should we also support YAML or other formats?
4. **Subcommands**: Should `--help-agent` show all subcommands in one output?
5. **Recipes**: How many recipes are useful? Too many add noise; too few leave common tasks uncovered.
6. **Validation**: Should we include regex patterns or just enum values?

## Success Metrics

- Agents can generate valid otel-cli commands on first attempt
- Agents understand the precedence of config sources (flags vs env vars vs files)
- Agents know which flags work together and which conflict
- Agents can find relevant recipes for common tasks
- Reduced "trial and error" in agent-generated commands

## Future Enhancements

- `--help-agent --format yaml` for YAML output
- `--help-agent --recipes-only` for just the recipe library
- Integration with shell completion for agents that use zsh/bash
- OpenAPI-style specification for the full CLI surface

---

## Review Requests

**Gemini**: Please review this design and provide feedback on:
- Schema completeness - are we missing critical information?
- Structure - is the JSON organization logical and useful?
- Recipes - what other common patterns should we include?
- Any potential issues or improvements you see

**Human reviewers**: Does this align with the otel-cli philosophy of being shell-script friendly and self-contained?

---

🤖 Claude <claude@anthropic.com>
