[XPack][Watcher]: Support for Keystore Variables in Watch Definitions for Secure Secret Management #136001

@SrivatsaRv

Description

Watcher: Support for keystore variables in watch definitions for secure secret management

Summary

Watcher actions that call external systems such as Datadog, PagerDuty, Slack, or custom webhooks currently require plaintext credentials inside the watch JSON. Setting xpack.watcher.encrypt_sensitive_data=true does not help here: it only protects the watch as stored in Elasticsearch, while the secrets remain in plaintext in the definition that is stored in source control and passed through CI/CD.

This proposal requests keystore variable support inside watch definitions, so that headers, auth blocks, and similar fields can reference secure settings, aligned with how elasticsearch.yml already supports secure settings.


Current limitation

Elasticsearch 8.17.1 example:

{
  "actions": {
    "datadog_webhook": {
      "webhook": {
        "method": "post",
        "url": "https://api.datadoghq.eu/api/v2/incidents",
        "headers": {
          "Accept": "application/json",
          "Content-Type": "application/json",
          "DD-API-KEY": "plaintext-api-key-here",
          "DD-APPLICATION-KEY": "plaintext-app-key-here"
        },
        "body": "{{#toJson}}ctx.payload{{/toJson}}"
      }
    }
  }
}

Attempting to use a keystore reference fails:

{
  "headers": {
    "DD-API-KEY": "{{_secrets.DD_API_KEY}}"
  }
}

Result:

unknown secure setting [DD_API_KEY]

Notes:

  • The keystore only accepts secure settings that Elasticsearch or an installed plugin has registered; arbitrary keys such as DD_API_KEY are rejected.
  • There is no supported interpolation for secrets within watch definitions.

Proposed solution

Enable keystore variable interpolation in Watcher definitions. Two compatible options are outlined below. Either would solve the problem.

Option A. Inline interpolation using ${...}

{
  "headers": {
    "DD-API-KEY": "${keystore.dd_api_key}",
    "DD-APPLICATION-KEY": "${keystore.dd_app_key}"
  }
}
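
To make Option A concrete, here is a minimal Python sketch of how ${keystore.<name>} references could be expanded. It is an illustration only, not how Elasticsearch works today: the KEYSTORE dict is a hypothetical stand-in for the node keystore, and the interpolate function and the example key values are assumptions introduced for this sketch.

```python
import re

# Hypothetical stand-in for the node keystore; a real implementation would
# read secure settings inside Elasticsearch, not from a Python dict.
KEYSTORE = {"dd_api_key": "example-api-key", "dd_app_key": "example-app-key"}

# Matches the proposed ${keystore.<name>} syntax from Option A.
_REF = re.compile(r"\$\{keystore\.([A-Za-z0-9_.-]+)\}")


def interpolate(value: str, keystore: dict) -> str:
    """Replace every ${keystore.<name>} reference inside a string."""

    def lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in keystore:
            # Fail loudly instead of sending an unresolved placeholder.
            raise KeyError(f"missing keystore entry: {name}")
        return keystore[name]

    return _REF.sub(lookup, value)


headers = {
    "DD-API-KEY": "${keystore.dd_api_key}",
    "DD-APPLICATION-KEY": "${keystore.dd_app_key}",
}

# Resolution builds a new dict; the stored headers keep their placeholders.
resolved_headers = {k: interpolate(v, KEYSTORE) for k, v in headers.items()}
```

Because the reference is matched inside the string, this form would also cover values such as "Bearer ${keystore.token}", which a whole-value reference cannot express.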

Option B. A dedicated @secret:<name> reference resolved at runtime

{
  "actions": {
    "datadog_webhook": {
      "webhook": {
        "url": "https://api.datadoghq.eu/api/v2/incidents",
        "headers": {
          "DD-API-KEY": "@secret:dd_api_key",
          "DD-APPLICATION-KEY": "@secret:dd_app_key"
        }
      }
    }
  }
}
  • @secret:<name> is resolved from the Elasticsearch keystore on the node at execution time.
  • Resolution should not leak values into stored watch source or logs.

Common expectations for both options

  • Resolution occurs only during execution.
  • Values are never written back into the stored watch JSON.
  • Values do not appear in logs, watch history, or error traces.
  • Works in any string field of a watch definition where secrets typically appear:
    • headers, auth fields, email credentials, webhook tokens.
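
The expectations above could be met by resolving references into a separate, execution-only copy of the watch. The Python sketch below illustrates that idea for the @secret:<name> form; the keystore dict, the resolve_secrets function, and the example values are hypothetical and only stand in for whatever Watcher would do internally.

```python
SECRET_PREFIX = "@secret:"  # proposed syntax, not an existing Watcher feature


def resolve_secrets(node, keystore: dict):
    """Return a new structure with every "@secret:<name>" string resolved.

    The input is never mutated, so the stored watch keeps its references.
    """
    if isinstance(node, dict):
        return {key: resolve_secrets(value, keystore) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_secrets(item, keystore) for item in node]
    if isinstance(node, str) and node.startswith(SECRET_PREFIX):
        # Raises KeyError if the referenced name is not in the keystore.
        return keystore[node[len(SECRET_PREFIX):]]
    return node


# Hypothetical stand-in for the node keystore.
keystore = {"dd_api_key": "example-api-key", "dd_app_key": "example-app-key"}

stored_watch = {
    "actions": {
        "datadog_webhook": {
            "webhook": {
                "url": "https://api.datadoghq.eu/api/v2/incidents",
                "headers": {
                    "DD-API-KEY": "@secret:dd_api_key",
                    "DD-APPLICATION-KEY": "@secret:dd_app_key",
                },
            }
        }
    }
}

# Only this in-memory copy holds real values, and only while the action runs.
runtime_watch = resolve_secrets(stored_watch, keystore)
```

Keeping the resolved copy separate from the stored source means that anything serialising the stored watch, such as the Get Watch API or watch history, would only ever see the reference strings.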

Why this matters

  • Keeps credentials out of Git and code review systems.
  • Avoids secret exposure in CI/CD pipelines and PR diffs.
  • Aligns Watcher with existing secure settings usage patterns in elasticsearch.yml.
  • Reduces risk and operational burden for common integrations.

Use cases

  • Datadog, PagerDuty, and Slack webhooks.
  • SMTP credentials for email actions.
  • Custom webhook tokens and Basic Auth headers.
  • Any third-party API keys required by Watcher actions.

Current workarounds and their gaps

  1. Encryption at rest: xpack.watcher.encrypt_sensitive_data=true
    • Does not prevent plaintext exposure in the definition itself.
  2. External templating: inject secrets before deployment
    • Adds bespoke tooling and risks accidental leakage in logs or artifacts.
  3. Environment variables in containers
    • Limited to specific deployments and still requires templating into the watch JSON.
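
For reference, the external templating workaround (item 2) typically looks something like the Python sketch below. The template text, the placeholder names, and the render_watch helper are assumptions for illustration; real pipelines use a variety of tools.

```python
import json
from string import Template

# Watch definition as it might live in Git, with $PLACEHOLDER markers.
WATCH_TEMPLATE = """
{
  "actions": {
    "datadog_webhook": {
      "webhook": {
        "method": "post",
        "url": "https://api.datadoghq.eu/api/v2/incidents",
        "headers": {
          "DD-API-KEY": "$DD_API_KEY",
          "DD-APPLICATION-KEY": "$DD_APP_KEY"
        }
      }
    }
  }
}
"""


def render_watch(template_text: str, secrets: dict) -> dict:
    """Substitute secrets into the template and parse the result as JSON."""
    # substitute() raises KeyError if a placeholder has no value.
    rendered_text = Template(template_text).substitute(secrets)
    return json.loads(rendered_text)


# In a pipeline these values would come from the CI secret store or os.environ.
rendered = render_watch(
    WATCH_TEMPLATE,
    {"DD_API_KEY": "example-api-key", "DD_APP_KEY": "example-app-key"},
)
```

The rendered object is what gets sent to Elasticsearch, and it contains the plaintext values, so any pipeline log, artifact, or debug dump of it exposes the secrets. That is the gap this request aims to close.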

Security and UX considerations

  • Secrets should never be returned by the Get Watch API or appear in history JSON.
  • Validation should fail fast if a referenced keystore key is missing.
  • Add clear docs that show supported fields and example usage.
  • Ensure audit logging records that a secret reference was used without revealing the value.
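
A rough Python sketch of the fail-fast validation and value-free audit record described above follows. It assumes the @secret:<name> reference form from Option B; the function names, the audit record shape, and the watch ID are hypothetical and not part of any existing Elasticsearch API.

```python
SECRET_PREFIX = "@secret:"  # proposed syntax, not an existing Watcher feature


def find_secret_refs(node) -> set:
    """Collect the names of all "@secret:<name>" references in a watch."""
    if isinstance(node, dict):
        node = list(node.values())
    if isinstance(node, list):
        names = set()
        for item in node:
            names |= find_secret_refs(item)
        return names
    if isinstance(node, str) and node.startswith(SECRET_PREFIX):
        return {node[len(SECRET_PREFIX):]}
    return set()


def validate_watch(watch: dict, available_keys) -> None:
    """Fail fast, before the watch is stored, if any reference is unknown."""
    missing = sorted(find_secret_refs(watch) - set(available_keys))
    if missing:
        raise ValueError(f"unknown secret reference(s): {', '.join(missing)}")


def audit_record(watch_id: str, watch: dict) -> dict:
    """Record which secret names a watch uses, never their values."""
    return {"watch_id": watch_id, "secret_refs": sorted(find_secret_refs(watch))}


watch = {
    "actions": {
        "datadog_webhook": {
            "webhook": {
                "headers": {
                    "DD-API-KEY": "@secret:dd_api_key",
                    "DD-APPLICATION-KEY": "@secret:dd_app_key",
                }
            }
        }
    }
}

# Passes silently because both referenced names are available.
validate_watch(watch, available_keys=["dd_api_key", "dd_app_key"])
```

Validation only needs the set of key names, not the values, so it could run when the watch is created without ever reading a secret.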

Environment details

  • Elasticsearch version: 8.17.1
  • Deployment: Docker and GitOps style pipelines
  • Watcher usage: outbound webhooks to third-party systems

Alternatives considered

| Approach | Pros | Cons |
|---|---|---|
| Continue external templating | Simple to start | Tooling drift, risk of leakage, inconsistent across teams |
| Custom proxy that injects creds | Centralises secrets | Extra component to operate and secure |
| Dedicated Watcher secret store | Purpose built | New surface area to build and maintain |
| Keystore interpolation in definitions | Reuses trusted mechanism, consistent UX | Requires Watcher to resolve at runtime |

Why this should be prioritised

This limitation is a security and operational blocker for teams that rely on Elasticsearch Watcher as part of production-grade monitoring and business alerting pipelines. Unlike Kibana alerting (which primarily handles visual or threshold-based alerts), Watcher is far more expressive and allows for complex, query-driven automation that directly operates on Elasticsearch indices.

Without keystore variable support, the current design forces plaintext credential management, creating an unavoidable security trade-off in enterprise environments that follow GitOps or CI/CD-based watcher deployments.

Key reasons for prioritisation

  1. Security non-compliance risk

    • Credentials for Datadog, PagerDuty, and other integrations must be stored in plain JSON or templated pipelines.
    • This violates internal security and compliance policies (e.g., ISO 27001, SOC 2) that prohibit storing secrets in source control.
    • Even with xpack.watcher.encrypt_sensitive_data=true, the definition payload remains readable.
  2. GitOps and CI/CD adoption blocker

    • Modern enterprises deploy watchers through automated pipelines (Terraform, Ansible, GitLab CI).
    • Secrets cannot be injected dynamically, as keystore values are not referenceable in the JSON definitions.
    • This forces the use of insecure templating layers or manual editing, breaking GitOps integrity.
  3. Operational friction in collaboration

    • Teams cannot safely share or review watchers across environments.
    • Pull requests or watcher audits expose API keys, creating review bottlenecks and the need for redaction policies.
  4. Feature gap with existing keystore usage

    • Elasticsearch already reads secure settings for node configuration from the keystore, which keeps them out of elasticsearch.yml.
    • Extending this pattern into watchers would ensure consistent secret handling across all parts of the stack.
  5. Watcher remains essential where Kibana alerts fall short

    • Kibana alerting supports simple threshold or condition-based alerts bound to UI constructs.
    • Watcher is required for:
      • Multi-index and multi-query aggregation alerts.
      • Correlation or anomaly detection logic expressed in Painless scripts.
      • Chained conditions with conditional actions or external integrations.
    • Teams building business logic or revenue-impact alerts (e.g., transaction drops, latency anomalies) depend on Watcher’s flexibility.
    • Therefore, the inability to secure watcher definitions directly impacts production alerting reliability.
  6. Alignment with security-first ecosystem practices

    • Other widely used observability and alerting systems (e.g., Prometheus, Grafana, Datadog monitors) support runtime secret injection.
    • Elasticsearch Watcher lacks equivalent capability, creating inconsistency in multi-system integrations.

Summary

Adding keystore variable interpolation to Watcher definitions is not merely an enhancement but a critical enabler for secure, automated, and compliant use of Elasticsearch in enterprise alerting workflows.
It directly addresses a security exposure, removes GitOps blockers, and preserves the advanced query capabilities that make Watcher more powerful than Kibana alerts for data-driven operational intelligence.


Labels

enhancement, security, area:Watcher


Request

Please consider adding keystore-backed secret interpolation in Watcher definitions, covering headers and auth fields at minimum, with strong guarantees against secret leakage in stored objects and logs. Happy to test a design or prototype and provide feedback.
