[telegraf] multiple issues with TOML generation #758

@danielskowronski


Telegraf chart 1.8.65 still produces invalid TOML, which is a show-stopper, because there's no way to provide raw config and Telegraf refuses to start.

There are two issues I encountered so far:

  1. ping.Ping.Timeout is by default unconditionally cast to an integer, while float64 is required; this is solved by using the experimental --set tplVersion=2
    (this should be easy to back-port to the v1 template)
  2. mapping keys are never quoted, which is especially problematic for processors.enum.mapping.value_mappings when a key contains a dot or starts with a digit, as with an FQDN or IP address (e.g. example.com, 1.1.1.1)

For the second issue, I have limited exposure to TOML and Telegraf, but it seems that keys in a mapping should always be strings, so unconditionally quoting them should not introduce any issues. Template helpers for v2 have multiple variants of {{ $k }} = {{ $v | quote }}, which should be {{ $k | quote }} = ....

My proposal for what should be fixed:

  1. restore the ability to provide a raw config file, either as a string or as a reference to an external ConfigMap/Secret; this would immediately lower the severity of this bug report and make the whole chart more flexible
  2. fix the ping.Ping.Timeout casting in v1, or mention the v2 workaround in the documentation
  3. provide a way to configure mappings with keys that require quoting in TOML (starting with a digit, containing dots), at least in v2; it seems that blindly quoting them shouldn't be an issue

Here is an example that does not work. Please note that, for clarity, only a single ping target is defined here; my real use case has tens of them, which is why I don't want to define target tags in the inputs stanza.

For the values.yaml below:

---
# https://artifacthub.io/packages/helm/influxdata/telegraf?modal=values
replicaCount: 1

config:
  agent:
    interval: "1s"
    flush_interval: "5s"
    metric_batch_size: 1000
    metric_buffer_limit: 10000
    round_interval: false
    omit_hostname: true

  inputs:
    - ping:
        interval: "30s"
        urls:
          - "1.1.1.1"
        count: 1
        timeout: 1.1
        method: "native"
        name_override: "ping"

  processors:
    - enum:
        mapping:
          field: "url"
          dest: "target"
          value_mappings:
            "1.1.1.1": "cloudflare"

service:
  enabled: false
rbac:
  create: false
serviceAccount:
  create: false

this is what lands in the ConfigMap as telegraf.conf:

[agent]
  collection_jitter = "0s"
  debug = false
  flush_interval = "5s"
  flush_jitter = "0s"
  hostname = "$HOSTNAME"
  interval = "1s"
  logfile = ""
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  omit_hostname = true
  precision = ""
  quiet = false
  round_interval = false
[[processors.enum]]
   [[processors.enum.mapping]]
    dest = "target"
    field = "url"
    [processors.enum.mapping.value_mappings]
        1.1.1.1 = "cloudflare"


[[outputs.influxdb]]
  database = "telegraf"
  urls = [
    "http://influxdb.monitoring.svc:8086"
  ]

[[inputs.ping]]
  count = 1
  interval = "30s"
  method = "native"
  name_override = "ping"
  timeout = 1
  urls = [
    "1.1.1.1"
  ]

[[inputs.internal]]
  collect_memstats = false

The pod starts with this error:

2025-12-29T14:36:25Z I! Loading config: /etc/telegraf/telegraf.conf
2025-12-29T14:36:25Z E! loading config file /etc/telegraf/telegraf.conf failed: error parsing data: line 21: invalid TOML syntax

After setting tplVersion=2, the config is rendered exactly the same, except for extra empty lines between sections and inputs.ping.timeout changing from 1 to 1.1.
