enhance otel export logging #868

@dconnolly-sfdc

Description

We recently migrated from the newrelic format to otel.
On a few of our busier sites, metrics were then being dropped silently, with nothing showing in the debug logs.

Steps to Reproduce

docker run -d --name ktranslate-otel --restart unless-stopped \
  --user "$(id -u):$(id -g)" \
  -v /opt/newrelic-npm/snmp.yaml:/snmp-base.yaml \
  -e AWS_ACCESS_KEY_ID=123 \
  -e AWS_SECRET_ACCESS_KEY=456 \
  -e AWS_REGION=region \
  -e "OTEL_METRIC_EXPORT_INTERVAL=30000" \
  -e "OTEL_EXPORTER_OTLP_COMPRESSION=gzip" \
  store.sfdcbt.net/kentik/ktranslate:kt-2025-10-31-18983550174 \
  -format=otel \
  -format_metric=otel \
  -otel.protocol=http \
  -otel.endpoint=https://otel-collector.url:4318 \
  -snmp /snmp-base.yaml \
  -log_level=debug \
  -metrics=jchf \
  -snmp_discovery_on_start=true \
  -snmp_discovery_min=180 \
  -service_name=debug-otel

Tried so far

The following changes fixed the issue for many sites, but not all:

  • Increased memory for the container
  • Added these env vars:
    OTEL_EXPORTER_OTLP_COMPRESSION=gzip
    OTEL_METRIC_EXPORT_INTERVAL=30000

Proposal

It would help to have debug logging for when internal otel export limits are hit.
For example, in otel.go at lines 270 and 300, replace
f.inputs[m.Name] <- m
with

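// Look up the channel once so the depth check and the send use the same queue.
ch := f.inputs[m.Name]
queueDepth := len(ch)
if queueDepth >= CHAN_SLACK {
	// If the channel is buffered to CHAN_SLACK, the send below blocks until it drains.
	f.Debugf("Channel queue at CHAN_SLACK limit for %s: %d/%d (100%%)", m.Name, queueDepth, CHAN_SLACK)
}
ch <- m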
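
For anyone who wants to try the idea outside of ktranslate, here is a minimal standalone sketch of the same depth check. Everything in it is an assumption for illustration: `metric`, `chanSlack`, and `queueWarning` are hypothetical names, the value 4 is arbitrary, and it compares against `cap(ch)` rather than `CHAN_SLACK`, which only matches the snippet above if the channel was created with that capacity.

```go
package main

import "fmt"

// chanSlack stands in for ktranslate's CHAN_SLACK. The real value is not
// stated in this issue, so 4 is an arbitrary assumption.
const chanSlack = 4

// metric is a hypothetical stand-in for whatever type is sent on f.inputs[m.Name].
type metric struct {
	Name  string
	Value float64
}

// queueWarning returns a log message if the buffered channel is full,
// or "" if there is still room. It does not send anything itself, so the
// caller decides how to log the message and when to do the (blocking) send.
func queueWarning(ch chan metric, name string) string {
	depth, limit := len(ch), cap(ch)
	// An unbuffered channel has cap 0 and would always look "full"; skip it.
	if limit > 0 && depth >= limit {
		return fmt.Sprintf("Channel queue at limit for %s: %d/%d (100%%)", name, depth, limit)
	}
	return ""
}
```

Calling `queueWarning(ch, m.Name)` right before `ch <- m` gives the behaviour proposed above. Note that `len(ch)` is only a snapshot: another goroutine can drain or fill the channel between the check and the send, so the message is a hint that the queue is saturated, not an exact count of blocked sends.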

Metadata

Labels: enhancement (New feature or request)
