[service/telemetry] Add log_body_size metric to Universal Telemetry #14812

@dpaasman00

Component(s)

No response

Is your feature request related to a problem? Please describe.

The Component Universal Telemetry mechanism provides *.size metrics that measure the OTLP protobuf wire-format size of data at each component boundary. This is useful for understanding the total size of data as it flows through the collector and over the network, but it does not show how much log data the collector actually collects. The current size metric represents the size of the log data packaged in OTLP (including resource, scope, and attribute overhead), not the size of the log data itself.

Having a measurement more in line with the size of the underlying log data is crucial for capacity planning, billing correlation, and data reduction analysis.

Describe the solution you'd like

Add a new set of auto-instrumented metrics that measure the byte size of log record bodies at each component boundary. These metrics would be disabled by default and emitted only at the detailed telemetry level. They could also be placed behind a new feature gate, but they would already be gated by the telemetry.newPipelineTelemetry feature gate used to enable universal telemetry. The metrics would only be emitted by components processing logs.
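
The gating conditions described above could be combined roughly as follows. This is only a sketch: the `level` and `signal` types and the `shouldRecordLogBodySize` helper are hypothetical stand-ins defined here for illustration, not the collector's actual types or API.

```go
package main

import "fmt"

// level is a hypothetical stand-in for the collector's telemetry level setting.
type level int

const (
	levelNone level = iota
	levelBasic
	levelNormal
	levelDetailed
)

// signal is a hypothetical enum for the data type a component instance handles.
type signal int

const (
	signalLogs signal = iota
	signalMetrics
	signalTraces
)

// shouldRecordLogBodySize reports whether the proposed log_body_size metrics
// would be emitted: the universal telemetry feature gate must be enabled,
// the telemetry level must be detailed, and the component must process logs.
func shouldRecordLogBodySize(gateEnabled bool, lvl level, sig signal) bool {
	return gateEnabled && lvl >= levelDetailed && sig == signalLogs
}

func main() {
	fmt.Println(shouldRecordLogBodySize(true, levelDetailed, signalLogs))    // true
	fmt.Println(shouldRecordLogBodySize(true, levelNormal, signalLogs))      // false: level too low
	fmt.Println(shouldRecordLogBodySize(false, levelDetailed, signalLogs))   // false: gate disabled
	fmt.Println(shouldRecordLogBodySize(true, levelDetailed, signalMetrics)) // false: not a logs pipeline
}
```

Evaluating this check once when the component is built, rather than per batch, would keep the disabled path free of per-record work.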

| Metric Name | Description |
| --- | --- |
| otelcol.receiver.produced.log_body_size | Total byte size of log bodies emitted by a receiver |
| otelcol.processor.consumed.log_body_size | Total byte size of log bodies entering a processor |
| otelcol.processor.produced.log_body_size | Total byte size of log bodies leaving a processor |
| otelcol.connector.consumed.log_body_size | Total byte size of log bodies entering a connector |
| otelcol.connector.produced.log_body_size | Total byte size of log bodies leaving a connector |
| otelcol.exporter.consumed.log_body_size | Total byte size of log bodies entering an exporter |
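
The six names above follow the pattern `otelcol.<kind>.<direction>.log_body_size`, where receivers only produce and exporters only consume. A minimal sketch of that naming rule is shown below; the `metricName` helper is hypothetical and exists only to illustrate which combinations the proposal lists.

```go
package main

import "fmt"

// metricName builds the proposed metric name for a component kind and
// direction, and returns an error for combinations not listed in the proposal.
func metricName(kind, direction string) (string, error) {
	// Directions listed in the proposal for each component kind.
	valid := map[string][]string{
		"receiver":  {"produced"},
		"processor": {"consumed", "produced"},
		"connector": {"consumed", "produced"},
		"exporter":  {"consumed"},
	}
	for _, d := range valid[kind] {
		if d == direction {
			return "otelcol." + kind + "." + direction + ".log_body_size", nil
		}
	}
	return "", fmt.Errorf("no log_body_size metric for %s/%s", kind, direction)
}

func main() {
	name, err := metricName("processor", "consumed")
	fmt.Println(name, err)

	_, err = metricName("exporter", "produced")
	fmt.Println(err)
}
```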

For each plog.Logs batch, we would iterate over log records and sum body sizes:

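import "go.opentelemetry.io/collector/pdata/plog"

// logBodySize returns the total size in bytes of all log record bodies in ld.
// Each body is measured by the length of its AsString() representation.
func logBodySize(ld plog.Logs) int64 {
    var total int64
    for i := 0; i < ld.ResourceLogs().Len(); i++ {
        rl := ld.ResourceLogs().At(i)
        for j := 0; j < rl.ScopeLogs().Len(); j++ {
            sl := rl.ScopeLogs().At(j)
            for k := 0; k < sl.LogRecords().Len(); k++ {
                lr := sl.LogRecords().At(k)
                // len on a Go string counts bytes, not characters.
                total += int64(len(lr.Body().AsString()))
            }
        }
    }
    return total
}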
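
One detail worth keeping in mind is what "byte size" means when the body is measured through its string form. The standalone sketch below uses only the standard library. The helper names are hypothetical, and the base64 case rests on the assumption that AsString() renders bytes-typed bodies as base64 text, in which case the reported size would be larger than the raw payload.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"unicode/utf8"
)

// stringBodySize mirrors int64(len(body)): it counts UTF-8 bytes, not characters.
func stringBodySize(body string) int64 {
	return int64(len(body))
}

// bytesBodySizeAsString models the size reported for a raw byte payload
// if it is first rendered as base64 text (an assumption about AsString()).
func bytesBodySizeAsString(raw []byte) int64 {
	return int64(len(base64.StdEncoding.EncodeToString(raw)))
}

func main() {
	body := "héllo" // 5 characters, but 'é' takes 2 bytes in UTF-8
	fmt.Println(stringBodySize(body), utf8.RuneCountInString(body)) // 6 5

	raw := make([]byte, 10)
	fmt.Println(len(raw), bytesBodySizeAsString(raw)) // 10 16
}
```

If raw payload size is what matters for billing correlation, non-string bodies may need to be measured by type instead of through the string conversion.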

Describe alternatives you've considered

No response

Additional context

No response
