
Duplicate metrics "One or more points were written more frequently than the maximum sampling period configured for the metric." #360

@firehg

Description

I deployed the OpenTelemetry Collector sidecar on Cloud Run alongside my Go services, following a combination of https://cloud.google.com/run/docs/tutorials/custom-metrics-opentelemetry-sidecar and https://cloud.google.com/stackdriver/docs/instrumentation/opentelemetry-collector-cloud-run.

Now I'm seeing a steady increase in otelcol_exporter_send_failed_metric_points_total, which correlates closely with log messages like this one:

{"service.instance.id": "7dfc51ac-42d9-4f69-8f56-baa17c5cde63", "service.name": "otelcol-google", "service.version": "0.128.0"}, "otelcol.component.id": "googlemanagedprometheus", "otelcol.component.kind": "exporter", "otelcol.signal": "metrics", "error": "rpc error: code = InvalidArgument desc = One or more TimeSeries could not be written: timeSeries[0-37] (example metric.type=\"prometheus.googleapis.com/otelcol_processor_outgoing_items_total/counter\", metric.labels={\"otel_scope_name\": \"go.opentelemetry.io/collector/processor/processorhelper\", \"otel_signal\": \"metrics\", \"processor\": \"memory_limiter\", \"otel_scope_version\": \"\"}): write for resource=prometheus_target{job:otelcol-google,cluster:__run__,location:us-central1,instance:0069c7a988553fb09ee223f351cf533dd701f7998934c73fbeb3508cd65e77be20da697145fab3b44c65072d0f5d6056373db26ab4e29f4c646e5da17617d44860317b6624af1c728380955e,namespace:} failed with: One or more points were written more frequently than the maximum sampling period configured for the metric. {Metric: prometheus.googleapis.com/otel_scope_info/gauge, Timestamps: {Youngest Existing: '2025/07/01-12:49:32.018', New: '2025/07/01-12:49:32.018'}}\nerror details: name = Unknown  desc = total_point_count:38 success_point_count:36 errors:{status:{code:9} point_count:2}", "dropped_items": 30}

The Youngest Existing timestamp exactly matches the New timestamp (or sometimes the New timestamp is a couple of milliseconds newer).

I manually went through some of the log messages and only saw this happen for two metrics:

  • prometheus.googleapis.com/otel_scope_info/gauge
  • prometheus.googleapis.com/target_info/gauge

The main impact is that, with otelcol_exporter_send_failed_metric_points_total constantly incrementing, I can't tell whether any useful data points are being lost.
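For what it's worth, since the rejected series appear to be only the exporter-synthesized otel_scope_info and target_info gauges, one workaround I'm considering (a sketch based on the extra_metrics_config options in the googlemanagedprometheusexporter README, not a confirmed fix for this issue) is to disable those extra metrics in the exporter:

```yaml
exporters:
  googlemanagedprometheus:
    metric:
      # Stop the exporter from synthesizing the target_info and
      # otel_scope_info series that are being rejected.
      extra_metrics_config:
        enable_target_info: false
        enable_scope_info: false
```

The trade-off is that resource and scope attributes carried only by those info series would no longer be available for PromQL joins.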

Collector config:
receivers:
  # Open two OTLP servers:
  # - On port 4317, open an OTLP GRPC server
  # - On port 4318, open an OTLP HTTP server
  #
  # Docs:
  # https://github.com/open-telemetry/opentelemetry-collector/tree/main/receiver/otlpreceiver
  otlp:
    protocols:
      grpc:
        endpoint: localhost:4317
      http:
        cors:
          # This effectively allows any origin
          # to make requests to the HTTP server.
          allowed_origins:
          - http://*
          - https://*
        endpoint: localhost:4318

processors:
  # The batch processor is in place to regulate both the number of requests
  # being made and the size of those requests.
  #
  # Docs:
  # https://github.com/open-telemetry/opentelemetry-collector/tree/main/processor/batchprocessor
  batch:
    # batch metrics before sending to reduce API usage
    # Configured to batch telemetry requests at the Google Cloud maximum number of entries per request, or at the Google Cloud minimum interval of every 5 seconds (whichever comes first).
    send_batch_max_size: 200
    send_batch_size: 200
    timeout: 5s

  # The memorylimiter will check the memory usage of the collector process.
  #
  # Docs:
  # https://github.com/open-telemetry/opentelemetry-collector/tree/main/processor/memorylimiterprocessor
  memory_limiter:
    # drop metrics if memory usage gets too high
    check_interval: 1s
    limit_percentage: 65
    spike_limit_percentage: 20

  # The resourcedetection processor is configured to detect GCP resources.
  # Resource attributes that represent the GCP resource the collector is
  # running on will be attached to all telemetry that goes through this
  # processor.
  #
  # Docs:
  # https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/resourcedetectionprocessor
  # https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/resourcedetectionprocessor#gcp-metadata
  # automatically detect Cloud Run resource metadata
  resourcedetection:
    detectors: [env, gcp]
    timeout: 2s
    override: false

  # The transform/collision processor ensures that any attributes that may
  # collide with the googlemanagedprometheus exporter's monitored resource
  # construction are moved to a similar name that is not reserved.
  transform/collision:
    metric_statements:
    - context: datapoint
      statements:
      - set(attributes["exported_location"], attributes["location"])
      - delete_key(attributes, "location")
      - set(attributes["exported_cluster"], attributes["cluster"])
      - delete_key(attributes, "cluster")
      - set(attributes["exported_namespace"], attributes["namespace"])
      - delete_key(attributes, "namespace")
      - set(attributes["exported_job"], attributes["job"])
      - delete_key(attributes, "job")
      - set(attributes["exported_instance"], attributes["instance"])
      - delete_key(attributes, "instance")
      - set(attributes["exported_project_id"], attributes["project_id"])
      - delete_key(attributes, "project_id")

  resource:
    attributes:
    # add instance_id as a resource attribute
    - key: service.instance.id
      from_attribute: faas.id
      action: upsert
    # parse service name from K_SERVICE Cloud Run variable
    - key: service.name
      value: ${env:K_SERVICE}
      action: insert

exporters:
  # The googlemanagedprometheus exporter will send metrics to
  # Google Managed Service for Prometheus.
  #
  # Docs:
  # https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/googlemanagedprometheusexporter
  googlemanagedprometheus: # Note: this is intentionally left blank

extensions:
  # Opens an endpoint on 13133 that can be used to check the
  # status of the collector. Since this does not configure the
  # `path` config value, the endpoint will default to `/`.
  #
  # When running on Cloud Run, this extension is required and not optional.
  # In other environments it is recommended but may not be required for operation
  # (i.e. in Container-Optimized OS or other GCE environments).
  #
  # Docs:
  # https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/extension/healthcheckextension
  health_check:
    endpoint: 0.0.0.0:13133
service:
  extensions:
  - health_check
  pipelines:
    metrics/otlp:
      receivers:
      - otlp
      processors:
      - transform/collision
      - resourcedetection
      - memory_limiter
      - batch
      - resource
      exporters:
      - googlemanagedprometheus
  # Internal telemetry for the collector supports both push and pull-based telemetry data transmission.
  # Leveraging the pre-configured OTLP receiver eliminates the need for an additional port.
  #
  # Docs:
  # https://opentelemetry.io/docs/collector/internal-telemetry/
  telemetry:
    metrics:
      readers:
        - periodic:
            exporter:
              otlp:
                protocol: grpc
                endpoint: localhost:4317

Other information:

  • Image: us-docker.pkg.dev/cloud-ops-agents-artifacts/google-cloud-opentelemetry-collector/otelcol-google:0.128.0
  • CPU: 0.1
  • Memory: 128MB
