OpenTelemetry::Exporter::OTLP::Exporter should re-establish an HTTP connection on error #1658

@SophieDeBenedetto

The Problem

The OpenTelemetry::Exporter::OTLP::Exporter#send_bytes method establishes a persistent HTTP connection and, when the server responds with certain error statuses, reuses that same connection to retry the export request.

At GitHub, we observed that this can cause a pile-on effect on certain backend nodes (in our case, the backend is a fleet of OTel collectors). A node that received a "bad" request, or that is returning errors to the client for some other reason, continues to receive every retry of that request because the client keeps re-using the same persistent HTTP connection. This puts the collector node under increased pressure, and if the node was already under memory or CPU pressure, the retries make the situation worse.

So, we introduced a monkey patch to the OTLP::Exporter to force it to create a new HTTP connection in the event of an error response. As a result, we saw a marked decrease in client exporter failure rates and OTel collector span refusal and drop rates, and we saw improvements in the distribution of memory usage across our fleet of OTel collector pods.

The Proposal

The OTLP::Exporter should close the current HTTP connection and open a new one when #send_bytes gets an error response back from the backend.

Implementation Suggestion

Our monkey patch looks like this:

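# Applied on top of OTLP::Exporter (for example via Module#prepend) so that super reaches the original #backoff?.
def backoff?(retry_count:, reason:, retry_after: nil)
  # Close the persistent connection so that the retry opens a new one.
  @http.finish if @http.started?
  super
end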

In #send_bytes, the #backoff? method is called before every redo that retries the request (redo is a Ruby keyword that re-runs the current block, not a method). So, #backoff? is an appropriate place to close the HTTP connection. The code already present in #send_bytes will then start a fresh connection when the retry runs.
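
The call order described above can be illustrated with a small, self-contained sketch. It does not use the real exporter: FakeHTTP and SketchExporter are hypothetical stand-ins that only mimic the started?/start/finish lifecycle and the backoff-then-retry loop, and ReconnectOnBackoff is an assumed name for a module that carries the patch and is applied with Module#prepend. The retry limit of 5 is also an arbitrary choice for the sketch.

```ruby
# Hypothetical stand-in for a persistent HTTP client; it only counts
# how often a connection is opened and closed.
class FakeHTTP
  attr_reader :starts, :finishes

  def initialize
    @started = false
    @starts = 0
    @finishes = 0
  end

  def started?
    @started
  end

  def start
    @started = true
    @starts += 1
  end

  def finish
    @started = false
    @finishes += 1
  end
end

# Hypothetical stand-in for the exporter. `responses` is a canned list of
# status codes, one per attempt, instead of real network calls.
class SketchExporter
  attr_reader :http

  def initialize(responses)
    @http = FakeHTTP.new
    @responses = responses
  end

  def send_bytes
    retry_count = 0
    @responses.each do |status|
      # Open a connection only if none is currently open.
      @http.start unless @http.started?
      return :success if status == 200

      retry_count += 1
      return :failure unless backoff?(retry_count: retry_count, reason: status.to_s)
    end
    :failure
  end

  private

  # Simplified: allow up to 5 retries and skip the actual sleep.
  def backoff?(retry_count:, reason:, retry_after: nil)
    retry_count <= 5
  end
end

# The patch from this issue, wrapped in a module so that `super` reaches
# the original #backoff?.
module ReconnectOnBackoff
  private

  def backoff?(retry_count:, reason:, retry_after: nil)
    @http.finish if @http.started?
    super
  end
end

SketchExporter.prepend(ReconnectOnBackoff)

exporter = SketchExporter.new([503, 503, 200])
puts exporter.send_bytes
puts "connections opened: #{exporter.http.starts}, closed: #{exporter.http.finishes}"
```

With the module prepended, each failed attempt closes the connection before the retry, so every attempt in the sketch runs on a newly opened connection. Without it, the same three attempts would share a single connection.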
