Skip to content

Conversation

@velimir
Copy link

@velimir velimir commented Dec 3, 2025

Cowboy does not validate the :scheme pseudo-header per RFC 9113, which
states that scheme "is not restricted to http and https" and can be used
for non-HTTP schemes via proxies/gateways. This means Cowboy passes through
any value from the HTTP/2 HEADERS frame without validation.

When a malformed or unexpected scheme value reaches extract_scheme/2,
the case clause crashes because it only handles <<"http">> and
<<"https">>. This crash causes Erlang's telemetry library to permanently
detach the handler for the affected node, resulting in all subsequent
requests losing OpenTelemetry tracing until the node is restarted.

In production, this manifests as "almost all" requests suddenly missing
trace IDs, with the root cause being a single malformed HTTP/2 request
that crashed the telemetry handler.

The fix introduces a configurable valid_schemes option that maps scheme
binaries to atoms. Unknown schemes now return undefined instead of
crashing, which is consistent with how otel_http:extract_scheme/2 already
handles unknown schemes from headers.

The default configuration preserves backwards compatibility by only
accepting http and https. Users who need to support additional schemes
(e.g., ws, wss) can extend the map via configuration.

Fixes: #567

This to fix the following compilation error on test:

```
$ rebar3 ct
===> Verifying dependencies...
===> Analyzing applications...
===> Compiling opentelemetry
===> Compiling _build/test/lib/opentelemetry/src/otel_span_utils.erl failed
    ┌─ _build/test/lib/opentelemetry/src/otel_span_utils.erl:
    │
 92 │              {ParentSpanCtx#span_ctx{span_id=SpanId, hex_span_id=HexSpanId}, ParentSpanId, ParentIsRemote}
    │                                                      ╰── field hex_span_id undefined in record span_ctx

     ┌─ _build/test/lib/opentelemetry/src/otel_span_utils.erl:
     │
 101 │                hex_trace_id=HexTraceId,
     │                ╰── field hex_trace_id undefined in record span_ctx

     ┌─ _build/test/lib/opentelemetry/src/otel_span_utils.erl:
     │
 103 │                hex_span_id=HexSpanId,
     │                ╰── field hex_span_id undefined in record span_ctx

    ┌─ _build/test/lib/opentelemetry/src/otel_span_utils.erl:
    │
 91 │              HexSpanId = otel_utils:encode_hex(<<SpanId:64>>),
    │              ╰── Warning: variable 'HexSpanId' is unused

    ┌─ _build/test/lib/opentelemetry/src/otel_span_utils.erl:
    │
 97 │      HexTraceId = otel_utils:encode_hex(<<TraceId:128>>),
    │      ╰── Warning: variable 'HexTraceId' is unused

    ┌─ _build/test/lib/opentelemetry/src/otel_span_utils.erl:
    │
 99 │      HexSpanId = otel_utils:encode_hex(<<SpanId:64>>),
    │      ╰── Warning: variable 'HexSpanId' is unused
```

this is necessary as there are new fields got introduced in the the open-telemetry/opentelemetry-erlang#874
… HTTP/2 scheme

Cowboy does not validate the `:scheme` pseudo-header per RFC 9113, which
states that scheme "is not restricted to http and https" and can be used
for non-HTTP schemes via proxies/gateways. This means Cowboy passes through
any value from the HTTP/2 HEADERS frame without validation.

When a malformed or unexpected scheme value reaches `extract_scheme/2`,
the case clause crashes because it only handles `<<"http">>` and
`<<"https">>`. This crash causes Erlang's telemetry library to permanently
detach the handler for the affected node, resulting in all subsequent
requests losing OpenTelemetry tracing until the node is restarted.

In production, this manifests as "almost all" requests suddenly missing
trace IDs, with the root cause being a single malformed HTTP/2 request
that crashed the telemetry handler.

The fix introduces a configurable `valid_schemes` option that maps scheme
binaries to atoms. Unknown schemes now return `undefined` instead of
crashing, which is consistent with how `otel_http:extract_scheme/2` already
handles unknown schemes from headers.

The default configuration preserves backwards compatibility by only
accepting `http` and `https`. Users who need to support additional schemes
(e.g., `ws`, `wss`) can extend the map via configuration.

Reproduction repository: https://github.com/velimir/scheme_crash

Fixes: open-telemetry#567
Copilot AI review requested due to automatic review settings December 3, 2025 10:52
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Dec 3, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a crash in the OpenTelemetry Cowboy instrumentation when receiving malformed HTTP/2 scheme values. The crash would cause the telemetry handler to permanently detach, resulting in missing trace IDs for all subsequent requests until node restart.

Key Changes:

  • Introduced configurable valid_schemes option to map scheme binaries to atoms
  • Refactored extract_scheme/2 to return undefined for unknown schemes instead of crashing
  • Added test case to verify invalid schemes return undefined

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.

File Description
instrumentation/opentelemetry_cowboy/src/opentelemetry_cowboy.erl Added valid_schemes configuration option and refactored scheme extraction logic to handle invalid schemes gracefully
instrumentation/opentelemetry_cowboy/test/opentelemetry_cowboy_SUITE.erl Added test case invalid_scheme_returns_undefined to verify behavior with invalid schemes
instrumentation/opentelemetry_cowboy/rebar.lock Updated opentelemetry_api dependency from 1.4.0 to 1.5.0

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +512 to +527
invalid_scheme_returns_undefined(_Config) ->
Opts = #{
valid_schemes => #{<<"https">> => https}
},
opentelemetry_cowboy:setup(Opts),
Port = 62662,
{ok, {{_Version, 200, _ReasonPhrase}, _Headers, _Body}} =
httpc:request(get, {"http://localhost:8080/success", []}, [], [{socket_opts, [{port, Port}]}]),
receive
{span, #span{name=Name,attributes=Attributes,kind=Kind}} ->
?assertEqual('GET', Name),
?assertEqual(?SPAN_KIND_SERVER, Kind),
?assertEqual(undefined, maps:get(?URL_SCHEME, otel_attributes:map(Attributes)))
after
1000 -> ct:fail(invalid_scheme_returns_undefined)
end.
Copy link

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test configures valid_schemes to only accept https, then makes an http:// request. While this correctly tests the scenario of an invalid scheme returning undefined, the test doesn't verify that the handler doesn't crash, which was the original bug being fixed. Consider adding a second request after the first to ensure the telemetry handler is still attached and functioning after encountering an invalid scheme.

Copilot uses AI. Check for mistakes.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to, but making an http2 request via httpc is.. problematic.

@velimir velimir closed this Dec 3, 2025
@tsloughter
Copy link
Member

Why was this closed?

@velimir
Copy link
Author

velimir commented Dec 3, 2025

Why was this closed?

GitHub automation in a private fork :/ it should've not been closed

@velimir velimir reopened this Dec 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

opentelemetry_cowboy crashes and detaches on malformed HTTP/2 :scheme pseudo-header

2 participants