Skip to content

Commit 6426c38

Browse files
authored
fix(http/prom): record bodies when eos reached (#3856)
* chore(app/outbound): `linkerd-mock-http-body` test dependency this adds a development dependency, so we can use this mock body type in the outbound proxy's unit tests. Signed-off-by: katelyn martin <[email protected]> * chore(app/outbound): additional http route metrics tests Signed-off-by: katelyn martin <[email protected]> * chore(app/outbound): additional grpc route metrics tests Signed-off-by: katelyn martin <[email protected]> * fix(http/prom): record bodies when eos reached this commit fixes a bug discovered by @alpeb, which was introduced in proxy v2.288.0. > The associated metric is `outbound_http_route_request_statuses_total`: > > ``` > $ linkerd dg proxy-metrics -n booksapp deploy/webapp|rg outbound_http_route_request_statuses_total.*authors > outbound_http_route_request_statuses_total{parent_group="core",parent_kind="Service",parent_namespace="booksapp",parent_name="authors",parent_port="7001",parent_section_name="",route_group="",route_kind="default",route_namespace="",route_name="http",hostname="",http_status="204",error=""} 5 > outbound_http_route_request_statuses_total{parent_group="core",parent_kind="Service",parent_namespace="booksapp",parent_name="authors",parent_port="7001",parent_section_name="",route_group="",route_kind="default",route_namespace="",route_name="http",hostname="",http_status="201",error="UNKNOWN"} 5 > outbound_http_route_request_statuses_total{parent_group="core",parent_kind="Service",parent_namespace="booksapp",parent_name="authors",parent_port="7001",parent_section_name="",route_group="",route_kind="default",route_namespace="",route_name="http",hostname="",http_status="200",error="UNKNOWN"} 10 > ``` > > The problem was introduced in `edge-25.3.4`, with the proxy `v2.288.0`. > Before that the metrics looked like: > > ``` > $ linkerd dg proxy-metrics -n booksapp deploy/webapp|rg outbound_http_route_request_statuses_total.*authors > outbound_http_route_request_statuses_total{parent_group="core",parent_kind="Service",parent_namespace="booksapp",parent_name="authors",parent_port="7001",parent_section_name="",route_group="",route_kind="default",route_namespace="",route_name="http",hostname="",http_status="200",error=""} 193 > outbound_http_route_request_statuses_total{parent_group="core",parent_kind="Service",parent_namespace="booksapp",parent_name="authors",parent_port="7001",parent_section_name="",route_group="",route_kind="default",route_namespace="",route_name="http",hostname="",http_status="204",error=""} 96 > outbound_http_route_request_statuses_total{parent_group="core",parent_kind="Service",parent_namespace="booksapp",parent_name="authors",parent_port="7001",parent_section_name="",route_group="",route_kind="default",route_namespace="",route_name="http",hostname="",http_status="201",error=""} 96 > ``` > > So the difference is the non-empty value for `error=UNKNOWN` even > when `https_status` is 2xx, which `linkerd viz stat-outbound` > interprets as failed requests. in #3086 we introduced a suite of route- and backend-level metrics. that subsystem contains a body middleware that will report itself as having reached the end-of-stream by delegating directly down to its inner body's `is_end_stream()` hint. this is roughly correct, but is slightly distinct from the actual invariant: a `linkerd_http_prom::record_response::ResponseBody<B>` must call its `end_stream` helper to classify the outcome and increment the corresponding time series in the `outbound_http_route_request_statuses_total` metric family. in #3504 we upgraded our hyper dependency. while doing so, we neglected to include a call to `end_stream` if a data frame is yielded and the inner body reports itself as having reached the end-of-stream. this meant that instrumented bodies would be polled until the end is reached, but were being dropped before a `None` was encountered. this commit fixes this issue in two ways, to be defensive: * invoke `end_stream()` if a non-trailers frame is yielded, and the inner body now reports itself as having ended. this restores the behavior in place prior to #3504. see the relevant component of that diff, here: <https://github.com/linkerd/linkerd2-proxy/pull/3504/files#diff-45d0bc344f76c111551a8eaf5d3f0e0c22ee6e6836a626e46402a6ae3cbc0035L262-R274> * rather than delegating to the inner `<B as Body>::is_end_stream()` method, report the end-of-stream being reached by inspecting whether or not the inner response state has been taken. this is the state that directly indicates whether or not the `ResponseBody<B>` middleware is finished. X-ref: #3504 X-ref: #3086 X-ref: linkerd/linkerd2#8733 Signed-off-by: katelyn martin <[email protected]> --------- Signed-off-by: katelyn martin <[email protected]>
1 parent 985580f commit 6426c38

File tree

4 files changed

+372
-1
lines changed

4 files changed

+372
-1
lines changed

Cargo.lock

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1675,6 +1675,7 @@ dependencies = [
16751675
"linkerd-io",
16761676
"linkerd-meshtls",
16771677
"linkerd-meshtls-rustls",
1678+
"linkerd-mock-http-body",
16781679
"linkerd-opaq-route",
16791680
"linkerd-proxy-client-policy",
16801681
"linkerd-retry",

linkerd/app/outbound/Cargo.toml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -71,6 +71,7 @@ linkerd-meshtls = { path = "../../meshtls", features = ["rustls"] }
7171
linkerd-meshtls-rustls = { path = "../../meshtls/rustls", features = [
7272
"test-util",
7373
] }
74+
linkerd-mock-http-body = { path = "../../mock/http-body" }
7475
linkerd-stack = { path = "../../stack", features = ["test-util"] }
7576
linkerd-tracing = { path = "../../tracing", features = ["ansi"] }
7677

0 commit comments

Comments
 (0)