Inconsistently-collected metrics #11950
Unanswered
zack-littke-smith-ai
asked this question in
Help
Replies: 0 comments
Hi there! Newish Linkerd user here, so forgive me if these questions are common knowledge. When investigating the collected values of some core metrics, I became somewhat suspicious of them (I don't want to be suspicious of my metrics!). Let me be more specific:
- Why do `response_total` and `route_response_total` collect `status_code` (which is strange because this is a gRPC service only) but only sometimes collect `grpc_status_code`, even though they seem to collect `grpc_status`?
- Why does `response_total` collect 5XX connection errors while `route_response_total` does not?
- Why do some series report `status_code: 200` when they also report `classification: failure`? Which of these should I trust?
- (What is the difference between `route_actual_response_total` vs `route_response_total`?) `route_actual_response_total` hilariously only has three results across the entire internet. What does "Total count of actual route HTTP responses." mean??

These issues are with a bread-and-butter metric that I'd really prefer to trust, and in light of them I am considering instrumenting our services directly instead of using Linkerd's metrics. Am I being too rash?
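
To make the `status_code: 200` / `classification: failure` question concrete, here is a minimal sketch of how such series could be picked out of a metrics dump. It assumes the Prometheus text exposition format; the sample lines, their values, and the helper names `parse_samples` and `conflicting` are made up for illustration and are not real Linkerd output.

```python
import re

# Hypothetical sample in Prometheus text exposition format.
# Label names come from the question above; the values are invented.
SAMPLE = """\
# TYPE response_total counter
response_total{status_code="200",classification="success",grpc_status="0"} 120
response_total{status_code="200",classification="failure",grpc_status="14"} 7
response_total{status_code="503",classification="failure"} 3
"""

# metric_name{label="value",...} sample_value
LINE_RE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return (name, labels, value) tuples for every labelled sample line."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # this sketch ignores samples that carry no labels
        labels = dict(LABEL_RE.findall(match.group("labels")))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def conflicting(samples):
    """Keep samples that report HTTP 200 but are classified as failures."""
    return [
        s for s in samples
        if s[1].get("status_code") == "200" and s[1].get("classification") == "failure"
    ]


if __name__ == "__main__":
    for name, labels, value in conflicting(parse_samples(SAMPLE)):
        print(name, labels, value)
```

If the flagged series also carry a non-zero `grpc_status`, that may account for the mismatch: gRPC normally reports call errors in the `grpc-status` trailer while the HTTP status itself stays 200.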