feat: enable dual emitting for task attempt and latency related metrics #7743
Signed-off-by: Neil Xie <neil.xie@uber.com>
🔍 CI failure analysis for 9616919: Replication simulation test (activeactive_regional_failover) failed with workflow timeout, but this appears unrelated to the metrics instrumentation changes in this PR and is likely a flaky test.
Issue
The replication simulation test
Root Cause
This failure appears unrelated to the code changes in this PR. The PR only adds metrics instrumentation:
Details
The modified files are:
This is likely a flaky test with timing sensitivities in the cross-cluster replication simulation environment. The metrics instrumentation should not affect:
Code Review ✅ Approved (2 resolved / 2 findings)
Both previous findings (bucket truncation and missing HandleErr dual-emit) have been resolved. The implementation follows the established dual-emit pattern with appropriate bucket configurations and complete config/test coverage.
✅ Resolved: Bug: Missing histogram dual-emit in HandleErr for attempt counts
✅ Resolved: Edge Case: ExponentialTaskLatencyPerDomain may truncate long-lived tasks
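To illustrate the bucket truncation finding in general terms, here is a minimal sketch of how a finite set of exponential bucket bounds treats durations longer than the largest bound. The helper names (`exponentialBuckets`, `bucketIndex`) and the bucket parameters are hypothetical and chosen for illustration; they are not the configuration of `ExponentialTaskLatencyPerDomain`, which is not shown in this conversation.

```go
package main

import (
	"fmt"
	"time"
)

// exponentialBuckets returns count upper bounds, starting at start and
// multiplying by factor at each step. Hypothetical helper for illustration.
func exponentialBuckets(start time.Duration, factor float64, count int) []time.Duration {
	bounds := make([]time.Duration, 0, count)
	cur := float64(start)
	for i := 0; i < count; i++ {
		bounds = append(bounds, time.Duration(cur))
		cur *= factor
	}
	return bounds
}

// bucketIndex returns the index of the first bound that is >= d.
// It returns len(bounds) when d exceeds every bound, i.e. the overflow bucket.
func bucketIndex(bounds []time.Duration, d time.Duration) int {
	for i, b := range bounds {
		if d <= b {
			return i
		}
	}
	return len(bounds)
}

func main() {
	// Assumed parameters: 10 buckets doubling from 1ms, so the largest bound is 512ms.
	bounds := exponentialBuckets(time.Millisecond, 2, 10)
	for _, d := range []time.Duration{3 * time.Millisecond, time.Second, time.Hour} {
		fmt.Printf("%v -> bucket %d of %d\n", d, bucketIndex(bounds, d), len(bounds))
	}
}
```

With these assumed parameters, a one-second task and a one-hour task land in the same overflow bucket, so the histogram cannot tell them apart. Widening the range (more buckets or a larger factor) is the usual way to keep long-lived tasks distinguishable.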
What changed?
Start dual emitting timer and histogram metrics for task latency and attempt metrics
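As a rough illustration of what dual emitting means here, the sketch below records one latency observation under both a legacy timer and a new histogram, so dashboards built on either metric keep working during the migration. The `Emitter` interface, the in-memory implementation, and the metric names are assumptions made for this example; they are not the API or names used in `common/metrics`.

```go
package main

import (
	"fmt"
	"time"
)

// Emitter is a hypothetical stand-in for a metrics scope.
// It is not the actual interface from common/metrics.
type Emitter interface {
	RecordTimer(name string, d time.Duration)
	RecordHistogram(name string, d time.Duration)
}

// memoryEmitter keeps observations in memory so the sketch can be run and inspected.
type memoryEmitter struct {
	timers     map[string][]time.Duration
	histograms map[string][]time.Duration
}

func newMemoryEmitter() *memoryEmitter {
	return &memoryEmitter{
		timers:     make(map[string][]time.Duration),
		histograms: make(map[string][]time.Duration),
	}
}

func (m *memoryEmitter) RecordTimer(name string, d time.Duration) {
	m.timers[name] = append(m.timers[name], d)
}

func (m *memoryEmitter) RecordHistogram(name string, d time.Duration) {
	m.histograms[name] = append(m.histograms[name], d)
}

// dualEmitLatency records the same observation twice: once under the legacy
// timer name and once under a histogram name. The "_ns" suffix is an assumed
// naming convention for this example only.
func dualEmitLatency(e Emitter, name string, d time.Duration) {
	e.RecordTimer(name, d)
	e.RecordHistogram(name+"_ns", d)
}

func main() {
	e := newMemoryEmitter()
	dualEmitLatency(e, "task_latency", 150*time.Millisecond)
	fmt.Println("timer samples:", len(e.timers["task_latency"]))
	fmt.Println("histogram samples:", len(e.histograms["task_latency_ns"]))
}
```

Every observation is stored twice in this pattern, which is consistent with the storage increase listed under potential risks.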
#7741
Why?
Timer -> Histogram migration
How did you test it?
go test -v ./common/metrics
Potential risks
Metrics storage increase
Release notes
Documentation Changes