-
Notifications
You must be signed in to change notification settings - Fork 860
Unique types for different kinds of metric indexes #7238
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
s.container = &archiver.HistoryBootstrapContainer{ | ||
Logger: testlogger.New(s.T()), | ||
MetricsClient: metrics.NewClient(scope, metrics.HistoryArchiverScope), | ||
MetricsClient: metrics.NewClient(scope, metrics.History), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a ServiceIdx, not a ScopeIdx. as it's a test, I just updated it to something plausible.
scopeWithDomainTag.IncCounter(metrics.ElasticsearchFailuresPerDomain) | ||
default: | ||
p.logger.Error("Operation failed with internal error.", tag.MetricScope(scope), tag.Error(err)) | ||
p.logger.Error("Operation failed with internal error.", tag.MetricScope(int(scope)), tag.Error(err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are a few of these:
for some reason we're logging the scope as
{
"metrics-scope": 123
}
which does not seem useful. this should probably be pulling out the operation name (which is only possible due to #7237 making them unique)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I find that a baffling use of logs for any other reason then a painful way to debug metrics
MaxCount: 100, | ||
Pin: true, | ||
MetricsScope: metricsClient.Scope(metrics.Frontend), | ||
MetricsScope: metricsClient.Scope(metrics.PersistenceGetShardScope), // was metrics.Frontend, using incorrect int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clear incorrect value, was a ServiceIdx when it needed a ScopeIdx.
I've replaced it with the metric with the same operation (1
) so metrics are unchanged, but this is obviously the wrong metric.
how do we use this? is there something that should be used instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given it's the async APIs, I imagine we've just not used it heavily yet
Part cleanup due to discoveries in e.g. #7210, part prep-work for migrating metrics to almost literally any new setup (to make that easier and safer, by allowing int -> name lookups). I *believe* this is a safe thing to do, as we don't seem to do anything fancy with the metric-def indexes. But it's quite hard to verify because we don't have a unique type (fixed in #7238) and all the code here is rather value-oriented and fallback-y so it's hard to be totally confident that everything is handled.
Surprisingly easy to change all these actually, I should've done this a long time ago. Unsurprisingly we were *almost* perfect in our use, but not completely.
logger.Error("Executor not found.", tag.Error(err), tag.MetricScope(int(scope))) // int??? | ||
default: | ||
logger.Error("Store failed with internal error.", tag.Error(err), tag.MetricScope(scope)) | ||
logger.Error("Store failed with internal error.", tag.Error(err), tag.MetricScope(int(scope))) // int??? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I... have no idea why we'd log this or, to be honest why we have a log tag for metric scopes at all.
@jakobht what was your intent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"scope" is the "operation", which seems worth logging. though it's frequently already included in the on-[thing] logger, like with the metrics tag
int scope number kinda seems like it might only be for debugging wrong-scope-int behavior bugs or something. which should now be solved with unique types and unique values.
Surprisingly easy to change all these actually, I should've done this a long time ago.
Unsurprisingly we were almost perfect in our use, but not completely.