Skip to content

Conversation

Groxx
Copy link
Member

@Groxx Groxx commented Sep 4, 2025

Surprisingly easy to change all these actually, I should've done this a long time ago.
Unsurprisingly we were almost perfect in our use, but not completely.

s.container = &archiver.HistoryBootstrapContainer{
Logger: testlogger.New(s.T()),
MetricsClient: metrics.NewClient(scope, metrics.HistoryArchiverScope),
MetricsClient: metrics.NewClient(scope, metrics.History),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a ServiceIdx, not a ScopeIdx. as it's a test, I just updated it to something plausible.

scopeWithDomainTag.IncCounter(metrics.ElasticsearchFailuresPerDomain)
default:
p.logger.Error("Operation failed with internal error.", tag.MetricScope(scope), tag.Error(err))
p.logger.Error("Operation failed with internal error.", tag.MetricScope(int(scope)), tag.Error(err))
Copy link
Member Author

@Groxx Groxx Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are a few of these:
for some reason we're logging the scope as

{
  "metrics-scope": 123
}

which does not seem useful. this should probably be pulling out the operation name (which is only possible due to #7237 making them unique)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I find that a baffling use of logs for any other reason then a painful way to debug metrics

MaxCount: 100,
Pin: true,
MetricsScope: metricsClient.Scope(metrics.Frontend),
MetricsScope: metricsClient.Scope(metrics.PersistenceGetShardScope), // was metrics.Frontend, using incorrect int
Copy link
Member Author

@Groxx Groxx Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clear incorrect value, was a ServiceIdx when it needed a ScopeIdx.

I've replaced it with the metric with the same operation (1) so metrics are unchanged, but this is obviously the wrong metric.

how do we use this? is there something that should be used instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given it's the async APIs, I imagine we've just not used it heavily yet

@Groxx Groxx mentioned this pull request Sep 4, 2025
Groxx added a commit that referenced this pull request Sep 4, 2025
Part cleanup due to discoveries in e.g. #7210, part prep-work for migrating metrics to almost literally any new setup (to make that easier and safer, by allowing int -> name lookups).

I *believe* this is a safe thing to do, as we don't seem to do anything fancy with the metric-def indexes.
But it's quite hard to verify because we don't have a unique type (fixed in #7238) and all the code here is rather value-oriented and fallback-y so it's hard to be totally confident that everything is handled.
Surprisingly easy to change all these actually, I should've done this a long time ago.
Unsurprisingly we were *almost* perfect in our use, but not completely.
@Groxx Groxx enabled auto-merge (squash) September 4, 2025 23:53
@Groxx Groxx disabled auto-merge September 5, 2025 01:24
logger.Error("Executor not found.", tag.Error(err), tag.MetricScope(int(scope))) // int???
default:
logger.Error("Store failed with internal error.", tag.Error(err), tag.MetricScope(scope))
logger.Error("Store failed with internal error.", tag.Error(err), tag.MetricScope(int(scope))) // int???
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I... have no idea why we'd log this or, to be honest why we have a log tag for metric scopes at all.

@jakobht what was your intent?

Copy link
Member Author

@Groxx Groxx Sep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"scope" is the "operation", which seems worth logging. though it's frequently already included in the on-[thing] logger, like with the metrics tag

int scope number kinda seems like it might only be for debugging wrong-scope-int behavior bugs or something. which should now be solved with unique types and unique values.

@Groxx Groxx merged commit 5d74f8f into cadence-workflow:master Sep 5, 2025
30 checks passed
@Groxx Groxx deleted the metric-types branch September 5, 2025 19:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants