-
Notifications
You must be signed in to change notification settings - Fork 3.8k
Emit supervisorId dimension with streaming task metrics #18803
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| } | ||
|
|
||
| @Override | ||
| protected ServiceMetricEvent.Builder getMetricBuilder() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something is off here. super.getMetricBuilder() just returns metricBuilder, so what's happening in this method is redundant. It would be equivalent to do this in the constructor rather than getMetricBuilder. i.e. simply do metricBuilder.setDimension(DruidMetrics.SUPERVISOR_ID, supervisorId).
However, this highlights a problem: the metricBuilder is not thread safe, yet it appears to be used from multiple threads with the pattern emitter.emit(getMetricBuilder().setMetric(metric, value)). For example, as far as I can tell, ingest/segments/count is emitted in a push thread but certain other metrics are emitted in the main thread. It probably is called infrequently enough that this doesn't cause problems in practice. But it's still a thread safety bug and we should fix it.
How about redefining this method to return a new instance of ServiceMetricEvent.Builder each time it's called? That should fix it, and then this implementation you have here would be correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion, I have updated TaskRealtimeMetricsMonitor accordingly.
| DruidMetrics.TASK_TYPE, new String[]{task.getType()}, | ||
| DruidMetrics.GROUP_ID, new String[]{task.getGroupId()} | ||
| ), | ||
| getMetricDimensions(task), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Map<String, String[]> dimensions on TaskRealtimeMetricsMonitor could be changed to ServiceMetricEvent.Builder, then it could be passed task.getMetricBuilder() and avoid the duplicated logic / instanceof check below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good 👍🏻 .
|
Merging as the IT failure is due to the flaky test |
Description
With the support of multiple supervisors ingesting into the same datasource, it becomes difficult to distinguish the metrics originating from the tasks of two different supervisors.
Changes
supervisorIdto all metrics emitted by streaming tasksTaskRealtimeMetricsMonitorTaskRealtimeMetricsMonitorBuilderThis PR has: