Commit 777669e

Fix trimmed labels being used in other metrics, due to shallow copy (#4588)
This addresses the errors on b/384820142.

The main issue is that, in _MetricsRecorder, trimmed_labels was not an independent copy of self._labels but an alias for the same dict. Deleting job from trimmed_labels therefore also removed it from self._labels, so many utask duration metrics with different jobs reduced to the same labels (deduplication seems to take into account only the labels declared in monitoring_metrics.py).

The other error is related to saturation, where time series were written too often. This probably happens because of the lost job label, which causes many requests to a subset of the original labels. It may go away once this lands; otherwise more investigation is needed.
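The aliasing problem described above can be reproduced in isolation. The following is a minimal sketch, not the ClusterFuzz code: `recorded`, `increment`, `record_buggy`, and the label values are hypothetical stand-ins, and the counter simply tallies increments per distinct label set.

```python
from collections import Counter

# Hypothetical stand-in for a monitoring counter: tallies increments
# per distinct label set.
recorded = Counter()


def increment(labels):
    recorded[tuple(sorted(labels.items()))] += 1


def record_buggy(labels):
    trimmed = labels  # alias for the same dict, not a copy
    del trimmed['job']  # also removes 'job' from `labels`
    trimmed['task_succeeded'] = True
    increment(labels)  # a later metric that still expected 'job'


# Two different (made-up) jobs for the same task.
record_buggy({'task': 'fuzz', 'job': 'job_a'})
record_buggy({'task': 'fuzz', 'job': 'job_b'})

# Both jobs collapsed into a single label set.
assert len(recorded) == 1
```

Because the `job` label is gone by the time the later metric is recorded, the two calls are indistinguishable to the counter.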
1 parent 568cb55 commit 777669e

1 file changed: +4 -3 lines changed

src/clusterfuzz/_internal/bot/tasks/utasks/__init__.py

Lines changed: 4 additions & 3 deletions
@@ -177,10 +177,11 @@ def __exit__(self, _exc_type, _exc_value, _traceback):
 # Get rid of job as a label, so we can have another metric to make
 # error conditions more explicit, respecting the 30k distinct
 # labels limit recommended by gcp.
-trimmed_labels = self._labels
+trimmed_labels = {
+    **self._labels, 'task_succeeded': task_succeeded,
+    'error_condition': error_condition
+}
 del trimmed_labels['job']
-trimmed_labels['task_succeeded'] = task_succeeded
-trimmed_labels['error_condition'] = error_condition
 monitoring_metrics.TASK_OUTCOME_COUNT_BY_ERROR_TYPE.increment(
     trimmed_labels)
