@timmilesdw
Contributor

Overview

Replaces the binary compaction metrics with metrics that show how often compaction actually runs and how many tasks are queued, broken down by hook.

What this PR does / why we need it

Old metrics (removed):

  • tasks_queue_compaction_in_queue_tasks{queue_name, task_id} - showed task count per compaction ID
  • tasks_queue_compaction_reached{queue_name, task_id} - reported only 0 or 1, indicating whether the cap threshold had been reached

These metrics had limited value: they only indicated whether the compaction threshold had been reached, without showing actual queue load or compaction activity.

New metrics (added):

  • tasks_queue_compaction_operations_total{queue_name, hook} - counter showing how many times compaction ran for each hook
  • tasks_queue_compaction_tasks_by_hook{queue_name, hook} - gauge showing the actual number of pending tasks per hook (reported only when the count is >20)
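
To make the semantics of the two new metrics concrete, here is a minimal Go sketch that uses only the standard library. It is not the code from this PR: the `Task` type, the `compactionOps` map, the `pendingByHook` helper, and the hook names are hypothetical, and the strict `> 20` comparison is an assumption based on the "(>20)" note above.

```go
package main

import (
	"fmt"
	"sort"
)

// Task is a hypothetical stand-in for a queued task; only the hook name matters here.
type Task struct {
	Hook string
}

// reportThreshold mirrors the ">20" note in the PR description.
// Whether the real comparison is strict is an assumption.
const reportThreshold = 20

// compactionOps is a hypothetical in-memory counter keyed by hook name.
// Like a Prometheus counter, it only ever increases.
type compactionOps map[string]uint64

// Inc records one compaction run for the given hook.
func (c compactionOps) Inc(hook string) { c[hook]++ }

// pendingByHook counts tasks per hook in a single queue snapshot and keeps
// only the hooks whose count exceeds threshold.
func pendingByHook(snapshot []Task, threshold int) map[string]int {
	counts := make(map[string]int)
	for _, t := range snapshot {
		counts[t.Hook]++
	}
	for hook, n := range counts {
		if n <= threshold {
			delete(counts, hook) // deleting during range is safe in Go
		}
	}
	return counts
}

func main() {
	// Build a snapshot with 25 tasks for one hook and 3 for another.
	var snapshot []Task
	for i := 0; i < 25; i++ {
		snapshot = append(snapshot, Task{Hook: "hook-a"})
	}
	for i := 0; i < 3; i++ {
		snapshot = append(snapshot, Task{Hook: "hook-b"})
	}

	ops := compactionOps{}
	ops.Inc("hook-a")
	ops.Inc("hook-a")

	gauge := pendingByHook(snapshot, reportThreshold)
	hooks := make([]string, 0, len(gauge))
	for h := range gauge {
		hooks = append(hooks, h)
	}
	sort.Strings(hooks)
	for _, h := range hooks {
		fmt.Printf("hook=%s pending=%d compactions=%d\n", h, gauge[h], ops[h])
	}
}
```

In this sketch a hook at or below the threshold is simply absent from the result, so only heavily loaded hooks would produce a gauge series.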

Why this is better:

  1. Visibility into compaction frequency: The counter shows if specific hooks trigger excessive compaction operations, indicating potential performance issues
  2. Real queue load: The gauge shows actual task count (not just binary 0/1), making it possible to track queue pressure and set meaningful alerts
  3. Better grouping: Metrics grouped by hook name instead of task_id, making it easy to identify problematic hooks
  4. Reliable values: Metrics are refreshed every 10 seconds from a queue snapshot, so the reported values lag the real queue state by no more than about 10 seconds
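
As an illustration of point 1, the sketch below turns two samples of a counter such as tasks_queue_compaction_operations_total into a per-minute compaction rate. It is a simplified approximation of what a monitoring system computes from a counter, not code from this PR. The function names and sample values are hypothetical. A drop in the counter is treated as a reset to zero, which is how Prometheus handles counter resets.

```go
package main

import "fmt"

// increase returns how much a monotonically increasing counter grew between
// two samples. If the current value is lower than the previous one, the
// counter is assumed to have been reset (for example by a process restart),
// so the growth since the reset is the current value itself.
func increase(prev, cur float64) float64 {
	if cur < prev {
		return cur
	}
	return cur - prev
}

// perMinute converts the increase observed over elapsedSeconds into a
// per-minute rate. It returns 0 for a non-positive interval.
func perMinute(prev, cur, elapsedSeconds float64) float64 {
	if elapsedSeconds <= 0 {
		return 0
	}
	return increase(prev, cur) * 60 / elapsedSeconds
}

func main() {
	// Hypothetical samples for one hook, taken five minutes apart.
	rate := perMinute(100, 130, 300)
	fmt.Printf("compactions per minute: %.2f\n", rate)
}
```

A rate like this, tracked per hook, is what makes it possible to alert on a hook that triggers compaction unusually often. The threshold for "excessive" depends on the workload and is not specified in this PR.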

Special notes for your reviewer

  • Breaking change: old metrics removed, dashboards need updates
  • New metrics provide better observability for queue performance monitoring

@timmilesdw timmilesdw added the enhancement (New feature or request) label Nov 24, 2025
@timmilesdw timmilesdw self-assigned this Nov 24, 2025
Signed-off-by: Timur Tuktamyshev <[email protected]>
timmilesdw and others added 2 commits November 25, 2025 13:49
Signed-off-by: Timur Tuktamyshev <[email protected]>
Signed-off-by: Pavel Okhlopkov <[email protected]>
@ldmonster ldmonster changed the title from "chore: bump shell operator" to "[addon-operator] chore: bump shell operator" Nov 26, 2025
@ldmonster ldmonster merged commit 6967f86 into main Nov 26, 2025
8 of 9 checks passed
@ldmonster ldmonster deleted the fix/metrics branch November 26, 2025 10:04
3 participants